Research:Wikipedia Readership Survey 2011

This page documents a completed research project.

General Questions

Why is the Wikimedia Foundation conducting a survey of Wikipedia readers?

We are conducting the survey of Wikipedia readers with the following objectives in mind:

  • Gather baseline demographic data about Wikipedia readers: gender, languages, education, income, etc.
  • Opinion about Wikipedia content: what kind of content do they read on Wikipedia, what content is good on Wikipedia, how do they rate different articles on Wikipedia, what makes an article good in their opinion?
  • Editing Behavior: have they ever edited Wikipedia, what are the barriers to editing Wikipedia?
  • Understand device ecology of Wikipedia readers: what kinds of technological devices do readers own, what devices do they use to read Wikipedia, what are the pros and cons with each device?
  • Online behaviors: how do Wikipedia readers spend their time online, what kinds of sites do they visit, what kind of online contributions do they make?
  • Delve deeper into Wikipedia Mobile Usage: how do Wikipedia users read Wikipedia on mobile devices? Leverage findings from qualitative research.

What is the methodology? Why are we not conducting the survey on our website?

The survey will be conducted online. Unlike the UNU-Merit survey, which combined Wikipedia readers and editors into one survey, we decided to split the two groups and run an online household survey of readers, which gives a broader view and a more robust and representative sample of Wikipedia readers. In the past, we have found that surveys conducted on our website are biased towards editors, males and heavy readers of Wikipedia. By doing a survey at the household level, we aim to get a good representation of casual and female readers of Wikipedia. If you are wondering, we are currently analyzing data from the recently concluded editor survey, and will start sharing the results from that survey soon.

What does the survey cover?

The survey will be conducted in the following 16 countries: US, Japan, Germany, UK, France, Canada, Italy, Brazil, India, Russia, Poland, Mexico, Spain, Australia, Egypt and South Africa. In addition to English, the survey will be translated into the following languages: Japanese, German, French, Italian, Portuguese, Russian, Polish, Spanish, Arabic, Hindi, French Canadian and Zulu. The total sample size for the study is 4,000, with a sample of 250 in each country.
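The sample arithmetic above can be checked with a short sketch. The country list and the per-country sample of 250 come from the text; the function and variable names are illustrative only and are not part of the survey tooling.

```python
# Countries named in the survey plan (taken from the text above).
COUNTRIES = [
    "US", "Japan", "Germany", "UK", "France", "Canada", "Italy", "Brazil",
    "India", "Russia", "Poland", "Mexico", "Spain", "Australia", "Egypt",
    "South Africa",
]

# Planned number of respondents per country (taken from the text above).
PER_COUNTRY_SAMPLE = 250


def planned_sample(countries, per_country):
    """Return a mapping of country -> planned number of respondents."""
    return {country: per_country for country in countries}


allocation = planned_sample(COUNTRIES, PER_COUNTRY_SAMPLE)
total = sum(allocation.values())
print(len(allocation), "countries,", total, "respondents in total")
```

With an equal sample in every country, the total is simply the number of countries multiplied by 250.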

Why is the foundation conducting the survey in these countries? What about the countries that are not being surveyed?

The foundation has had to make some tough decisions about the countries where the survey will be conducted, as each additional country increases the cost of the survey. The countries were selected based on two criteria: number of page views and strategic importance as per the strategic plan. The countries selected account for about 70 percent of Wikipedia page views. We are planning to conduct the readership survey regularly (at intervals of one to one and a half years), and we plan to keep some core countries and rotate the rest to get data from other countries at each iteration of the survey. You can provide feedback about the countries that you would like to be included in the next round of the survey.

Who is conducting the survey?

The survey will be fielded at the end of June. The foundation has hired Resolve Market Research to conduct the survey. Resolve is working with the foundation on the design of the questionnaire and is responsible for providing the sample for the survey, translations, programming of the survey, data cleaning and initial analysis. At the foundation, Mani Pande, Head of Global Development Research, is the primary contact for the survey.

How will the survey data be handled? What about privacy?

The survey data will be anonymized and analyzed collectively. No response will be associated with an individual participant.

Will the report and other information from the readership survey be shared?

Yes, the report from the readership survey will be available on meta-wiki when it is ready. We will also publish a series of blog posts to share interesting insights from the study. Data files in CSV format will also be shared with researchers who are interested in conducting additional analysis.

How can I provide feedback?

We have posted the initial draft of the readership survey on meta-wiki. We are hoping that the community will provide feedback on the questions and the questionnaire as a whole. Please feel free to copy edit the questions as long as the edits do not change their meaning. Please provide your feedback here. The deadline for providing feedback on the questionnaire is Tuesday, May 31.

Survey Design

Who designed the survey, and how?

The Wikimedia Foundation designed the survey, along with Resolve Market Research. We have received input from within the foundation, and are looking for input from the community on the design of the survey.

How can I help design and improve the survey?

Please feel free to copy edit the survey as long as it does not change the meaning of the questions. If you believe that we are missing some important questions, please provide feedback via the feedback space that has been set up on meta. If you believe we are off the mark regarding some questions, please provide your feedback.

What is the methodology for the survey? Why is the survey limited to the following countries and languages only?

It is an online survey that will be conducted in 16 countries that account for about 70 percent of Wikipedia page views. Each country will have a sample size of about 250. Resolve Market Research is using a sample provider to obtain the sample for the survey. The countries were chosen based on strategic importance and page views. For financial reasons, we had to limit the number of countries where we can conduct the survey, but we will rotate the countries in each iteration of the survey to get insights from new countries each time.

Survey Implementation

How will the survey be implemented?

The survey will be conducted online using a sample from a sample provider in each country.

Will the referencing strike be tracked?

No IP address will be stored.

Will unfinished surveys be saved? Will their answers be included?

Yes, unfinished surveys will be saved. We will share toplines/frequencies based on completed surveys, and we will also share data from unfinished surveys.
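As a rough illustration of what toplines/frequencies based on completed surveys could look like, here is a minimal sketch. The CSV layout, the column names ("completed", "gender") and the sample rows are made up for the example; they are assumptions and do not reflect the actual data files.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample standing in for a shared CSV file.
# The column names ("completed", "gender") are assumptions for illustration.
SAMPLE_CSV = """respondent,completed,gender
1,yes,female
2,yes,male
3,no,female
4,yes,female
"""


def topline(rows, question):
    """Count answers to `question` among completed surveys only."""
    counts = Counter(
        row[question] for row in rows if row["completed"] == "yes"
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # Map each answer to (count, share of completed responses).
    return {answer: (n, n / total) for answer, n in counts.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
result = topline(rows, "gender")
for answer, (n, share) in sorted(result.items()):
    print(f"{answer}: {n} ({share:.0%})")
```

Respondent 3 did not finish the survey, so that row is left out of the counts here; unfinished responses would still remain in the underlying data.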

How is the survey being translated?

Resolve Market Research is responsible for the translation process, and they will be using their network of translators.

Privacy and transparency

Is the survey covered by the Wikimedia Foundation’s privacy policies?

Yes, the survey will be covered by the foundation’s privacy policies.

Who will have access to the raw data? Will the raw data be shared publicly?

The foundation will have access to raw data to conduct analysis. A sanitized version of the data will be available for researchers who are interested in conducting further analysis.

How long will the data be kept?

The data will be kept for at least about two years so that we can look at longitudinal trends in Wikipedia readership.

What do you mean by anonymized?

Individual responses will not be analyzed. Instead, we will only look at all the responses collectively.

Results

Results from the survey were posted on the Wikimedia blog.

In addition, all the results are compiled in a wiki format here: Research:Wikipedia Readership Survey 2011/Results