Research:Knowledge Gaps Index/Measurement/Readers Survey 2023

Tracked in Phabricator:
Task T341890
11:19, 6 September 2023 (UTC)
Duration:  2023-07 – ??-??

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.

This project aims to understand the demographics and motivations of Wikipedia Readers across language editions. It is part of the Knowledge Gaps Index focus on Readers of Wikipedias, and continues the work of the 2019 Readers survey.

Progress on this project can be followed at T341890.

Key Takeaways


Here is a summary of the 2023 Global Reader Survey results. For the full results please check the Results section.


  • Wikipedia readers skew young, although this varies by project. However, readers aged 18-24 are a plurality of those 18+ in nearly all surveyed projects (the exceptions are dewiki and nlwiki).


  • Wikipedia readers across all projects identify disproportionately as solely men. By project, the readerships of ukwiki, rowiki, and ruwiki are closest to gender parity.


  • Wikipedia readers are highly educated. Many readers are current students. Across each surveyed project, a majority of readers under age 30 are current students.


  • Wikipedia readers are highly multilingual: a majority speak two or more languages fluently. However, readers of any given Wikipedia are overwhelmingly reading in a primary language. Readers of enwiki are most likely to be non-native language speakers.

Reader Behavior


Data Collection


This project employed simple random sampling of Wikipedia readers using the QuickSurveys extension. The QuickSurveys opt-in was displayed to non-logged in users and asked them whether they would like to participate in a survey to help improve Wikipedia. Survey responses were collected using LimeSurvey, an external survey tool.

The goals of the survey are to make demographic estimates of Wikipedia readers across different language projects within the scope of the Knowledge Gaps Index, to understand motivations for reading Wikipedia, and to analyze whether there are differences by motivation and demographics in who reads which type of content.

Analyses of the survey data will primarily follow the 2019 edition of the survey.


Date Milestone
September 22—October 4, 2023 enwiki pilot survey
November 14—November 22, 2023 enwiki full sample survey
November 30—December 14, 2023 arwiki, cswiki, dewiki, elwiki, eswiki, fawiki, frwiki, hewiki, hiwiki, jawiki, kowiki, nlwiki, plwiki, ptwiki, rowiki, ruwiki, trwiki, viwiki, ukwiki, zhwiki full sample surveys

Policy, Ethics and Human Subjects Research


This survey is governed by the Global Readers Survey privacy statement.

Survey Administration Results


Surveys were fielded across 23 projects from November 14 to December 18, 2023. A total of 80,242 complete survey responses were collected.

Global Readers Demographic Survey (2023) Fielding Summary
Project Fielding Dates QuickSurvey Sampling Ratio Total LimeSurvey Initiations Total Completes
arwiki (Arabic) 28/11- 18/12 12.4% 40526 5186
cswiki (Czech) 28/11- 11/12 10.0% 4592 1618
dewiki (German) 28/11- 11/12 5.2% 23797 9589
elwiki (Greek) 28/11- 13/12 20.0% 5557 1537
enwiki (English) 14-22/11 2.0% 40497 9479
eswiki (Spanish) 28/11- 18/12 6.6% 39071 8769
fawiki (Farsi) 28/11- 11/12 2.1% 7689 1850
frwiki (French) 28/11- 11/12 9.7% 23368 6617
hewiki (Hebrew) 28/11- 13/12 8.6% 5044 1609
hiwiki (Hindi) 28/11- 18/12 20.0% 62278 715
idwiki (Indonesian) 28/11- 13/12 15.0% 13671 1516
itwiki (Italian) 28/11- 11/12 2.5% 5855 1996
jawiki (Japanese) 28/11- 18/12 1.6% 7023 1905
kowiki (Korean) 28/11- 18/12 20.0% 12800 1575
nlwiki (Dutch) 28/11- 18/12 7.5% 5420 1528
plwiki (Polish) 28/11- 11/12 7.5% 12341 3672
ptwiki (Portuguese) 28/11- 18/12 15.0% 32457 4619
rowiki (Romanian) 28/11- 11/12 21.1% 9820 2399
ruwiki (Russian) 28/11- 18/12 1.5% 15637 5357
trwiki (Turkish) 28/11- 11/12 7.5% 8568 1792
ukwiki (Ukrainian) 28/11- 11/12 6.4% 6576 2094
viwiki (Vietnamese) 28/11- 18/12 7.5% 6111 1075
zhwiki (Simplified and Traditional Chinese) 28/11- 18/12 7.1% 15841 3745
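
As a rough check on the fielding summary above, the sketch below recomputes the total number of completes and derives a per-project completion ratio (completes divided by LimeSurvey initiations). This ratio is not reported on this page; it is an illustrative derived figure, and the names FIELDING and completion_ratio are hypothetical. It also assumes that initiations include respondents who were later screened out, so a low ratio does not by itself indicate drop-off.

```python
# (LimeSurvey initiations, completes) per project, copied from the table above.
FIELDING = {
    "arwiki": (40526, 5186), "cswiki": (4592, 1618), "dewiki": (23797, 9589),
    "elwiki": (5557, 1537), "enwiki": (40497, 9479), "eswiki": (39071, 8769),
    "fawiki": (7689, 1850), "frwiki": (23368, 6617), "hewiki": (5044, 1609),
    "hiwiki": (62278, 715), "idwiki": (13671, 1516), "itwiki": (5855, 1996),
    "jawiki": (7023, 1905), "kowiki": (12800, 1575), "nlwiki": (5420, 1528),
    "plwiki": (12341, 3672), "ptwiki": (32457, 4619), "rowiki": (9820, 2399),
    "ruwiki": (15637, 5357), "trwiki": (8568, 1792), "ukwiki": (6576, 2094),
    "viwiki": (6111, 1075), "zhwiki": (15841, 3745),
}


def completion_ratio(initiations, completes):
    """Share of survey initiations that ended in a complete response."""
    return completes / initiations


# Total completes across all projects (the page reports 80,242).
total_completes = sum(completes for _, completes in FIELDING.values())

# Completion ratio per project, listed from highest to lowest.
rates = {p: completion_ratio(i, c) for p, (i, c) in FIELDING.items()}
for project, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{project}: {rate:.1%}")
print("total completes:", total_completes)
```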

Responses Results


Age Screener


Only readers aged 18 years and older were considered eligible for the survey. As a result, all readers who opted into the survey were first shown an age-based screener question. Those who indicated they were under 18 had their survey sessions terminated.

Are you at least 18 years of age?
○ Yes
○ No

Unfortunately, legal protections for people under 18 mean we cannot survey you. Thank you for your interest!

Reader Motivation


Consistent with previous survey research conducted by the Wikimedia Foundation [1] [2], we asked readers about their motivations for reading Wikipedia. In this survey, however, we allowed respondents to select multiple motivations and to write in motivations that were not listed as answer options.

I am reading this article because ...

Please select all answers that apply

□ I have a work or school-related assignment
□ I need to make a personal decision based on this topic (e.g., buy a book, choose a travel destination)
□ I want to know more about a current event (e.g., a soccer game, a recent earthquake, somebody’s death)
□ the topic was referenced in a piece of media (e.g., TV, radio, article, film, book)
□ the topic came up in a conversation
□ I am bored or randomly exploring Wikipedia for fun
□ this topic is important to me and I want to learn more about it (e.g., to learn about a culture)
□ Other:__________________________

When asked what motivated them to read the article they were sampled from during the 2023 Global Readers Survey, respondents were overall most likely to say the article topic was "personally important" to them. (Respondents were able to select multiple motivations.)

Motivations cited by Wikipedia readers for reading the article they were sampled from

Similarly, at the project level, readers of all surveyed projects except for Korean Wikipedia were most likely to say they were reading the article because it is personally important to them. Korean Wikipedia readers were most likely to say they were "bored or randomly exploring Wikipedia for fun".

Cited motivations for reading article by readers of each surveyed Wikipedia project

Reader Information Needs


Again, following previous survey research, we asked readers about the specific information needs that motivated them to read the article from which they were sampled.

I am reading this article to …
○ look up a specific fact or to get a quick answer
○ get an overview of the topic
○ get an in-depth understanding of the topic

Overall, Wikipedia readers are most likely to say they are reading to "get an overview of the topic". However, reader information needs are fairly evenly distributed, with 41.2% saying they are reading for an "overview", 32.1% to "look up a specific fact or to get a quick answer", and 26.0% to "get an in-depth understanding of the topic".

Information sought in Wikipedia articles by readers

At the project level, Farsi Wikipedia readers are most likely to say they are looking for "an in-depth understanding" (52.5%), Hebrew Wikipedia readers are most likely to say they are seeking "an overview" (50.0%), and Vietnamese Wikipedia readers are most likely to say they need to "look up a specific fact or...get a quick answer" (42.2%).

Wikipedia readers' information needs by project

Reader Topic Prior Knowledge


We presented readers with the same survey question measuring their prior knowledge of the topic of the article they were reading that was used in previous readers surveys.

Prior to visiting this article … 
○ I was already familiar with the topic
○ I was not familiar with the topic, and I am learning about it for the first time

Overall, Wikipedia readers are more likely to say that they are already familiar with the topic they are reading about (55.0%) than not (44.2%).

Wikipedia readers' topic familiarity

Readers of most language projects are similarly more likely to be reading articles on topics with which they are already familiar. However, there are some exceptions: readers of Chinese Wikipedia are particularly likely to be reading on unfamiliar topics (59.5%). In contrast, Dutch Wikipedia readers are most likely to read on familiar topics (74.6%).

Wikipedia readers' topic familiarity by project

Reader Age


Age has been robustly associated with a broad range of social attitudes and behaviors[1][2] (and even survey response quality[3]) in addition to internet use and digital proficiency. Moreover, previous Wikimedia Foundation research has found that Wikipedia readers are disproportionately young. We measured age with the following item drawn from the Community Insights survey.

What is your age?

○ 18-24
○ 25-29
○ 30-39
○ 40-49
○ 50-59
○ 60-69
○ 70+
○ I prefer not to say

Of those 18 and older, respondents across all surveyed projects are most likely to be aged 18-24 (27.9% of readers 18+). However, the age distribution of readers varies considerably across the surveyed projects.

Reader age across all sampled projects (18+ only)

In particular, readers of Vietnamese Wikipedia are most likely to be under the age of 30 (61.5% aged 18-29), while Dutch Wikipedia (21.8% aged 18-29) and German Wikipedia (21.0% aged 18-29) readers are least likely to be under the age of 30.

Age of Wikipedia readers by surveyed project (respondents 18+ only)
Wikipedia readers aged 18-29 by surveyed project

Reader Gender Identity


Johnson et al. (2021)[4] demonstrate key gender differences in Wikipedia readership: men are overrepresented among Wikipedia readers, read more frequently and in longer sessions, and differ from women in their topical preferences. This is consistent with the well-known and persistent gender-based bias of Wikipedia content and the persistent overrepresentation of men among Wikipedia editors.

In order to facilitate comparisons between surveys of Wikipedia readers and contributors to Wikimedia projects, this research employed a gender identity survey item aligned with that used in the 2024 Community Insights survey. Note that respondents to the arwiki, fawiki, and inwiki surveys were not presented with the "transgender", "non-binary", and "genderfluid" response options.

Which of these categories describe your gender identity? Select all that apply.

□ Man
□ Woman
□ Transgender
□ Non-binary
□ Genderfluid
□ Other: _________________
○ I prefer not to say

Across all surveyed projects, a clear majority (63.3%) of respondents identified solely as men, 25.1% identified solely as women, 6.4% identified as gender diverse, and 5.1% declined to provide an answer.

Gender identities of Wikipedia readers (responses recoded to mutually exclusive categories)
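
The recoding of this select-all-that-apply item into mutually exclusive categories can be sketched as follows. The exact recoding rules used in the analysis are not documented on this page, so the rules below are assumptions made for illustration: "I prefer not to say" is treated as declined, a lone "Man" or "Woman" selection maps to the corresponding "only" category, and any other selection or combination is treated as gender diverse. The function name recode_gender and the category labels are hypothetical.

```python
def recode_gender(selected, declined=False):
    """Collapse a multi-select gender answer into one category.

    selected: set of option labels the respondent ticked, e.g. {"Man"}.
    declined: True if the respondent chose "I prefer not to say".
    The mapping is an assumed rule set, not the documented one.
    """
    if declined:
        return "declined"
    if not selected:
        return "no answer"  # assumption: an empty selection is kept separate
    if selected == {"Man"}:
        return "man only"
    if selected == {"Woman"}:
        return "woman only"
    # Assumption: every other selection or combination counts as gender diverse.
    return "gender diverse"


examples = [
    ({"Man"}, False),
    ({"Woman"}, False),
    ({"Man", "Transgender"}, False),
    ({"Non-binary"}, False),
    (set(), True),
]
for selected, declined in examples:
    print(sorted(selected), declined, "->", recode_gender(selected, declined))
```

Recoding in this way is what allows the reported shares to sum to roughly 100% even though respondents could select several options.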

Readers identifying solely as men made up an outright majority in every surveyed project, but projects like Romanian Wikipedia (54.6% readers identifying as men only) and Ukrainian Wikipedia (51.7%) are substantially closer to gender parity than projects like Turkish Wikipedia (71.7%) or Indonesian Wikipedia (70.6%).

Share of readers who identify solely as men by surveyed Wikipedia project

Reader Education


As summarized in the Taxonomy of Knowledge Gaps, a substantial body of research demonstrates that Wikipedia readers are disproportionately highly educated. A related body of research suggests that English Wikipedia articles[5] may not be readable for less literate readers, particularly for health-related content[6][7], while more recent research suggests these findings can be extended to most other language versions.

Measuring educational attainment cross-nationally is a longstanding methodological challenge in survey research[8]. This is further complicated in our case by the fact that Global Readers surveys are designed and sampled by language project rather than by geography (e.g., enwiki respondents alone are educated under a wide variety of very different educational systems). We also sought to balance survey item simplicity with cross-system comparability. Together, these constraints made it difficult for us to substantially localize our measures of educational attainment.

In this survey, we measured education with two survey items: one asking whether respondents were currently enrolled as students and a subsequent item asking non-students to indicate their level of educational attainment based loosely on the ISCED-1997 classifications. We employed this scheme rather than years of education completed as used in previous readers survey research to facilitate more direct comparisons both cross-nationally[9] and with Community Insights data on contributors.

Are you currently enrolled as a student in school (for example, high school, vocational or trade school, a college or university)?
○ Yes
○ No
○ I'm not sure
○ I prefer not to say

Only shown to respondents who selected "No" above

What is the highest level of formal education you have completed?
○ I have no formal schooling
○ Some primary or elementary school
○ Primary or elementary school
○ Lower secondary or middle school
○ Upper secondary or high school
○ A post-secondary technical or vocational degree or certificate
○ A post-secondary or university degree
○ A post-graduate degree (e.g., master's, doctorate, or professional degree)
○ I'm not sure
○ I prefer not to say

Current students


Substantial shares of readers in every surveyed project indicated that they are currently enrolled students, although this varies considerably from fewer than one-in-five overall among Dutch (19.7%) and German (19.5%) Wikipedia readers to an outright majority of Vietnamese Wikipedia readers (54.2%).

Share of readers who are currently students by surveyed project

In addition, current students comprise a majority of younger readers (those 18-29) in each surveyed project.

Proportion of readers aged 18-29 who are currently enrolled as students by project

Educational attainment (non-students)


Overall, Wikipedia readers are highly educated: a majority of non-students (56.0% total) have completed a Bachelor's degree (28.8%) or a post-graduate degree (27.2%).

Proportional shares of Wikipedia readers by educational attainment (non-students only)

At the project level, Indonesian Wikipedia readers are most likely to report an educational attainment at the upper secondary (high school) level or lower, while Polish Wikipedia readers are most likely to report holding a post-graduate degree.

Reader educational attainment (non-students only) by project

Among non-students, Ukrainian Wikipedia readers (76.5%) are most likely overall to report having at least a Bachelor's degree, while Indonesian Wikipedia readers are the least likely (38.1%) relative to other surveyed projects.

Proportion of readers with Bachelor's or Post-graduate degrees by surveyed project (non-students only)

Reader Languages


In general, Wikipedia readers are highly multilingual. When asked what languages they speak fluently, fewer than half (44%) say they are fluent in only one language, while more than one-in-five (21.5%) say they speak three or more fluently. However, readers are overwhelmingly reading in (one of) their primary language(s).

Readers by the number of languages they speak fluently

In all but one surveyed project, about nine-in-ten (or more) readers say they are reading in one of their primary languages. The relative exception to this finding is English Wikipedia, where more than one-in-four say English is not one of their primary languages.

Readers reading in (one of) their primary language(s) by project

In contrast, the prevalence of monolinguality in the project language varies considerably by project. In general, East Asian language projects (and Greek Wikipedia) show the highest levels of monolinguality among readers—especially readers of Japanese Wikipedia (90.1%). Conversely, readers of German Wikipedia (22.3%) and Turkish Wikipedia (22.8%) were least likely to say they were monolingual in the project language.

Monolingual fluency among Wikipedia readers across 22 surveyed projects

Reader Identities


In order to measure cultural background gaps, as described in the Taxonomy of Knowledge Gaps, we employ survey items adapted from the European Social Survey[10] (and also used in the Community Insights survey of Wikimedia contributors) designed to measure:

  • Whether respondents belong to a minority ethnicity in the country where they live
  • Whether respondents belong to a group that is discriminated against in the country where they live
  • The grounds on which they are discriminated against (if applicable)

Minority Ethnicity

Do you belong to a minority ethnic group in the country where you currently live?
○ Yes
○ No
○ I'm not sure
○ I prefer not to say

The UN Office of the High Commissioner for Human Rights (OHCHR) roughly estimates that 10-20 percent of the world population belongs to a national, ethnic, religious, or linguistic minority. This is broadly consistent with our Global Readers sample, where 15% of respondents indicate that they belong to an ethnic minority in the country where they live.

Share of Wikipedia readers belonging to a minority ethnic group

At the project level, readers of idwiki (20.7%) and enwiki (19.6%) are most likely to identify as an ethnic minority. Conversely, readers of itwiki (4.0%) and elwiki (3.7%) are least likely to identify as belonging to a minority ethnic group.

Share of Wikipedia readers belonging to a minority ethnic group in each surveyed project

Discriminated Group Belonging

Sometimes people are discriminated against based on characteristics like abilities, physical appearance, or group belonging.

Would you describe yourself as a member of a group that has been discriminated against in the country where you currently live?

○ Yes
○ No
○ I'm not sure
○ I prefer not to say

One-in-four (25.0%) readers indicated that they belong to a group that is discriminated against in the country where they live. These findings are broadly similar to those reported in the 2023 Community Insights survey of Wikimedia contributors.

Share of Wikipedia readers belonging to discriminated groups

Readers of English Wikipedia appear most likely to describe themselves as belonging to a discriminated group (31.9%). Readers of Vietnamese Wikipedia are the least likely to identify that way (5.5%). Unfortunately, we are not able at this point to determine the extent to which project-level variation on this item is the product of different experiences, varying levels of willingness to identify as belonging to a marginalized group, or varying understandings of what it means to be discriminated against.

Share of Wikipedia readers belonging to discriminated groups by project

Readers who indicated that they belonged to a discriminated group were then asked to indicate on what grounds their identity/identities are discriminated against. Respondents were able to select as many as applied. Overall, readers were most likely to say they were discriminated against due to their gender (29.4%) or their skin color or race (27.6%).

Reasons for discrimination named by Wikipedia readers, ordered by frequency





This project employed simple random sampling of Wikipedia readers using the QuickSurveys extension. Sampling rates vary by project and are shown above. The QuickSurveys opt-in was displayed to non-logged-in readers only and asked whether they would consent to "Take a short survey and help us improve Wikipedia". We chose to employ the QuickSurvey tool to sample readers (rather than e.g., a Central Notice Banner) both for consistency with previous readers research conducted by the Wikimedia Foundation and to avoid sampling readers from non-article pages (e.g., talk pages, community pages, Wikipedia home pages).


Readers who consented to the survey were then linked out to a survey hosted on LimeSurvey, an open-source survey platform.




In order to account for sampling design and to better match the global population of Wikipedia readers, we apply weights based on global population parameters following the method described in DeBell and Krosnick (2009)[11] implemented using the 'anesrake' software package written for R.

Survey responses were weighted at the project level by OS family (Android, iOS, Windows, other), referrer class (external via search engine, internal, other), session length (one, two, three or more), and geography (weighting categories vary by project). For analyses at the global level, responses were also weighted by project shares of overall traffic during the time when the surveys were in the field.
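
The general idea behind raking (iterative proportional fitting), the technique that the 'anesrake' package implements, can be illustrated with a minimal Python sketch. This is not the project's weighting code: the analysis used the R package, which offers further options (such as limits on extreme weights) that are not reproduced here. The function names, the toy respondents, and the target shares below are all invented for illustration and are not the population parameters used in the survey.

```python
def rake(rows, targets, max_iter=1000, tol=1e-10):
    """Minimal raking sketch.

    rows: list of dicts mapping variable name -> category, one per respondent.
    targets: {variable: {category: population share}}, shares summing to 1.
    Assumes every target category occurs at least once in the sample.
    Returns one weight per respondent, scaled to a mean of 1.
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            # Current weighted total for each category of this variable.
            totals = {}
            for row, w in zip(rows, weights):
                totals[row[var]] = totals.get(row[var], 0.0) + w
            total_weight = sum(totals.values())
            # Factor that moves each category's weighted share onto its target.
            factors = {c: shares[c] * total_weight / t for c, t in totals.items()}
            for i, row in enumerate(rows):
                f = factors[row[var]]
                max_change = max(max_change, abs(f - 1.0))
                weights[i] *= f
        if max_change < tol:  # all margins already match their targets
            break
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]


def weighted_share(rows, weights, var, cat):
    """Weighted share of respondents whose `var` equals `cat`."""
    hit = sum(w for r, w in zip(rows, weights) if r[var] == cat)
    return hit / sum(weights)


# Toy sample of six respondents (invented data).
rows = [
    {"os": "Android", "referrer": "external"},
    {"os": "Android", "referrer": "internal"},
    {"os": "iOS", "referrer": "external"},
    {"os": "Windows", "referrer": "external"},
    {"os": "Windows", "referrer": "internal"},
    {"os": "Windows", "referrer": "internal"},
]
# Hypothetical population margins (not the survey's real parameters).
targets = {
    "os": {"Android": 0.5, "iOS": 0.2, "Windows": 0.3},
    "referrer": {"external": 0.7, "internal": 0.3},
}
weights = rake(rows, targets)
print([round(w, 3) for w in weights])
print(round(weighted_share(rows, weights, "os", "Android"), 3))
```

For global-level estimates, project-level weights of this kind would additionally be rescaled so that each project's total weight is proportional to its share of overall traffic, as described above.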

Planned Future Analysis


We plan to conduct the following further analyses of the 2023 Global Readers survey data and to share their results here:

  • Analysis linking demographic data from this survey with reader behavior (e.g., topics read) building on work conducted by Johnson et al. (2021)[4].
  • Analysis linking educational attainment with language-agnostic article readability scores developed by Trokhymovych et al. (2024) [12]
  • A comparative analysis of reader demographics from this study with data collected through a central notice banner-sampled study of German, English, Spanish, French, Italian, Portuguese, Turkish, and Ukrainian Wikipedias conducted by Cruciani et al.[13] in May-June of 2023.


  1. Neundorf, Anja; Niemi, Richard G. (2014). "Beyond political socialization: New approaches to age, period, cohort analysis". Electoral Studies: 1–6. ISSN 0261-3794. doi:10.1016/j.electstud.2013.06.012. 
  2. Dinas, Elias; Stoker, Laura (2014). "Age-Period-Cohort analysis: A design-based approach". Electoral Studies: 1–6. ISSN 0261-3794. doi:10.1016/j.electstud.2013.06.006. 
  3. Andrews, Frank M.; Herzog, A. Regula (1986). "The Quality of Survey Data as Related to Age of Respondent". Journal of the American Statistical Association 81 (394): 403–410. doi:10.1080/01621459. 
  4. Johnson, Isaac; Lemmerich, Florian; Sáez-Trumper, Diego; West, Robert; Strohmaier, Markus; Zia, Leila (2021). "Global Gender Differences in Wikipedia Readership". Proceedings of the International AAAI Conference on Web and Social Media, 15(1): 254–265. doi:10.1609/icwsm.v15i1.18058. 
  5. Lucassen, Teun; Dijkstra, Roald; Schraagen, Jan Maarten (2012). "Readability of Wikipedia". First Monday 17 (9). ISSN 1396-0466. doi:10.5210/fm.v0i0.3916. 
  6. Reavley, NJ; Mackinnon, AJ; Morgan, AJ; Alvarez-Jimenez, M; Hetrick, SE; Killackey, E; Nelson, B; Purcell, R; Yap, MBH; Jorm, AF (2012). "Quality of information sources about mental disorders: a comparison of Wikipedia with centrally controlled web and printed sources". Psychological Medicine 42 (8): 1753–1762. doi:10.1017/S003329171100287X. 
  7. Brezar, Aleksandar; Heilman, James (2019). "Readability of English Wikipedia's health information over time". WikiJournal of Medicine 6 (1): 1–6. ISSN 2002-4436. doi:10.15347/wjm/2019.007. 
  8. Connelly, Roxanne; Gayle, Vernon; Lambert, Paul S. (2016). "A review of educational attainment measures for social survey research". Methodological Innovations 9: 1–11. ISSN 2059-7991. doi:10.1177/2059799116638001. 
  9. Schneider, Silke L.; Gayle (2010). "Nominal comparability is not enough: (In-)equivalence of construct validity of cross-national measures of educational attainment in the European Social Survey". Research in Social Stratification and Mobility 28: 343–357. doi:10.1016/j.rssm.2010.03.001. 
  10. European Social Survey European Research Infrastructure (ESS ERIC) (2023), ESS round 10 - 2020. Democracy, Digital social contacts. Sikt - Norwegian Agency for Shared Services in Education and Research., doi:10.21338/NSD-ESS10-2020 
  11. DeBell, Matthew; Krosnick, Jon A. (2009). "Computing Weights for American National Election Study Survey Data" (PDF). ANES Technical Report series (nes012427): 1–14. 
  12. Trokhymovych, Mykola; Sen, Indira; Gerlach, Martin (June 3, 2024). "An Open Multilingual System for Scoring Readability of Wikipedia". arXiv:2406.01835.
  13. Cruciani, Caterina; Joubert, Léo; Jullien, Nicolas; Mell, Laurent; Piccione, Sasha; Vermeirsche, Jeanne (2023-12-01). "Surveying Wikipedians: a dataset of users and contributors' practices on Wikipedia in 8 languages". arXiv:2311.07964.  Dataset: Cruciani, Caterina; Joubert, Léo; Jullien, Nicolas; Mell, Laurent; Piccione, Sasha; Vermeirsche, Jeanne (2023-12-01). Surveying Wikipedians: a dataset of users and contributors' practices on Wikipedia in 8 languages. doi:10.34847/nkl.4ecf4u8m.