Grants:Programs/Wikimedia Research Fund/Changes in Wikipedia’s Gender Gap over Time and by Subject

status: not funded
Changes in Wikipedia’s Gender Gap over Time and by Subject
start and end dates: July 2023 - July 2024
budget (USD): 35,000 USD
fiscal year: 2022-23
applicant(s): Michael Mandiberg

Overview

Applicant(s)

Michael Mandiberg

Affiliation or grant type

College of Staten Island and The Graduate Center, CUNY

Author(s)

Michael Mandiberg

Wikimedia username(s)

Michael Mandiberg: User:Theredproject

Project title

Changes in Wikipedia’s Gender Gap over Time and by Subject

Research proposal

Description

Description of the proposed project, including aims and approach. Be sure to clearly state the problem, why it is important, why previous approaches (if any) have been insufficient, and your methods to address it.

I seek to understand how the gender gap on English Wikipedia has changed over time, and across subject areas. This research explores four interrelated questions:

RQ1: How has the gender gap changed in the ~20 years since Wikipedia’s origin? Do some subject areas have larger gaps than others? For example, it seems likely that military articles have a larger gap than articles about musicians. Has gender gap community organizing closed the gap more in the areas it has focused on, such as art and science?

RQ2: How does the gender gap change by birthdate of the article subject? Clearly the gap will be greater further back in history, versus living subjects, but how pronounced is this effect?

RQ3: Has the quality of articles about cisgender women and trans and non-binary people improved relative to articles about men? Has this change been more or less pronounced by subject area?

RQ4: How has editor behavior changed in order to close this gap? Have editors who once primarily created articles about men shifted to creating articles about women? Or have new editors joined to contribute articles about women? Have these articles been created by very active editors, casual/one-time editors, or a combination of the two?

I will perform data analysis using the topic model I created to categorize all of English Wikipedia’s 1.8M biographies. Using the page IDs and QIDs, I will query the Wikipedia API, Wikidata API, and Wikipedia toolserver database instances to determine the page creation date, the page creator, the gender of the article subject, and all revision IDs. From the revision IDs I will use the ORES API to score each article over time, following Aaron Halfaker’s “Keilana Effect” methodology.
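The lookups described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the helper names are hypothetical, the functions only build request parameters and parse already-fetched JSON (no network calls), and the 0 to 5 class weights are an assumption modeled on the weighted-sum quality measure associated with the “Keilana Effect” analysis. The response shapes assume the default JSON format of the MediaWiki Action API and Wikidata's wbgetentities module, and the probability dictionary is assumed to come from an article quality model such as the one ORES exposed.

```python
# Illustrative sketch only. Helper names are hypothetical; the response
# shapes assume the default JSON output of the MediaWiki Action API and
# of Wikidata's wbgetentities module.

# Assumed weights for a weighted-sum quality score (Stub=0 ... FA=5).
QUALITY_WEIGHTS = {"Stub": 0, "Start": 1, "C": 2, "B": 3, "GA": 4, "FA": 5}


def creation_query_params(page_id):
    """Build Action API parameters that request a page's first revision."""
    return {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "pageids": str(page_id),
        "rvlimit": "1",
        "rvdir": "newer",  # list revisions oldest first
        "rvprop": "ids|timestamp|user",
    }


def parse_creation(response, page_id):
    """Return (revision ID, creation timestamp, creator) for a page.

    The creator is None when the username is hidden in the response.
    """
    first = response["query"]["pages"][str(page_id)]["revisions"][0]
    return first["revid"], first["timestamp"], first.get("user")


def parse_gender_qids(response, qid):
    """Return the QIDs of all 'sex or gender' (P21) values on a Wikidata item."""
    claims = response["entities"][qid].get("claims", {}).get("P21", [])
    return [
        claim["mainsnak"]["datavalue"]["value"]["id"]
        for claim in claims
        if claim["mainsnak"].get("snaktype") == "value"
    ]


def weighted_quality(probabilities):
    """Collapse per-class probabilities into a single score from 0 to 5."""
    return sum(QUALITY_WEIGHTS[label] * p for label, p in probabilities.items())
```

Returning every P21 value, rather than only the first, keeps subjects with multiple or non-binary gender statements visible to later analysis. Scoring each revision with `weighted_quality` and ordering the results by timestamp would give a per-article quality trajectory.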

These are all novel questions. No one has explored the gender gap over time. Beyond Halfaker’s paper, no one has evaluated the gender gap by overall article quality (only quantity). And I don’t believe anyone has looked at longitudinal changes in editor behavior. While this is an ambitious project, the scope of work is reasonable for the grant funding period, given that I will be able to dedicate myself to this work.

I am uniquely suited to conduct this analysis. I have worked as a gender gap event facilitator and community organizer since 2014, and have been exploring Wikipedia’s data structures since 2008. I know what questions to ask and how to find the data, and I have the experiential knowledge to contextualize what I discover.

Personnel

N/A

Budget

Approximate amount requested in USD.

35,000 USD

Budget Description

Briefly describe what you expect to spend money on (specific budgets and details are not necessary at this time).

32,500 Salary/Stipend (This is the CUNY designated amount required to trigger Scholar Incentive Award matching funds and relieve me of my teaching for the year, allowing me to dedicate myself to this research.)

32,500 Matching Salary/Stipend (Scholar Incentive Award, secured from CUNY)

2,000 Conference travel, registration, and accommodation (Secured from CUNY)

2,500 OA publishing costs

5,000 High Performance Computing Cluster (Secured from CUNY)

Total: 74,500

WMF Request: 35,000

Impact

Address the impact and relevance to the Wikimedia projects, including the degree to which the research will address the 2030 Wikimedia Strategic Direction and/or support the work of Wikimedia user groups, affiliates, and developer communities. If your work relates to knowledge gaps, please directly relate it to the knowledge gaps taxonomy.

This research aids the 2030 Wikimedia Strategic Direction’s knowledge equity goals. The community and the WMF have dedicated much time, energy, and resources to the effort to change Wikipedia’s culture and content. The WMF has supported dozens of initiatives with millions of dollars. Tens of thousands of people have made tens of millions of edits. Tools like the Wikipedia Diversity Observatory have allowed real-time calculations of the current gender gap. A limited number of papers have evaluated the impact of feminist editing campaigns. Yet the Foundation and the community lack a more nuanced understanding of what has changed, and how it has changed. This knowledge will help the community evaluate, iterate, and adapt for the future.

Dissemination

Plans for dissemination.

I will write a general audience text for The Atlantic on RQ3 and two articles on RQ1/RQ2 and RQ4 (targeting New Media & Society and the International Journal of Communication). I will focus my conference presentations on community events, including Wikimania, WikiConNA, Wiki Workshop, and the Wikimedia Research Showcase. As with my previous research, I will share code on GitHub and data on Zenodo that can be used in future research. And I will disseminate my findings via the international press.

Past Contributions

Prior contributions to related academic and/or research projects and/or the Wikimedia and free culture communities. If you do not have prior experience, please explain your planned contributions.

My exploration of the Wikipedia database began in 2008, when I started working on Print Wikipedia, which I completed in 2015 (https://en.wikipedia.org/wiki/Print_Wikipedia). In 2014 I was one of four co-founders of Art+Feminism; I remained a lead organizer and then board member until 2020 (https://en.wikipedia.org/wiki/Art%2BFeminism). In the last few years, I have shifted my focus to publishing data analysis research. I published a topic model of all of English Wikipedia’s biographies in Art Documentation and have a forthcoming (March 2023) article in Social Text that explores English Wikipedia’s and Wikidata’s race and ethnicity gap. I have also published more accessible texts in The Atlantic written for a general audience.


I agree to license the information I entered in this form excluding the pronouns, countries of residence, and email addresses under the terms of Creative Commons Attribution-ShareAlike 4.0. I understand that the decision to fund this Research Fund application, the application itself along with all the information entered by me in this form excluding the pronouns, countries of residence, and email addresses of the personnel will be published on Wikimedia Foundation Funds pages on Meta-Wiki and will be made available to the public in perpetuity. To make the results of your research actionable and reusable by the Wikimedia volunteer communities, affiliates and Foundation, I agree that any output of my research will comply with the WMF Open Access Policy. I also confirm that I have read the privacy statement and agree to abide by the WMF Friendly Space Policy and Universal Code of Conduct.

Yes