Grants:Programs/Wikimedia Research Fund/Social and Language Influence in Wikipedia Articles for Deletion Debates

Social and Language Influence in Wikipedia Articles for Deletion Debates
Start and end dates: March 2022 - August 2022
Budget (USD): 5,000-9,999 USD
Applicant(s): Linda Wang

Overview


Applicant's Wikimedia username. If one is not provided, then the applicant's name will be provided for community review.

Linda Wang

Project title

Social and Language Influence in Wikipedia Articles for Deletion Debates

Entity Receiving Funds

Provide the name of the individual or organization that would receive the funds.

Linda Wang

Research proposal

Description

Description of the proposed project, including aims and approach. Be sure to clearly state the problem, why it is important, why previous approaches (if any) have been insufficient, and your methods to address it.

Broadly, “herding” occurs when everyone does what everyone else is doing, even when private information suggests otherwise (Banerjee 1992). Understanding the factors that drive herding is crucial, because herding may influence social decision-making in high-stakes contexts like financial markets and political elections. A better understanding can improve policies on decision-making processes, especially if the factors that drive herding are associated with suboptimal decisions.

There is evidence that herding may be related to the difficulty, expertise, and/or sentiment associated with the choice or action at hand (Spyrou 2013). Intuitively, it is worthwhile to study herding with text, given that text can capture these and other factors. Our goal is to determine the relative importance of social and textual factors in guiding decisions within selected text data.

Thus, we focus on the Wikipedia Articles for Deletion (AfD) debates. Removing a Wikipedia article requires a Wikipedia user to post a nomination on AfD. Other users then post a “keep” or “delete” vote (among other options, which we omit for simplicity), along with a comment describing their rationale. By examining the votes and comments, we can determine the extent to which users conform to the existing majority of the debate.

Several works support the existence of herding behavior in AfD. Taraborelli and Ciampaglia (2010), for example, used baseline probabilities to provide evidence of herding on early votes. Mayfield and Black (2019) later showed that language features in AfD could predict herding using a BERT-based model. We hope to further contribute by using a model to predict individual votes rather than debate outcomes in AfD; controlling for the confound of debate length; and partially addressing the endogeneity issue of user preferences. We will do this by, firstly, constructing a binary logistic model of user votes using subsets of features of the preceding votes. These features will include proportions of previous “keep” or “delete” votes, sentiment scores, and references to previous users. Secondly, we will separately compute features on the first and second halves of the preceding votes. Lastly, we will re-compute the features from an artificial one-shot game setting (i.e. by focusing only on votes whose timestamps fall under a predetermined threshold) to check if herding is distinguishable from preferences for particular vote types.
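The first two modeling steps might be sketched as follows, under several simplifying assumptions: the only feature is the proportion of "keep" votes, computed separately on the first and second halves of the preceding votes; sentiment scores, references to previous users, and the timestamp threshold are omitted; and the logistic model is fit with plain gradient descent purely to keep the example self-contained. All function names and the toy debates are hypothetical and are not taken from the project's code.

```python
import math
from typing import List, Tuple


def keep_share(votes: List[str]) -> float:
    """Proportion of "keep" votes; 0.5 (no signal) for an empty list."""
    return votes.count("keep") / len(votes) if votes else 0.5


def features(prior: List[str]) -> List[float]:
    """Keep share over the first and second halves of the preceding votes."""
    mid = len(prior) // 2
    return [keep_share(prior[:mid]), keep_share(prior[mid:])]


def build_rows(debate: List[str]) -> List[Tuple[List[float], int]]:
    """One (features, label) row per vote with at least one predecessor.

    The label is 1 for "keep" and 0 for "delete".
    """
    return [
        (features(debate[:i]), int(debate[i] == "keep"))
        for i in range(1, len(debate))
    ]


def fit_logistic(rows, lr: float = 0.5, epochs: int = 2000) -> List[float]:
    """Batch gradient descent on the log-loss; returns [bias, w_1, ..., w_k]."""
    w = [0.0] * (len(rows[0][0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in rows:
            z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


# Two hypothetical debates in which every voter follows the earlier votes.
debates = [["keep"] * 5, ["delete"] * 5]
rows = [row for debate in debates for row in build_rows(debate)]
weights = fit_logistic(rows)
```

In this toy data the weight on the second-half keep share comes out positive, which is the pattern a herding effect on recent votes would produce. With real AfD data one would likely use an established statistics package and add the remaining features before interpreting any coefficients.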

Budget

Approximate amount requested in USD.


Budget Description

Briefly describe what you expect to spend money on (specific budgets and details are not necessary at this time).

We would like to spend the money on stipend and computing costs for the duration of the project. Stipend costs will be approximately $6,000, and computing costs will be approximately $300.

Impact

Address the impact and relevance to the Wikimedia projects, including the degree to which the research will address the 2030 Wikimedia Strategic Direction and/or support the work of Wikimedia user groups, affiliates, and developer communities. If your work relates to knowledge gaps, please directly relate it to the knowledge gaps taxonomy.

We believe that this project may assist the administrators who determine the final outcomes of articles on AfD, especially since these outcomes are not necessarily equivalent to the popular vote. Whether herding helps or hurts the efficacy of AfD debates, administrators may be better able to detect it and adjust for it.

More generally, our project may be useful to the 2030 Wikimedia Strategic Direction’s recommendation for “improving user experience”. To enable more participation in Wikimedia projects, it is necessary to understand the conditions under which users are willing to participate when they hold opinions that oppose those of existing users. Herding is an indirect contributor to those conditions.

Dissemination

Plans for dissemination.

The results of this project will be disseminated on the authors' website. Additional dissemination will occur at presentations and conferences. We will comply with the WMF Open Access Policy and create project pages and reports as required by Wikimedia.

Past Contributions

Prior contributions to related academic and/or research projects and/or the Wikimedia and free culture communities. If you do not have prior experience, please explain your planned contributions.

Previously, the authors worked on an earlier version of the project in a graduate-level computer science class at Cornell University. This earlier version contained more exploratory analyses of AfD, and we utilized tools such as fighting words and topic modeling to distinguish the language of “keep” versus “delete” votes.

Additionally, the primary applicant previously worked on data projects related to labor market networks in Python and R at the Federal Reserve Bank of New York, and she later transitioned to research in economics and computational social sciences in graduate school. The quantitative skill set gained from these experiences would be an asset for this project.

I agree to license the information I entered in this form excluding the pronouns, countries of residence, and email addresses under the terms of Creative Commons Attribution-ShareAlike 4.0. I understand that the decision to fund this Research Fund application, the application itself along with all the information entered by me in this form excluding the pronouns, countries of residence, and email addresses of the personnel will be published on Wikimedia Foundation Funds pages on Meta-Wiki and will be made available to the public in perpetuity. To make the results of your research actionable and reusable by the Wikimedia volunteer communities, affiliates and Foundation, I agree that any output of my research will comply with the WMF Open Access Policy. I also confirm that I have read the privacy statement and agree to abide by the WMF Friendly Space Policy and Universal Code of Conduct.