Research:Wikipedia Knowledge Integrity Risk Observatory/Literature review

Given the relevance of Wikipedia as a global knowledge infrastructure and the proliferation of misbehaviour and misinformation on web platforms, a range of works have analyzed the integrity of knowledge on Wikimedia projects. This section reviews studies conducted by WMF employees, academic researchers, and journalists. For each study, excerpts highlighting findings and conclusions are listed below it, labelled with the corresponding risk category for knowledge integrity on Wikipedia.

WMF reports

Sáez-Trumper (2019)

This work performs a literature review to identify the most popular mechanisms for spreading online disinformation, as well as the mechanisms used to fight it[1]. The article lists different vulnerabilities of Wikipedia, two of them marked as high risk:

  • Web Brigades: “a set of users coordinated to introduce fake content by exploiting the weakness of communities and systems”.
  • Circular reporting: “a situation where a piece of information appears to come from multiple independent sources, but in reality comes from only one source”.
  • Community capacity: Moreover, the manual fact checking done by Wikipedia editors is difficult to scale in projects without enough volunteers, increasing the risks of attacks in smaller or under-resourced communities.

Morgan (2019)

This report compiles the conclusions of a research project to understand the needs, priorities, and workflows of patrollers in Wikimedia projects[2]. The study borrows the notion of ‘threat model’ from cybersecurity to describe how malicious actors could exploit the way Wikimedia’s sociotechnical systems are designed in order to subvert those systems and circumvent procedures. Threats are characterized by distinguishing between:

  • fast patrolling (generally mediated by the RecentChanges feed), and
  • slow patrolling (most frequently mediated by editors’ personal watchlists, but also sometimes mediated through dashboards, bots and WikiProjects),

considering the following factors:

  1. Types of vandalism
  2. Content model (longform text for Wikipedia projects vs rich media and structured data for Commons and Wikidata)
  3. Project size in terms of edit volume
  4. Project size in terms of active registered editors
  5. Elevated user rights
  6. Availability of specialized tools
  • Community capacity: On larger projects, there are often enough editors who self-select into patrolling work (with or without the patroller userright) that they are able to provide near-real-time coverage of recent changes on a consistent basis. Projects with fewer active editors may not be able to ensure real-time review; however, if the volume of edits is correspondingly small, edit review may be a matter of a couple editors performing a daily or weekly batch review of recent changes and catching the majority of recent vandalism that way. (...) A project with at least a few active editors with user rights such as rollbacker, administrator, and checkuser will be better able to identify and address many forms of vandalism efficiently. (...) There is no canonical list of all specialized tools that editors have developed and deployed to support patrolling. (...) Major bots and assistive editing programs do not work with many projects. (...) Smaller Wikipedias tend to have fewer local tool-builders and tool-maintainers (...) Effective AbuseFilters are difficult to develop (...) Smaller projects may lack sufficient local editors with elevated userrights or subject matter expertise (...) Smaller projects are less well equipped to deal with coordinated high-volume external attacks, and may need to reach out to Stewards and Global CheckUsers, so they have less control over the timing or outcome of the intervention (...) Smaller Wikipedias have a higher risk of being hijacked by insiders.
  • Community governance: There may not be local rapid-response noticeboards (like AN/I on English Wikipedia) available on smaller wikis.
  • Community demographics: Editors may create accounts and then let them lie inactive for a while (potentially after making a small number of innocuous edits) to avoid certain patrolling mechanisms that call attention to activity by very new accounts and/or accounts with very few edits. (...) There are hundreds of thousands of Wikipedia accounts that have never edited, or that haven't made edits in many years. If these accounts are no longer monitored by their creators and don't have secure passwords, they are susceptible to being hacked.
  • Media: When an article, or set of related articles, receives a great deal of traffic from social media sites like Facebook or YouTube (which use Wikipedia to fact check controversial UGC) or forums like Reddit (which has been used in the past to coordinate large-scale vandalism), and the article subsequently receives a high volume of edits from IPs or newly registered accounts, this may be a sign of coordinated vandalism.
  • Geopolitics: Vandalism can range from persistent disruption-for-disruption's-sake to externally-coordinated long-term disinformation campaigns run by well-resourced interested parties such as ideologically-motivated interest groups, corporations, or even potentially nation states.

Academic research

Joshi et al. (2020)

This paper proposes a machine learning-based framework to identify undisclosed paid articles on Wikipedia[3]. The framework is based on article-based features (age of user account at article creation, infoboxes, number of references, number of photos, number of categories, content length, network-based features) and user-based features (username-based features, average size of added text, average time difference, ten-byte ratio, percentage of edits on User or Talk pages).
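Purely as an illustration of how article- and user-based features of this kind could be assembled and fed to an off-the-shelf classifier, the following minimal sketch uses hypothetical field names, toy data, and a random forest; none of these are the authors' exact features or model.

```python
# Illustrative sketch only: combining article- and user-based signals to flag
# possible undisclosed paid editing. Field names, toy data and the random forest
# are assumptions for this sketch, not the authors' implementation.
from sklearn.ensemble import RandomForestClassifier

def extract_features(article, user):
    """Build one feature vector from (assumed) article and user metadata."""
    return [
        user["account_age_days_at_creation"],    # age of the account when the article was created
        article["num_references"],
        article["num_images"],
        article["num_categories"],
        article["content_length"],
        user["avg_added_text_bytes"],
        user["avg_seconds_between_edits"],
        user["ten_byte_ratio"],                  # share of the user's edits smaller than 10 bytes
        user["pct_edits_on_user_or_talk_pages"],
    ]

# Toy labelled examples: 1 = undisclosed paid article, 0 = legitimate article.
paid = extract_features(
    {"num_references": 2, "num_images": 3, "num_categories": 1, "content_length": 2400},
    {"account_age_days_at_creation": 4, "avg_added_text_bytes": 35,
     "avg_seconds_between_edits": 90, "ten_byte_ratio": 0.38,
     "pct_edits_on_user_or_talk_pages": 0.05})
legit = extract_features(
    {"num_references": 25, "num_images": 1, "num_categories": 6, "content_length": 18000},
    {"account_age_days_at_creation": 900, "avg_added_text_bytes": 410,
     "avg_seconds_between_edits": 5200, "ten_byte_ratio": 0.34,
     "pct_edits_on_user_or_talk_pages": 0.30})

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit([paid, legit], [1, 0])
print(clf.predict([paid, legit]))  # predicted labels for the toy examples
```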

  • Community demographics: The percentage of edits that are less than 10 bytes in size is higher for users who created undisclosed paid articles: the value of the ten-byte ratio feature is, on average, 0.38 for positive articles and 0.34 for negative ones. This pattern aligns with the typical behavior of UPEs, who make around 10 minor edits, then remain quiet for a few days waiting to become autoconfirmed users (the process takes 4 days), and then create a promotional article, after which the account goes silent.

Spezzano et al. (2019)

This work presents a binary classifier[4] to identify which pages should be protected, evaluated on four different Wikipedia projects (English, German, French, and Italian Wikipedia), using the features listed below (a hedged sketch of how the base features could be computed follows the list):

  • Base features: Total average time between revisions, Total number of users making five or more revisions, Total average number of revisions per user, Total number of revisions by non-registered users, Total number of revisions made from mobile device, Total average size of revisions
  • Temporal features: The base features computed within specific time frames
  • Page category-based features: Number of categories
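As a rough illustration of the base features above, the sketch below computes them from a single page's revision log. The field names and toy revision data are assumptions; in the paper such vectors, together with their temporal and category-based counterparts, feed a binary "protect / do not protect" classifier.

```python
# Hedged sketch: computing the base features listed above from one page's revision log.
from collections import Counter

def base_features(revisions):
    """revisions: list of dicts with 'user', 'timestamp' (unix seconds),
    'anonymous' (bool), 'mobile' (bool) and 'size_bytes' for a single page."""
    revisions = sorted(revisions, key=lambda r: r["timestamp"])
    gaps = [b["timestamp"] - a["timestamp"] for a, b in zip(revisions, revisions[1:])]
    per_user = Counter(r["user"] for r in revisions)
    return [
        sum(gaps) / len(gaps) if gaps else 0.0,                    # avg time between revisions
        sum(1 for c in per_user.values() if c >= 5),               # users making 5+ revisions
        len(revisions) / len(per_user),                            # avg revisions per user
        sum(r["anonymous"] for r in revisions),                    # revisions by non-registered users
        sum(r["mobile"] for r in revisions),                       # revisions made from mobile devices
        sum(r["size_bytes"] for r in revisions) / len(revisions),  # avg size of revisions
    ]

# Toy revision log: one frequently edited page with many small anonymous edits.
page = [{"user": "ip-editor", "timestamp": t, "anonymous": True, "mobile": t % 2 == 0,
         "size_bytes": 40} for t in range(0, 600, 60)]
print(base_features(page))
```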
  • Content controversiality: We investigate the relationship between controversial topics and page protection and between page popularity and page protection. In fact, edit wars often happen on controversial pages and edit warring is one of the main reasons why pages are protected, while, to gain visibility, vandalism may happen on popular pages. (...) the controversy level of a page is not a good indicator for deciding whether to protect Wikipedia pages. (...) Edit-warring is a phenomenon that happens on both controversial and protected pages, but this does not imply that controversial and protected pages are necessarily related. (...) page popularity is more correlated than the page controversy level.

Lewoniewski et al. (2019)

This study does not focus on problems with knowledge integrity but on how popular selected topics are among readers and authors[5]. However, its results are based on an exhaustive assessment of the quality of Wikipedia articles in over 50 projects.

  • Content quality: In order to discern the quality of content, the Wikipedia community created a grading system for articles. However, each language version can use its own standards and grading scale.
  • Content verifiability: High-quality articles are expected to use reliable sources.

Lewoniewski et al. (2017)

To overcome the limitation that each Wikipedia project has its own rules for grading content quality, this work assesses the relative quality and popularity of millions of articles in multiple projects[6].

  • Content quality: If an article has a relatively large number of editors and edits, then often this article will be of high quality (...) A quality evaluation of Wikipedia articles can also be based on special quality flaw templates.

Kumar et al. (2016)

This paper presents a study of thousands of hoaxes on Wikipedia to understand their impact, characteristics, and detection[7]. Machine learning is deployed to classify articles as hoaxes based on appearance features (plain-text length, plain-text-to-markup ratio), link network features (ego-network clustering coefficient), support features (frequency with which an article’s name appears in other Wikipedia articles before it is created), and editor features (number of prior edits and editor age). Results indicate that appearance features do no better than random, while link network features (which are expensive to compute) and editor features boost performance.
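As an illustration only, the sketch below assembles the four feature families named above for a candidate article, using networkx for the ego-network clustering coefficient; the function signature, field names, and toy link graph are assumptions rather than the authors' code.

```python
# Hedged illustration of the feature families described above (appearance,
# link-network, support, editor) for one candidate article.
import networkx as nx

def hoax_features(plain_text, wikitext, link_graph, title,
                  prior_mentions, creator_prior_edits, creator_account_age_days):
    return {
        # Appearance features
        "plain_text_length": len(plain_text),
        "text_to_markup_ratio": len(plain_text) / max(len(wikitext), 1),
        # Link-network feature: clustering coefficient of the article's ego network
        "ego_clustering": nx.clustering(link_graph, title) if title in link_graph else 0.0,
        # Support feature: how often the title was mentioned before the article existed
        "prior_mentions": prior_mentions,
        # Editor features
        "creator_prior_edits": creator_prior_edits,
        "creator_account_age_days": creator_account_age_days,
    }

# Toy wikilink graph: nodes are article titles, edges are links between articles.
G = nx.Graph([("New article", "A"), ("New article", "B"), ("A", "B"), ("B", "C")])
print(hoax_features("Some plain text.", "'''Some''' [[plain]] text.", G,
                    "New article", prior_mentions=0,
                    creator_prior_edits=3, creator_account_age_days=1))
```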

  • Community demographics: The originators of typical legitimate articles are established members of the Wikipedia community.

Kumar et al. (2015)

This article describes a machine learning system for vandal early warning on Wikipedia that outperformed the best known vandalism detection algorithms at the time[8]. A hedged sketch of such behavior-based features is given after the excerpt below.

  • Community demographics: Benign users:
      * are much more likely to occur on meta-pages.
      * take longer to edit a page than a vandal user.
      * are much more likely to re-edit the same page quickly (within 3 minutes) as compared to vandals, possibly because they wanted to go back and improve or fix something they previously wrote.
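The following sketch, referenced above, derives behavior-based features in the spirit of this excerpt (share of edits on meta-pages, time spent per edit, and quick re-edits of the same page within 3 minutes) from one user's edit history. The record fields, including the edit-duration field, and the toy data are assumptions for illustration, not the system's actual inputs.

```python
# Hedged sketch of behavior-based features along the lines of the excerpt above.
from statistics import median

def behavior_features(edits):
    """edits: chronological list of dicts with 'page', 'namespace',
    'timestamp' (unix seconds) and 'edit_duration_seconds' for one user."""
    meta_share = sum(e["namespace"] != 0 for e in edits) / len(edits)
    median_duration = median(e["edit_duration_seconds"] for e in edits)
    quick_reedits = sum(
        1 for a, b in zip(edits, edits[1:])
        if b["page"] == a["page"] and b["timestamp"] - a["timestamp"] <= 180
    )
    return {"meta_page_share": meta_share,
            "median_edit_duration_s": median_duration,
            "quick_reedits_same_page": quick_reedits}

edits = [
    {"page": "Talk:Foo", "namespace": 1, "timestamp": 0,   "edit_duration_seconds": 240},
    {"page": "Foo",      "namespace": 0, "timestamp": 300, "edit_duration_seconds": 600},
    {"page": "Foo",      "namespace": 0, "timestamp": 400, "edit_duration_seconds": 60},
]
print(behavior_features(edits))
```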

Rogers et al. (2012)

The last academic work in this review examines differences between the language editions of the Wikipedia article about the Srebrenica massacre (English, Dutch, Serbo-Croatian, Bosnian, Serbian, and Croatian), in particular flaws in relation to the NPOV policy in specific communities[9].

  • Content controversiality: The benefits of topic self-selection by editors (passion, knowledge) may not adhere as well to controversial articles, where versions of events are emotionally contested.
  • Community governance: Five of the top ten power editors of the English-language article on the Srebrenica massacre have been blocked indefinitely or suspected of socking by using multiple user names. After one or more usernames are blocked, one may return as an anonymous editor, and see that IP address blocked as well.
  • Community demographics: While it would be difficult to term any a universal article, there are what we could call instead umbrella articles, with two varieties, one created through the work of many, and the other the work of the few. There is a highly contested one with many interlanguage editors (the English) and a softened, rather unifying one with very few editors (the Serbo-Croatian).
  • Community demographics / Content quality: It also was found that, contrary to earlier findings, the number of editors and number of edits by registered users did not correlate with featured article status, suggesting distinctive cultural quality mechanisms.

Journalism

Sato (2021)

This recent article, focused on the Japanese Wikipedia, concludes that non-English editions might have misinformation problems[10].

  • Community demographics: The English Wikipedia is viewed by “people across the globe.” But other language versions, such as Japanese, are primarily viewed and edited by people of a particular nation.
  • Community capacity: Japanese Wikipedia also has the lowest number of administrators per active user of all language editions.
  • Community governance: These cultural differences are reflected in the way people use Wikipedia. The talk pages on Japanese Wikipedia show how a group of editors often silence those with opposing views. Users who challenge them risk being accused of “political activism” or violating rules and have their accounts blocked. It’s similar to ijime (bullying), a societal problem in Japan. The community is far from what the “public sphere” is supposed to be.

Song (2020)

This story reviews problems in small Wikipedia projects (Scots, Croatian, Cebuano, and Azerbaijani) to highlight gaps in community-based oversight[11].

  • Community capacity: Scots Wikipedia has historically been too small for bureaucrats, checkusers, or oversighters. Added to that, there’s been little participation in discussions from existing administrators.
  • Community demographics: According to the Signpost, an internal Wikipedia newsletter, an edit war ensued but ultimately, a small group of Neo-Nazi administrators essentially now have full reign over the Croatian Wikipedia. “Many editors, including some of the dissenting admins, have left Croatian Wikipedia,” the Signpost reports. “Those who haven’t abandoned Wikipedia altogether are resigned to edit elsewhere, chiefly at Serbo-Croatian Wikipedia. Since there is no opposition left, change has become impossible without outside intervention.” (...) “The Cebuano Wikipedia mostly consists of bot-generated stubs,” reads a thread on Meta-Wiki, a larger oversight forum for Wikipedia admins, on whether to shut the entire project down. “There are virtually no active users other than Lsj, his bot, a few vandals and the MediaWiki message delivery bot.”
  • Community governance: Last year, Meta-Wiki also received a request to remove all the admins of Azerbaijani Wikipedia. The community there had several complaints about admins for not swiftly acting against one admin, who had abused the block tool, introduced copyright issues, and used their admin status to push their personal opinion about a number of topics.
  • Community capacity / Community demographics: “Small wikis are where problems like these are much less likely to be noticed, and thus where they are more likely to manifest,” says Vermont. “Local administrators, being fewer in number and often ideologically similar, have more control than a project with dozens or hundreds of admins and a large, interested community for their oversight.”

Shubber (2014)

This article in Wired reported an incident in which Wikipedia edits were traced to Russian government IP addresses[12]. A hedged sketch of this kind of IP-range monitoring is given after the excerpt below.

  • Community demographics / Geopolitics: Thanks to a Twitter bot that monitors Wikipedia edits made from Russian government IP addresses, someone from the All-Russia State Television and Radio Broadcasting Company (VGTRK) has been caught editing a Russian-language Wikipedia reference to MH17 in an article on aviation disasters.
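The sketch below, referenced above, shows how such a monitor could work against the public Wikimedia EventStreams recent-changes feed: anonymous edits expose the editor's IP address as the username, so a bot only needs to match it against a watchlist of IP ranges. The watched range here is a placeholder (TEST-NET-3), not an actual government allocation, and the code is a minimal sketch rather than the bot described in the article.

```python
# Hedged sketch: flag anonymous edits from watched IP ranges using the public
# Wikimedia EventStreams recent-changes feed (server-sent events).
import ipaddress
import json
import requests

WATCHED_RANGES = [ipaddress.ip_network("203.0.113.0/24")]  # placeholder range (TEST-NET-3)
STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"

def watch():
    with requests.get(STREAM_URL, stream=True, timeout=60) as resp:
        for line in resp.iter_lines():
            # SSE payload lines look like: data: {...json...}
            if not line or not line.startswith(b"data: "):
                continue
            try:
                event = json.loads(line[len(b"data: "):])
            except json.JSONDecodeError:
                continue
            if event.get("type") != "edit":
                continue
            try:
                # Anonymous edits report an IP address in the 'user' field.
                ip = ipaddress.ip_address(event.get("user", ""))
            except ValueError:
                continue  # registered account, not an IP
            if any(ip in net for net in WATCHED_RANGES):
                print(f"Flagged edit to '{event['title']}' on {event['wiki']} from {ip}")

if __name__ == "__main__":
    watch()
```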


References

  1. Saez-Trumper, Diego. 2019. Online disinformation and the role of Wikipedia. arXiv preprint arXiv:1910.12596
  2. Morgan, Jonathan. 2019. Research: Patrolling on Wikipedia. Report-Meta
  3. Joshi, Nikesh and Spezzano, Francesca and Green, Mayson and Hill, Elijah. 2020. Detecting Undisclosed Paid Editing in Wikipedia. In Proceedings of The Web Conference 2020. 2899–2905.
  4. Spezzano, Francesca and Suyehira, Kelsey and Gundala, Laxmi Amulya. 2019. Detecting pages to protect in Wikipedia across multiple languages. Social Network Analysis and Mining 9, 1, 1–16.
  5. Lewoniewski, Włodzimierz and Węcel, Krzysztof and Abramowicz, Witold. 2019. Multilingual ranking of Wikipedia articles with quality and popularity assessment in different topics. Computers 8, 3, 60.
  6. Lewoniewski, Włodzimierz and Węcel, Krzysztof and Abramowicz, Witold. 2017. Relative quality and popularity evaluation of multilingual Wikipedia articles. In Informatics, Vol. 4. Multidisciplinary Digital Publishing Institute, 43.
  7. Kumar, Srijan and West, Robert and Leskovec, Jure. 2016. Disinformation on the web: Impact, characteristics, and detection of Wikipedia hoaxes. In Proceedings of the 25th International Conference on World Wide Web. 591–602.
  8. Kumar, Srijan and Spezzano, Francesca and Subrahmanian, VS. 2015. VEWS: A Wikipedia vandal early warning system. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 607–616.
  9. Rogers, Richard and Sendijarevic, Emina and others. 2012. Neutral or National Point of View? A Comparison of Srebrenica articles across Wikipedia’s language versions. In unpublished conference paper, Wikipedia Academy, Berlin, Germany, Vol. 29
  10. Sato, Yumiko. 2021. Non-English Editions of Wikipedia Have a Misinformation Problem. https://slate.com/technology/2021/03/japanese-wikipedia-misinformation-non-english-editions.html
  11. Song, Victoria. 2020. A Teen Threw Scots Wiki Into Chaos and It Highlights a Massive Problem With Wikipedia. https://www.gizmodo.com.au/2020/08/a-teen-threw-scots-wiki-into-chaos-and-it-highlights-a-massive-problem-with-wikipedia
  12. Shubber, Kadhim. 2014. Russia caught editing Wikipedia entry about MH17. https://www.wired.co.uk/article/russia-edits-mh17-wikipedia-article