Research:Implications of ChatGPT for knowledge integrity on Wikipedia

Created: 06:07, 12 July 2023 (UTC)
Duration: July 2023 – June 2024
Keywords: ChatGPT, knowledge integrity, misinformation, AI, large language models

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.


Large Language Models (LLMs) like ChatGPT have captured global attention, promising great advances in access to knowledge. Some in the Wikimedia community have identified the possibilities LLMs open up: enabling editors to generate a first draft of an article, to summarise sources, to produce transcriptions of video and to query Wikidata content more easily.[1][2] Others have highlighted the risk that LLMs could produce vast swathes of AI-generated content, or automated comments that simulate the appearance of discussion, debate and consensus, making the job of maintaining quality, verified, consensus-driven content more difficult.[1][2] The aim of this project is to explore the implications of content-generating AI systems such as ChatGPT for knowledge integrity on Wikipedia and to investigate whether Wikipedia's rules and practices are robust enough to deal with the next generation of AI tools. Knowledge integrity is a foundational principle of Wikipedia practice: the verifiability of Wikipedia content is a core content policy, and Wikimedians have been working to understand how to counter systemic bias on the project for almost two decades. Rooted in what Wikimedians are saying about the implications of ChatGPT, and applied to the knowledge integrity framework, this project maps out the most important areas for possible policy expansion and adjustment of current practice to deal with possible risks to Wikimedia and the larger internet infrastructure. This work supports the 2030 Strategic Direction in its aim to ensure that Wikimedia constitutes essential infrastructure of the ecosystem of free knowledge.[3]

Our project sought to answer the following questions:

  • Where do Wikimedians see the potential opportunities of generative AI for Wikipedia?
  • Where do Wikimedians see potential challenges for Wikipedia arising from generative AI?
  • What steps would they take (if any) to govern their project in the wake of this change?

Our data comprised hundreds of thousands of words of on-wiki discussion posted between November 2022 and November 2023, supplemented by 15 in-depth interviews with Wikimedians conducted at the 2023 Wikimania conference in Singapore and online.

The result of these discussions is important because Wikimedians are at the coalface of the open knowledge ecosystem and have keen insights into the role of machine learning and automation in public information systems. Wikimedians have had conversations about AI and its impact for at least the past 15 years, and the Wikimedia Foundation has historically embraced automated technologies. Machine learning and AI tools have been used to detect potential vandalism, assess article quality and importance, translate content and classify images. But Wikimedia volunteers generally deploy these tools critically. The use of bots and other automated tools is debated and regulated according to community governance procedures so that their applications are limited and human judgement remains paramount. This occurs in the context of a long-standing communitarian approach to policy formation and content governance. Wikipedia’s core content policies, including verifiability, neutral point of view, and no original research, reflect a common vision of Wikipedia as an open encyclopedic project.

In 2018, the Wikimedia Foundation launched a cross-departmental program known as Knowledge Integrity. Although the program as a whole did not endure beyond 2019, several research strands continue to develop and inform Wikimedia Foundation strategy. The program recognises the critical part that Wikimedia projects play in the broader information ecosystem, with a goal to secure Wikimedia as the “hub of a federated, trusted knowledge ecosystem”.

With the rise of large language models and other generative AI systems, the importance of Wikimedia projects in online knowledge infrastructure has increased even further. Wikimedia projects are crucial sources of high-quality data for AI training or retrieval-augmented generation, and of ground truth for model testing. The implications of generative AI for Wikimedia projects thus extend beyond opportunities to enhance editing workflows, or concerns about how to mitigate threats to the integrity of Wikimedia content, to the role that Wikimedia projects play in the broader information ecosystem. But this perception of opportunity is tempered for many by a fear that generative AI, particularly its incorporation into search engines, may undermine Wikipedia's centrality and even threaten the open knowledge ecosystem itself.

Understanding how Wikimedians are thinking about the potential opportunities and risks of generative AI for Wikipedia is therefore vital for informing both community policy and Wikimedia Foundation strategy.

Methods


In order to understand the implications of ChatGPT and other generative AI tools for Wikipedia and the broader information environment, we first looked to practitioners' own understandings of the issue. We analysed on-wiki discussions about the implications of ChatGPT and generative AI tools for Wikipedia, and asked a group of Wikimedians about the implications of these tools for knowledge integrity in a series of interviews and focus groups.

Following on-wiki data collection, we engaged in an inductive and emergent qualitative analysis of the data to identify critical themes of the discussion. We then verified the applicability of the emergent coding through the interview content and used the interviews to flag any novel themes that were distinct from those arising from Wikipedia’s online discussions. The interview cohort included many active members of the on-wiki discussions but we also sought to increase the representativeness of the data by identifying potential subjects amongst the culturally and linguistically diverse attendees at Wikimania 2023.

The research design partially mirrors the methodology utilised by Graham and Wright (2015), whereby content analysis approaches were paired with stakeholder interviews to surface new insights. However, each medium required a distinct approach to collecting and analysing the data.

Phase 1 data collection: online discussions and texts


The text-based data for this analysis came from blogs, discussion threads, and internal debates between Wikimedia editors in public online forums such as the “Village Pump”, as well as the Wikimedia online periodical, the “Signpost”, and the Wikimedia blog, “Diff”. These spaces were chosen because the Wikimedia community indicated them as primary sites for discussing the relevant challenges, practices, and policies, or because they surfaced through search or were referenced in other included content.

Many of the conversations took place on English Wikipedia's Village Pump pages, such as the Village Pump (policy), Village Pump (idea lab), and Village Pump (proposals) pages. These are central discussion forums for the English Wikipedia community. While there were a few mentions of the potential impact of AI on non-English Wikipedias (such as the possibility of using AI-assisted translation to improve the quality of content across languages) the primary focus of the discussions was on the English Wikipedia.
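
The project page does not document the tooling used to collect this text. Purely as an illustration, the short Python sketch below shows one way a public discussion page of this kind can be retrieved through the standard MediaWiki API; the page title, the word-count step and the use of the requests library are assumptions for the example, not a description of the project's actual data pipeline.

import requests

API_URL = "https://en.wikipedia.org/w/api.php"

def fetch_wikitext(title):
    """Return the current wikitext of a public wiki page via the MediaWiki API."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    response = requests.get(API_URL, params=params, timeout=30)
    response.raise_for_status()
    page = response.json()["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]

# Example only: one of the Village Pump pages discussed above.
text = fetch_wikitext("Wikipedia:Village pump (policy)")
print(f"Retrieved {len(text.split()):,} words of wikitext")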

Data were collected between 13 July and 25 July 2023. Over 160,000 words of content were coded across 88 articles, blogs, and discussions, as summarised in Table 1 below; a simple consistency check of these totals follows the table.

Table 1. Data sources for phase 1 analysis

Data source | Description | Scope of dataset
Village Pump | Discussion forums for the proposal and debate of policies, initiatives, and other proposals | 56,105 words; 126+* editors; 16 topic pages
Wikimedia-l | Publicly archived and moderated mailing list for discussion by the Wikimedia community and allied organisations supporting its work | 22,558 words; 53 respondents; 12 discussion threads
Wikimedia ‘Diff’ blog | A Wikimedia-community-focused blog established by the communications department at the Wikimedia Foundation (mostly in English but also in other languages) | 7,334 words; 14 contributors; 14 blog posts and discussions
Wikimedia Signpost | An online newspaper for and produced by the Wikimedia editor community | 29,920 words; 73+* editors; 24 feature articles and discussions
English Wikipedia administrative, policy and working pages | Discussions about LLMs across policy, page patrolling, criteria for deletion and similar pages on English Wikipedia | 47,941 words; 140+ contributors; 22 pages

* Excluding anonymous authors listed by IP address.
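
As a simple arithmetic check on the totals quoted above (a minimal Python sketch; the per-source figures are taken directly from Table 1):

# Consistency check: per-source figures from Table 1 against the totals quoted in the text.
words = {
    "Village Pump": 56_105,
    "Wikimedia-l": 22_558,
    "Wikimedia Diff blog": 7_334,
    "Wikimedia Signpost": 29_920,
    "English Wikipedia admin/policy/working pages": 47_941,
}
pages = {
    "Village Pump": 16,
    "Wikimedia-l": 12,
    "Wikimedia Diff blog": 14,
    "Wikimedia Signpost": 24,
    "English Wikipedia admin/policy/working pages": 22,
}

print(f"{sum(words.values()):,} words")  # 163,858 -> "over 160,000 words"
print(f"{sum(pages.values())} items")    # 88 -> "88 articles, blogs, and discussions"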

Interviews


Interviewees were selected from two sources: participants in the online discussions, and attendees at the Wikimania 2023 event in Singapore. Participants in the online discussions were contacted based on their activity and expressed interest in the subject. Potential participants were ranked by the frequency of their contributions to discussions, their exhibited knowledge of and/or interest in generative AI, and their discussion of Wikipedia policies, practices, and tools. The highest-ranked candidates were contacted first, until 10 respondents from online sources agreed to be interviewed. Some of these participants were interviewed in person at Wikimania 2023, alongside further participants recruited at the event.

In total, 15 participants were interviewed: 8 Wikimania attendees and 7 online forum participants. Participants included 5 who identified as women and 10 whose first language was not English. One participant was directly employed by the Wikimedia Foundation, while many held official positions in Wikimedia chapters and projects at the time.

The research interviews had two distinct focuses, depending on the group of interviewees:

  • Wikimedians: starting with two in-person focus groups at Wikimania Singapore and interviews with individuals identified in Wikimedia-l conversations, followed by further individual interviews. The goal was to understand to what extent LLMs are already having an impact on Wikipedia practice, which areas of practice might be most affected, and whether there are other risks, not already identified, that would be useful to consider. We focused on community members with direct experience working in areas most likely to be affected by or related to LLMs (e.g. new page patrol, bot policy).
  • Wikimedia Research Team members: particularly those connected to the Knowledge Integrity program. The goal with this group was to understand how knowledge integrity relates in practice to questions of verifiability and provenance, and to garner ideas about what is possible in terms of governing LLMs (given previous practice in governing other automated processes and tools).

Coding


The data from the online discussions and from the interviews were then coded, but each data set required a different approach. The online discussions provided a large and broad base of data presenting a wide range of themes; the challenge was to consolidate this volume of content into insights that could answer the research questions. Conversely, the interviews provided more direct answers to these questions but consequently fewer opportunities for emergent analysis that could identify novel and unforeseen problems and solutions. Drawing these sources of data together was a key task in answering the research questions.

Policy, Ethics and Human Subjects Research


This research focused on achieving data saturation through non-intrusive and non-disruptive methods. While it used emergent approaches, all contact was preceded by desk-based research through Wikimedia texts as well as scholarly research and reports. Interviews and focus groups were conducted in discussion-focused environments like Wikimania and addressed topics of direct relevance to participants' daily practices. All participation was voluntary, and candidates were able to decide whether, and how much, to contribute, as well as whether and how they wished to be identified. The research adhered to rigorous academic ethical standards monitored by the ethical review processes of the University of Technology Sydney.

Results


A summary of results follows. It will be expanded as the final write-up stage of the research is completed.

1. What opportunities do Wikipedians see in generative AI?


1.1 AI-Assisted Content Creation

  • Drafting article stubs or outlines for human editors
  • Suggesting missing topics or information gaps
  • Aiding with knowledge base queries and information synthesis
  • Improving language, correcting linguistic errors and formatting
  • Supporting editors’ writing in non-native languages
  • Creating new formats and illustrations

1.2 Content Enhancement and Optimization

  • Improving language translation between Wikipedia versions
  • Suggesting relevant internal links and connections
  • Automating multimedia classification and captioning

1.3 Editor Workflow Augmentation

  • Prioritizing articles for improvement based on quality scores
  • Flagging potential vandalism or low-quality edits
  • Assisting with referencing and citation recommendations
  • Assisting with consensus evaluation
  • Writing code to compare and analyse articles

2. Where do Wikipedians see challenges arising from generative AI?

2.1 Copyright, licensing and attribution

  • Training data sources and potential copyright infringement
  • Commercial exploitation of openly licensed content
  • Lack of attribution for AI-generated text

2.2 Reliability and verifiability issues

  • Potential for proliferation of AI-generated misinformation or biased content, particularly on smaller language versions
  • Difficulty in fact-checking and verifying AI outputs
  • Lack of transparency in AI language model training
  • Violation of other content policies, e.g. puffery

2.3 Risks to editorial practices

  • Concerns about AI-generated content bypassing human curation
  • Potential for misuse of AI by malicious actors, a risk identified by many
  • Challenges in detecting and distinguishing AI-written text
  • Risks of over-reliance on AI at the expense of human knowledge
  • Challenges to core policies and concepts including authorship

2.4 Uneven distribution of risk

  • Lack of contributors is a risk for smaller language versions in both editorial resourcing and policy development

2.5 Threats to the sustainability of Wikipedia and to the knowledge and information ecosystem

  • Wikipedia plays a key role in the information ecosystem
  • Wikipedia may become unsustainable as people turn to AI for their information needs
  • Potential for degradation of information ecosystem and AI model collapse
  • AI companies’ lack of transparency and accountability undermines open knowledge

3. What should be done to address risks and embrace opportunities?


3.1 Safeguards for AI Integration in Wikipedia

  • For many, internal risks can likely be managed via existing policy and practices, while others urge the need for AI-specific policy
  • Most see value in technical measures to mitigate risks, e.g. detect and mark AI-generated content, AI plugins
  • The need to preserve human editorial oversight and curation processes is paramount
  • Transparency is important for a wide range of AI use
  • Education and support on genAI are needed for new editors

3.2 Addressing external ecosystem and market risks

  • Transparency and accountability for AI companies are critical for managing knowledge integrity in the broader information ecosystem and existential threats to Wikipedia
  • Responsibility needs to be distributed amongst many actors and stakeholders
  • The Wikimedia Foundation and Wikimedia community need to be pro-active in advocating for Wikimedia and open knowledge, including in copyright and licensing.

Discussion


This section will be expanded as the final write-up stage of the research is completed.

  • Proposal and failure of LLM policy
  • External ecosystem and market risks have potentially far-reaching and profound impacts on both Wikipedia and the open knowledge ecosystem more broadly. These risks cannot be addressed through internal policy and practice alone.

Resources


Grants:Programs/Wikimedia Research Fund/Implications of ChatGPT for knowledge integrity on Wikipedia

References

  1. Harrison, S. (January 12, 2023). “Should ChatGPT Be Used to Write Wikipedia Articles?”. Slate. https://slate.com/technology/2023/01/chatgpt-wikipedia-articles.html
  2. Wikimedia contributors (2023a). Community Call Notes. Accessed 29 March 2023. https://meta.wikimedia.org/w/index.php?title=Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/External_Trends/Community_call_notes&oldid=24785109
  3. Wikimedia contributors (2023b). Movement Strategy. Accessed 29 March 2023. https://meta.wikimedia.org/w/index.php?title=Movement_Strategy&oldid=24329161