Research:Machine Learning Assisted Wikipedia Editing
Over the last couple of years, there has been substantial progress in machine learning in general, and in Natural Language Processing in particular. Today, language models can generate various forms of short- and long-form text and can even run their own web searches to find relevant information. However, they are still far from reliably generating language that matches the quality, neutrality and verifiability standards of Wikipedia. As a result, it is unclear to what extent (and how) this technology could be used to support Wikipedia editors right now. Our high-level research goal is to answer this question.
Beyond scientific publications, a major output will be a UI/gadget/tool that allows editors to rapidly create new or improve existing Wikipedia articles. We will also fully open source our software and machine learning models.
Our core hypotheses are that a) for language models to be useful in the editing process, humans will need fine-grained control over the behavior of these models and b) language models will need to be able to retrieve relevant information from the web (that is, we need retrieval-augmented models). We are currently pursuing several work streams that test and develop these hypotheses.
Citation Verification and Recommendation
We have developed a method to automatically spot claims that cannot be verified by the sources currently cited on Wikipedia, and to recommend better sources based on a web index search. We hope that Wikipedia editors could use this method both to prioritise which citations should receive further attention and to consider candidate citations that could replace the existing ones. In general, our model seems to work well when assessed by annotators, but so far none of the annotators have Wikipedia editing experience, and we would like to understand whether it would actually be useful for Wikipedia editors.
To give an example, for the claim
Joe Hipp, Heavyweight boxer, the first Native American to compete for the WBA World Heavyweight Title.
from https://en.wikipedia.org/wiki/Blackfoot_Confederacy the current citation fails to support the claim. Our system proposes this newspaper article with matching support:
In 1989 at the twilight of his career, Camel fought Joe Hipp of the Blackfeet Nation. Hipp, who became the first Native American to challenge for the world heavyweight championship, said the fight was one of the weirdest of his career.
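The flag-and-recommend workflow described in this section can be sketched as follows. This is a minimal illustration, not the project's model: simple token overlap stands in for the learned verification and retrieval components, and the names `support_score`, `review_citation` and the `threshold` value are assumptions made for the example.

```python
import re


def tokens(text):
    """Lowercase word tokens; a crude stand-in for a learned text encoder."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim, passage):
    """Fraction of the claim's tokens that also appear in the passage."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(passage)) / len(claim_tokens)


def review_citation(claim, current_passage, candidates, threshold=0.5):
    """Flag a weakly supported citation and suggest a better candidate.

    `threshold` is an arbitrary illustrative cut-off, not a tuned value.
    """
    current = support_score(claim, current_passage)
    best = max(candidates, key=lambda p: support_score(claim, p), default=None)
    needs_attention = current < threshold
    suggestion = None
    # Only suggest a replacement that scores higher than the current source.
    if needs_attention and best is not None and support_score(claim, best) > current:
        suggestion = best
    return {
        "current_score": current,
        "needs_attention": needs_attention,
        "suggestion": suggestion,
    }
```

Called with the Joe Hipp claim, the text of the currently cited source, and a list of candidate passages that includes the newspaper excerpt above, the function would flag the citation whenever the current source shares few words with the claim and would return the highest-overlap candidate as the suggestion. A real system would replace `support_score` with a trained model, since word overlap cannot tell support from contradiction.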
Controllable language generation
We are developing a method by which editors can control edits proposed by a language model using an instruction language/interface.
For example, an editor could ask the model, in natural language or via a dedicated interface, to "simplify this paragraph", "remove unsourced opinions", "add citations", "add wikilinks" or "improve grammar", and it would suggest appropriate edits, which the human editor can then accept, revise, or reject.
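The accept, revise, or reject step could be represented along these lines. This is only a sketch of the review loop: the `ProposedEdit` class and the `resolve` function are hypothetical names, and the language model that would fill in `suggestion` is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProposedEdit:
    instruction: str  # e.g. "improve grammar"
    original: str     # text the editor selected
    suggestion: str   # text proposed by the (hypothetical) model


def resolve(edit: ProposedEdit, decision: str, revision: Optional[str] = None) -> str:
    """Return the text to keep after the editor's decision."""
    if decision == "accept":
        return edit.suggestion
    if decision == "reject":
        return edit.original
    if decision == "revise":
        if revision is None:
            raise ValueError("a 'revise' decision needs the editor's revised text")
        return revision
    raise ValueError(f"unknown decision: {decision!r}")
```

Keeping the original text alongside the suggestion means a rejected proposal changes nothing, so the human editor always has the final say over what is saved.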
Verifiable language generation
We are investigating how language can be generated with as many claims/subclaims as possible supported by evidence (such that models produce easily verifiable content).
This could enable editors to leverage language models when writing new articles or adding new sections to existing articles, as they could easily verify whether the model’s generations are actually backed by a given set of sources. For instance, when asked to write about Coca-Cola’s history given a set of 10 different reference documents, the model might produce outputs like “Originally marketed as a temperance drink and intended as a patent medicine, it was invented in the late 19th century by John Stith Pemberton and was bought out by businessman Asa Griggs Candler” where each number in square brackets refers to the document containing the evidence for the preceding piece of text.
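To illustrate how such output could be checked, the sketch below splits generated text into pieces and pairs each piece with the document number that follows it. It assumes markers of the form `[3]` and documents numbered from 1; the function names `cited_segments` and `unverifiable` are made up for this example and do not describe the project's tooling.

```python
import re

# Assumed marker format: a document number in square brackets, e.g. "[3]".
MARKER = re.compile(r"\[(\d+)\]")


def cited_segments(text):
    """Split generated text into (segment, document_number) pairs.

    Each segment is the text preceding a marker. Trailing text with no
    marker is returned with document_number None.
    """
    segments = []
    position = 0
    for match in MARKER.finditer(text):
        segment = text[position:match.start()].strip()
        if not segment and segments:
            # Adjacent markers such as "[1][2]" back the same piece of text.
            segment = segments[-1][0]
        segments.append((segment, int(match.group(1))))
        position = match.end()
    tail = text[position:].strip()
    if tail:
        segments.append((tail, None))
    return segments


def unverifiable(text, num_documents):
    """Segments with no marker, or a marker outside 1..num_documents."""
    return [
        segment
        for segment, doc in cited_segments(text)
        if doc is None or not 1 <= doc <= num_documents
    ]
```

An editor-facing tool could highlight whatever `unverifiable` returns, so attention goes first to text that points at no reference document at all. Whether a cited document actually supports its segment is a separate question that this sketch does not address.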
Prior Work
The foundation of our work is a set of retrieval-augmented language models that generate changes based on what Wikipedia editors instruct. We will also need to continuously measure the faithfulness and factuality of our models.
- Summer 2022: Publications covering the three topics above
- Fall 2022: Open Source release of software
- Winter 2022: Demo of UI
- Summer 2023: Release of UI
Policy, Ethics and Human Subjects Research
- Bias: We are aware that any system that can generate or change content will introduce bias. For example, our citation retriever is trained on the citations currently in Wikipedia, which point to The Guardian more often than to other news sources. Our system shows similar behavior, and this might produce a bias. Understanding and alleviating such biases is a critical goal for our research.
- Open Source and Access: We are 100% committed to open source and open research. All of our software and models will be open source, and our research papers will be published in open access repositories.
- Patrick Lewis et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, NeurIPS 2020
- Izacard and Grave, Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
- Luca Massarelli et al., How Decoding Strategies Affect the Verifiability of Generated Text