Research:Develop an ML-based service to predict reverts on Wikipedia

Tracked in Phabricator:
Task T314384
Mykola Trokhymovych
Duration:  2022-07 – ??
References, Knowledge Integrity, Disinformation, Patrolling, Vandalism

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.

The Research team, in collaboration with the ML-Platform team, is creating a new service to help patrollers detect revisions that are likely to be reverted.


  • A single model for all Wikipedia languages.
  • The model should be primarily language agnostic.
  • The model should support both single revisions and batches.
  • The model should be able to run on Lift Wing.



Fighting disinformation and maintaining knowledge integrity in our projects is one of the most important and difficult tasks for our movement. The existing content policies have positioned Wikipedia in a central role in the information ecosystem. However, the workload this implies for our communities appears to be one of the main limitations to maintaining and improving content reliability. Machine learning-based tools (a.k.a. AI) are a powerful way to support them. The technology department has developed several tools in that direction. However, these tools suffer from several limitations:

  • They are highly language dependent.
  • They rely on complex manual annotation campaigns that are difficult to scale, especially for small language communities.
  • They were created as stand-alone applications, each requiring dedicated software architecture and data pipelines.

Our Approach


Our proposal is to design a new generation of machine-learning models that are primarily language agnostic, based on implicit annotations (e.g. wikitext templates, reverts, etc.), and built on a standardized architecture. These models would help the developer community and other WMF teams build tools to sustain and increase knowledge integrity and fight disinformation in Wikimedia projects.

Our recent research has shown that it is possible to build tools based mainly on language-agnostic features that can replace some of the current language-dependent models. Building these tools should (and can) be done taking into account the differences across projects. Language-agnostic models will allow us to solve the scalability issues raised in the disinformation strategy and provide better support for patrollers in small Wikimedia projects.

We are also experimenting with multilingual Large Language Models (LLMs) to take advantage of recent advances in that field.

Revert Risk Model(s)


We have developed two models: a language-agnostic one, fully based on edit types, and a multilingual one based on mBERT.

Model                  Error rate   Processing time   AUC        AUC (anonymous)   AUC (authorised)   Pr@R=0.25
Language-agnostic RR   0.0061       0.538154          0.865301   0.699971          0.780341           0.258017
Multilingual RR        0.0063       3.130323          0.878529   0.781155          0.782728           0.288313
ORES                   —            —                 0.840866   0.682347          0.709546           0.235961

As the table above shows, the Revert Risk Language Agnostic (RRLA) model achieves an AUC similar to that of the Multilingual (RRML) version. RRML outperforms RRLA for anonymous users, but requires roughly 6× the processing time per revision and currently covers only 47 languages. Our current recommendation is to use RRLA as the default model, due to its shorter serving time and full language coverage.
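To make the metrics in the table concrete, here is an illustrative sketch (not the project's actual evaluation code) of how AUC and precision at a given recall (Pr@R) can be computed for a revert classifier. The labels and scores below are toy data, assumed purely for demonstration.

```python
import numpy as np

# Toy data: 1 = revision was reverted, 0 = it was kept.
y_true = np.array([0, 1, 0, 1, 1, 0, 0, 1, 0, 1])
y_score = np.array([0.1, 0.8, 0.3, 0.7, 0.9, 0.2, 0.4, 0.6, 0.15, 0.55])

def auc_score(y_true, y_score):
    """AUC = probability a random positive outranks a random negative."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def precision_at_recall(y_true, y_score, target=0.25):
    """Best precision achievable at recall >= target (Pr@R)."""
    order = np.argsort(-y_score)          # rank revisions by score, descending
    y = y_true[order]
    tp = np.cumsum(y)                     # true positives at each threshold
    precision = tp / np.arange(1, len(y) + 1)
    recall = tp / y.sum()
    return precision[recall >= target].max()

auc = auc_score(y_true, y_score)
pr_at_r = precision_at_recall(y_true, y_score, 0.25)
```

Pr@R=0.25 answers a patroller-oriented question: if the model is tuned to catch at least 25% of all reverted edits, what share of its flags are correct?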

More details about the experiments and data to reproduce these results can be found in this repository.



Models are available through the Lift Wing API, a service maintained by the ML-Platform team, and can be accessed using the following endpoints:

Language Agnostic Model:


$ curl -X POST https://api.wikimedia.org/service/lw/inference/v1/models/revertrisk-language-agnostic:predict -d '{"rev_id": rev_id, "lang": "lang_code"}' -H "Content-type: application/json"

Multilingual Model:


$ curl -X POST https://api.wikimedia.org/service/lw/inference/v1/models/revertrisk-multilingual:predict -d '{"rev_id": rev_id, "lang": "lang_code"}' -H "Content-type: application/json"
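The same requests can be made from Python. The following is a minimal sketch using only the standard library; the endpoint URL assumes the public Lift Wing gateway on api.wikimedia.org, so adjust it if you access Lift Wing through a different route.

```python
import json
import urllib.request

# Assumption: public Lift Wing gateway; internal routes differ.
LIFTWING = "https://api.wikimedia.org/service/lw/inference/v1/models"

def predict_url(model):
    """Build the :predict endpoint URL for a Lift Wing model."""
    return f"{LIFTWING}/{model}:predict"

def revert_risk(rev_id, lang, model="revertrisk-language-agnostic"):
    """Query a Revert Risk model for a single revision id."""
    payload = json.dumps({"rev_id": rev_id, "lang": lang}).encode()
    req = urllib.request.Request(
        predict_url(model),
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example usage (performs a live request):
# revert_risk(12345, "en")
# revert_risk(12345, "en", model="revertrisk-multilingual")
```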

Model Cards


Ongoing Work


We are currently working on updating the RRLA model to improve its performance on anonymous edits. You can track our weekly updates in this Phabricator task.