User:Diego (WMF)/drafts/RevertRisk Model card
This model card page currently has draft status: it is a piece of model documentation that is still being written. Once the model card is completed, this template should be removed.
Model card
This page is an on-wiki machine learning model card.
Model Information Hub
This model uses language-agnostic features extracted from revision metadata and wikitext to predict whether an edit to a Wikipedia article will be reverted.
How can we help editors identify revisions that need to be “patrolled”? The goal of this model is to detect revisions that are likely to be reverted, regardless of whether they were made in good faith or with the intention of causing damage.
Patrolling content in more than 250 Wikipedia projects is a difficult task. The volume of revisions, combined with the different languages involved, requires a substantial human effort. The aim of this model is to help patrollers quickly identify potential problems and revert damaging edits when needed.
Previous models tried to solve this by creating language-specific solutions; however, that approach is difficult to scale and maintain, because it requires as many models as there are languages on the Wikimedia projects. Moreover, complex language models are only available for certain languages, leaving out smaller Wikipedia editions. Therefore, this model is based on language-agnostic features, making it possible to use it for any existing Wikipedia and for new language projects that may appear in the future.
This model was trained using two tables from the Wikimedia Data Lake: the MediaWiki History table and the Wikitext History table. Metadata features were extracted from the former, and content features such as the number of references, images, and wikilinks were extracted from the latter.
This model is deployed on LiftWing. Right now, it is available for internal usage. Technical details on how to use it can be found here. This model can be used to detect revisions that might need to be reverted. A high “revert probability” output (over 0.9) provides good precision, while a lower threshold (0.5) favors recall. This model should be used only for Wikipedia articles (namespace 0); its features won't work outside Wikipedia.
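As a purely illustrative sketch (the helper function and its labels below are hypothetical, not part of the deployed service), the two thresholds described above could be applied like this:

def triage_revision(revert_probability: float) -> str:
    """Interpret the model's revert probability using the thresholds from this card.

    A score above 0.9 is treated as a high-precision signal, while a score above
    0.5 is a lower-confidence, high-recall signal. Tools may tune these cut-offs.
    """
    if revert_probability > 0.9:
        return "likely revert (high precision)"
    if revert_probability > 0.5:
        return "possible revert (high recall)"
    return "probably fine"

print(triage_revision(0.98))  # likely revert (high precision)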
Motivation
Knowledge Integrity is one of the strategic programs of Wikimedia Research, with the goal of identifying and addressing threats to content on Wikipedia, increasing the capabilities of patrollers, and providing mechanisms for assessing the reliability of sources[1]. The main goal of the project is to create a new generation of patrolling models, improving accuracy, fairness, and maintainability compared to the previous state of the art, ORES[2].
The current model is completely language agnostic and can run in any Wikipedia language edition.
Users and uses
- Automatically find revisions that require patrolling.
- Vandalism detection.
- Create bots that assist admins and patrollers in removing vandalism or bad-faith edits.
- Auto-removing edits that a user makes without another editor in the loop.
- As ground truth for training other models.
- Research.
- To be implemented on products soon.
Ethical considerations, caveats, and recommendations
The model is built using meta-features that take into account user characteristics. However, it may exhibit bias against edits from new users or IP addresses, because such edits have been reverted more often in the past. To address this issue, we have developed an alternative Multilingual Model that specifically mitigates such biases. We recommend using that alternative model for the 47 languages it covers. For the remaining languages, it is advisable to use the original model.
Model
This model uses the following set of features; an illustrative sketch of how the user features might be computed appears after the list.
- Article features:
  - We used the features developed for the Article Quality model.
  - We computed the article quality features for the current revision and its parent revision.
  - We measured the quality difference between these revisions.
- User features:
  - Account "age" (the difference between the revision date and the user's account creation date).
  - Number of previous revisions made by the user.
  - Number of user groups.
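As a purely illustrative sketch, and using hypothetical field names (the real pipeline reads these values from the MediaWiki History table), the user features listed above could be computed roughly as follows:

from datetime import datetime

def user_features(revision_timestamp: str, user_registration: str,
                  previous_edit_count: int, user_groups: list[str]) -> dict:
    """Illustrative construction of the user features listed above.

    Field names are hypothetical; the actual feature extraction is in the
    model's repository.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    age_seconds = (datetime.strptime(revision_timestamp, fmt)
                   - datetime.strptime(user_registration, fmt)).total_seconds()
    return {
        "user_account_age": age_seconds,           # account "age" at edit time
        "user_previous_revisions": previous_edit_count,
        "user_group_count": len(user_groups),      # number of user groups
    }

print(user_features("2022-03-01 12:00:00", "2021-06-15 08:30:00", 42, ["autoconfirmed"]))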
Performance
Implementation
The model is built using the XGBoost library.
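For orientation only, a minimal XGBoost training sketch might look like the following; the placeholder data, feature dimensionality, and hyperparameters are assumptions and do not reflect the actual configuration, which is documented in the repository linked below:

import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Placeholder data: in practice each row holds the language-agnostic features
# described in the Model section, and y is 1 if the revision was reverted.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 12))
y = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = xgb.XGBClassifier(n_estimators=400, max_depth=6, learning_rate=0.1)
model.fit(X_train, y_train)

# Probability of the positive class ("will be reverted") for unseen revisions.
revert_probability = model.predict_proba(X_test)[:, 1]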
The detailed model training procedure and configuration can be found in this repository.
Output schema:
{
  "lang": <language code string>,
  "rev_id": <revision_id string>,
  "score": {
    "prediction": <boolean decision result>,
    "probability": {
      "true": <probability of being reverted>,
      "false": <probability of NOT being reverted>
    }
  }
}
Example input:
curl "https://<endpoint>/v1/models/revert-risk-model:predict" -d @input.json -H "Host: revert-risk-model.experimental.wikimedia.org" --http1.1 -k
An example for input.json: { "lang": "en", "rev_id": 123855516 }
Example output:
{
  "lang": "en",
  "rev_id": 123855516,
  "score": {
    "prediction": true,
    "probability": {
      "true": 0.98,
      "false": 0.02
    }
  }
}
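For convenience, a minimal Python equivalent of the curl request above might look like this; the <endpoint> placeholder and the Host header are taken from that example, error handling is omitted, and verify=False mirrors curl's -k flag:

import requests

# Replace <endpoint> with the actual LiftWing endpoint, as in the curl example.
url = "https://<endpoint>/v1/models/revert-risk-model:predict"
headers = {"Host": "revert-risk-model.experimental.wikimedia.org"}
payload = {"lang": "en", "rev_id": 123855516}

response = requests.post(url, json=payload, headers=headers, verify=False)
result = response.json()

# The probability that the revision will be reverted (see the output schema above).
revert_probability = result["score"]["probability"]["true"]
print(revert_probability)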
Data
The model was trained on a dataset collected from two tables in the Wikimedia Data Lake: the MediaWiki History table and the Wikitext History table. The 2022-05 snapshot was used, with an observation period from 2021-01-01 to 2022-01-01 (12 months). We filtered out revisions created by bots. We used 70% of the data for training and 30% for testing, using a random split.
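As a rough illustration only, and assuming the public Data Lake table and column names (the exact queries live in the repository linked below), collecting revision metadata for the observation period could look roughly like this on the analytics cluster:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Revision metadata from the MediaWiki History table (2022-05 snapshot),
# restricted to the observation period and excluding bot edits.
# Column names are shown only as an illustration of the process described above.
history = (
    spark.table("wmf.mediawiki_history")
    .where(F.col("snapshot") == "2022-05")
    .where(F.col("event_entity") == "revision")
    .where(F.col("event_timestamp").between("2021-01-01", "2022-01-01"))
    .where(F.size("event_user_is_bot_by") == 0)           # drop bot revisions
    .select("wiki_db", "revision_id", "event_timestamp",
            "revision_is_identity_reverted")               # revert label
)

# 70/30 random split for training and testing, as described above.
train, test = history.randomSplit([0.7, 0.3], seed=42)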
The data collection process can be found in this repository.
Licenses
edit- Code: Apache 2.0 License
- Model: Apache 2.0 License
Citation
Cite this model as: ... to be added soon.
References
- ↑ Zia, Leila; Johnson, Isaac; Mansurov, Bahodir; Morgan, Jonathan; Redi, Miriam; Saez-Trumper, Diego; Taraborelli, Dario. 2019. Knowledge Integrity. https://doi.org/10.6084/m9.figshare.7704626
- ↑ https://www.mediawiki.org/wiki/ORES