Machine learning models/Production/Multilingual revert risk

How can we help editors to identify revisions that need to be “patrolled”?

Model card
This page is an on-wiki machine learning model card.
A model card is a document about a machine learning model that seeks to answer basic questions about the model.
Model Information Hub
Model creator(s): Mykola Trokhymovych, Muniza Aslam, Ai-Jou Chou, and Diego Saez-Trumper
Model owner(s): Diego Saez-Trumper
Code: training and inference
Uses PII: No
In production? Yes
This model uses revision content and metadata to predict the risk of being reverted.


The goal of this model is to detect revisions that might be reverted, regardless of whether they were made in good faith or with the intention of causing damage. Wikipedia has a group of dedicated volunteer editors, known as patrollers, who work to ensure the accuracy and integrity of the information on the site. These patrollers review and edit articles, monitor for vandalism, and enforce community guidelines. Their work is not easy: they have to keep up with the fast pace and language diversity of Wikipedia, where on average around 16 pages are edited per second in 250+ languages [1]. The aim of this model is to help patrollers quickly identify potential problems, prioritize their work, and revert damaging edits when needed.

This model is deployed on LiftWing. Right now, it is available for internal usage. This model can be used to detect revisions that might need to be reverted.

Motivation

Knowledge Integrity is one of the strategic programs of Wikimedia Research. Its goals are to identify and address threats to content on Wikipedia, increase the capabilities of patrollers, and provide mechanisms for assessing the reliability of sources[2]. The main goal of this project is to create a new generation of patrolling models that improve on the accuracy, fairness, and maintainability of the previous state of the art, ORES[3].

The current model is able to work on almost any Wikipedia article in any of the 47 chosen languages: ['ka', 'lv', 'ta', 'ur', 'eo', 'lt', 'sl', 'hy', 'hr', 'sk', 'eu', 'et', 'ms', 'az', 'da', 'bg', 'sr', 'ro', 'el', 'th', 'bn', 'no', 'hi', 'ca', 'hu', 'ko', 'fi', 'vi', 'uz', 'sv', 'cs', 'he', 'id', 'tr', 'uk', 'nl', 'pl', 'ar', 'fa', 'it', 'zh', 'ru', 'es', 'ja', 'de', 'fr', 'en']

Users and uses

Use this model for
  • estimating the revert risk of a Wikipedia article revision
Don't use this model for
  • making predictions on language editions of Wikipedia that are not in the listed 47 languages or other Wiki projects (Wiktionary, Wikinews, Wikidata, etc.)
  • making predictions on the revisions that are created by bots
  • making predictions on the revisions that create a new article (the first revision of a page)
  • making predictions on a revision that is the only one for a page
  • fully automated decisions: as with any AI/ML model, we recommend keeping humans in the loop and not using the model's predictions as training data for other ML models
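The exclusions above can be checked before a revision is sent to the model. The following is a minimal sketch, not part of the model's code: the `RevisionInfo` container, its field names, and the `skip_reason` helper are hypothetical and only illustrate the listed conditions. The language set is copied from the list in the Motivation section.

```python
from dataclasses import dataclass
from typing import Optional

# The 47 language codes listed in this model card.
SUPPORTED_LANGS = frozenset({
    "ka", "lv", "ta", "ur", "eo", "lt", "sl", "hy", "hr", "sk",
    "eu", "et", "ms", "az", "da", "bg", "sr", "ro", "el", "th",
    "bn", "no", "hi", "ca", "hu", "ko", "fi", "vi", "uz", "sv",
    "cs", "he", "id", "tr", "uk", "nl", "pl", "ar", "fa", "it",
    "zh", "ru", "es", "ja", "de", "fr", "en",
})


@dataclass
class RevisionInfo:
    """Hypothetical container; the field names are illustrative assumptions."""
    lang: str                     # language code, e.g. "ru"
    project: str                  # e.g. "wikipedia", "wiktionary"
    is_bot: bool                  # True if the revision was made by a bot
    parent_rev_id: Optional[int]  # None for the first revision of a page


def skip_reason(rev: RevisionInfo) -> Optional[str]:
    """Return why a revision is out of scope, or None if it can be scored."""
    if rev.project != "wikipedia":
        return "not a Wikipedia project"
    if rev.lang not in SUPPORTED_LANGS:
        return "language not among the 47 supported"
    if rev.is_bot:
        return "revision created by a bot"
    if rev.parent_rev_id is None:
        return "first (or only) revision of a page"
    return None
```

A caller would score the revision only when `skip_reason` returns `None`, and would still route the prediction to a human reviewer.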
Current uses

Ethical considerations, caveats, and recommendations

  • This model was developed to improve on the performance of its Language Agnostic (RRLA) version. The Multilingual version performs better, especially for IP edits. However, it requires more processing power and might be slower (or time out).

Model

The model is based on content features extracted using the fine-tuned language model mBERT[4] and on mwedittypes-based features[5], along with user and page metadata. It follows the paradigm of one generalized model for all covered languages, which are currently the 47 most frequently edited languages in Wikipedia. The system includes the following steps:

1. Text features preparation:

  • Process wikitext and compare with parent revision
  • Extract mwedittypes-based features
  • Extract texts that were added, removed, and changed

2. Masked language model (MLM) feature extraction:

  • Pass each of the texts that were added, removed, or changed to the pre-trained classification model
  • Apply mean and max pooling to the list of scores of each signal to extract the final unified feature set

3. Final Classification

  • Combine all extracted features with user and revision metadata
  • Pass the features to the final classifier
 
System design. Inference
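
The pooling in step 2 can be sketched as follows. This is an illustration under stated assumptions, not the production code: the function name `pool_scores`, the signal names, and the default value used when a revision has no text for a signal are all hypothetical.

```python
from statistics import fmean
from typing import Dict, List


def pool_scores(
    scores_by_signal: Dict[str, List[float]],
    default: float = 0.0,  # assumption: value used when a signal has no texts
) -> Dict[str, float]:
    """Collapse a variable-length list of scores per signal into two features.

    Each signal (for example "inserts" or "removes") may have any number of
    scored texts; mean and max pooling give a fixed-size feature set.
    """
    features: Dict[str, float] = {}
    for signal, scores in scores_by_signal.items():
        features[f"{signal}_mean"] = fmean(scores) if scores else default
        features[f"{signal}_max"] = max(scores) if scores else default
    return features


# Usage: two inserted texts were scored, nothing was removed.
example = pool_scores({"inserts": [0.2, 0.8], "removes": []})
```

The resulting dictionary has the same keys for every revision, which is what allows it to be combined with metadata and passed to the final classifier in step 3.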

Performance

Implementation

The presented model is a multistage solution that includes the fine-tuned masked language model (mBERT) for feature extraction and the final classifier (CatBoost) for getting the probability of being reverted based on the extracted features.


Model architecture

mBERT model tuning (four models: title, changes, inserts, and removes):

  • Learning rate: 2e-5
  • Weight Decay: 0.01
  • Epochs: 5
  • Maximum input length: 512
  • Number of encoder attention layers: 12
  • Number of decoder attention layers: 12
  • Number of attention heads: 12
  • Length of encoder embedding: 768

CatBoost:

  • Iterations: 5000
  • Learning Rate: 0.01
  • Loss: Logloss
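
For reference, the training settings above can be written as a configuration fragment. This is only a sketch: the key names are borrowed from the keyword arguments of Hugging Face `TrainingArguments` and `catboost.CatBoostClassifier`, and this card does not state which training API was actually used, so treat the mapping as an assumption.

```python
# Fine-tuning settings for each of the four mBERT models
# (title, changes, inserts, removes), as listed above.
MBERT_TUNING = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "num_train_epochs": 5,
    "max_length": 512,  # maximum input length in tokens
}

# Settings for the final CatBoost classifier, as listed above.
CATBOOST_PARAMS = {
    "iterations": 5000,
    "learning_rate": 0.01,
    "loss_function": "Logloss",
}

# Assumed usage (requires the catboost package):
#   from catboost import CatBoostClassifier
#   clf = CatBoostClassifier(**CATBOOST_PARAMS)
```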
Output schema
{
  lang: <language code string>,
  rev_id: <revision_id string>,
  score: {
    prediction: <boolean decision result>,
    probability: {
      true: <probability of being reverted>,
      false: <probability of being NOT reverted>
    }
  }
}
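
A client might read a response with this shape as sketched below. The sketch assumes the service returns standard JSON with quoted keys matching the schema above; the `RevertRiskScore` class and `parse_score` function are hypothetical names, not part of the service.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RevertRiskScore:
    """Hypothetical typed view of the output schema."""
    lang: str
    rev_id: str
    prediction: bool
    p_reverted: float
    p_not_reverted: float


def parse_score(payload: str) -> RevertRiskScore:
    """Parse a JSON response body that follows the output schema."""
    data = json.loads(payload)
    score = data["score"]
    probability = score["probability"]
    return RevertRiskScore(
        lang=data["lang"],
        rev_id=str(data["rev_id"]),  # normalized to a string, as in the schema
        prediction=bool(score["prediction"]),
        p_reverted=float(probability["true"]),
        p_not_reverted=float(probability["false"]),
    )
```

A missing key raises `KeyError`, which makes a changed or truncated response fail loudly instead of producing a silent default.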
Example input and output

Input

curl https://api.wikimedia.org/service/lw/inference/v1/models/revertrisk-multilingual:predict -X POST -d '{"rev_id": 123855516, "lang":"ru"}'

Output

{
  lang: "ru",
  rev_id: 123855516,
  score: {
    prediction: true,
    probability: {
      true: 0.9392203688621521,
      false: 0.0607796311378479
    }
  }
}
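
The same request can be issued from Python using only the standard library. This is a minimal sketch: the endpoint and request body are taken from the curl example above, while the `Content-Type` header, the timeout value, and the function names are assumptions.

```python
import json
import urllib.request

ENDPOINT = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "revertrisk-multilingual:predict"
)


def build_request(rev_id: int, lang: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"rev_id": rev_id, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},  # assumption, not in the curl example
        method="POST",
    )


def fetch_score(rev_id: int, lang: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(rev_id, lang), timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a network call):
#   result = fetch_score(123855516, "ru")
```

Because the model can be slow on large revisions, an explicit timeout is worth setting on the client side.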

Data

The model was trained on a dataset collected from two tables in the Wikimedia Data Lake: MediaWiki History and Wikitext History. We used the 2022-07 snapshot, with an observation period from 2022-01-01 to 2022-07-01 (6 months) for training and the following week for testing. We also filtered out revisions related to edit wars and revisions created by bots.
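
The time-based split can be sketched as below. The dates come from the paragraph above; whether each boundary is inclusive or exclusive is not stated, so the half-open intervals used here are an assumption, as is the function name `assign_split`.

```python
from datetime import datetime
from typing import Optional

TRAIN_START = datetime(2022, 1, 1)
TRAIN_END = datetime(2022, 7, 1)   # assumed exclusive end of the training period
TEST_END = datetime(2022, 7, 8)    # "the following week", assumed exclusive


def assign_split(revision_timestamp: datetime) -> Optional[str]:
    """Return "train", "test", or None for a revision timestamp."""
    if TRAIN_START <= revision_timestamp < TRAIN_END:
        return "train"
    if TRAIN_END <= revision_timestamp < TEST_END:
        return "test"
    return None  # outside the observation window
```

Splitting by time rather than at random keeps every test revision later than every training revision, which matches how the model is used in production.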


Data Pipeline

The data was collected using Wikimedia Data Lake and Wikimedia Analytics cluster.

For each language, we collected revision data. We then merged in the wikitext data and extracted the required features from the content using user-defined functions (UDFs). The data collection pipeline for one language can be found in the data collection script.
Training data
  • Data period: 6 months
  • Number of revisions: 8,586,362
  • IP users edits rate: 0.17
  • Revert rate: 0.08
  • Random sample of up to 300,000 revisions per language
Test data
  • Data period: 1 week
  • Number of revisions: 1,079,265
  • IP users edits rate: 0.19
  • Revert rate: 0.07
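
To give a sense of scale, the rates above can be turned into approximate counts. The rates are rounded to two decimals in this card, so the results are rough estimates only; the helper name `approx_count` is illustrative.

```python
def approx_count(total: int, rate: float) -> int:
    """Estimate a subgroup size from a total and a rounded rate."""
    return round(total * rate)


TRAIN_TOTAL = 8_586_362
TEST_TOTAL = 1_079_265

approx = {
    "train_reverted": approx_count(TRAIN_TOTAL, 0.08),
    "train_ip_edits": approx_count(TRAIN_TOTAL, 0.17),
    "test_reverted": approx_count(TEST_TOTAL, 0.07),
    "test_ip_edits": approx_count(TEST_TOTAL, 0.19),
}

for name, value in approx.items():
    print(f"{name}: ~{value:,}")
```

The low revert rate in both sets means the classes are imbalanced, which is worth keeping in mind when choosing a decision threshold on the predicted probability.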

Licenses

Citation

Cite this model as:

@inproceedings{trokhymovych2023fair,
  title={Fair multilingual vandalism detection system for Wikipedia},
  author={Trokhymovych, Mykola and Aslam, Muniza and Chou, Ai-Jou and Baeza-Yates, Ricardo and Saez-Trumper, Diego},
  booktitle={Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages={4981--4990},
  year={2023}
}

References

  1. https://stats.wikimedia.org/
  2. Zia, Leila and Johnson, Isaac and Mansurov, Bahodir and Morgan, Jonathan and Redi, Miriam and Saez-Trumper, Diego and Taraborelli, Dario. 2019. Knowledge Integrity. https://doi.org/10.6084/m9.figshare.7704626
  3. https://www.mediawiki.org/wiki/ORES
  4. https://huggingface.co/bert-base-multilingual-cased
  5. https://github.com/geohci/edit-types