Research talk:Revision scoring as a service/Work log/2016-02-23

Tuesday, February 23, 2016

OK. Today I'm trying to do what we were doing with Urdu Wikipedia, but with Polish Wikipedia instead. Here's a list of 500K randomly sampled edits: http://quarry.wmflabs.org/query/7543

Prelabel is running now. Amir (talk) 18:24, 23 February 2016 (UTC)

OK. It's done:

(3.4)ladsgroup@ores-compute:~/editquality/datasets$ wc plwiki.prelabeled_revisions.500k_2015.tsv 
  499736  1933819 13821243 plwiki.prelabeled_revisions.500k_2015.tsv
(3.4)ladsgroup@ores-compute:~/editquality/datasets$ cat plwiki.prelabeled_revisions.500k_2015.tsv | grep "True" | wc
  82484  264812 1720937
(3.4)ladsgroup@ores-compute:~/editquality/datasets$ cat plwiki.prelabeled_revisions.500k_2015.tsv | grep "reverted" | wc
  14861   59444  416108

So 16.5% of edits need review. That's good :) And 3% are reverted.
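
Double-checking the arithmetic against the wc counts above (treating all 499,736 lines as revisions):

awk 'BEGIN { printf "%.1f%% need review, %.1f%% reverted\n", 82484/499736*100, 14861/499736*100 }'
# prints: 16.5% need review, 3.0% reverted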

I sampled 5K to load into Wikilabels:

(
  echo -e "rev_id\tneeds_review\treason";  # -e so the tabs in the header are real tabs
  (
    cat datasets/plwiki.prelabeled_revisions.500k_2015.tsv | \
    grep "True" | \
    shuf -n 2500; \
    cat datasets/plwiki.prelabeled_revisions.500k_2015.tsv | \
    grep "False" | \
    shuf -n 2500 \
  ) | \
  shuf \
) > datasets/plwiki.revisions_for_review.5k_2015.tsv
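
A quick sanity check on the output would look something like this (the expected counts are just what the sampling above should produce):

wc -l datasets/plwiki.revisions_for_review.5k_2015.tsv    # expect 5001 (header + 5000 revisions)
grep -c "True" datasets/plwiki.revisions_for_review.5k_2015.tsv    # expect 2500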

Using shuf, I extracted 20K revs to build the reverted model:

cat datasets/plwiki.sampled_revisions.500k_2015.tsv | \
    shuf -n 20000 > datasets/plwiki.sampled_revisions.20k_2015.tsv

Then we need to add "rev_id" as a header on the first line and check that "rev_id" didn't accidentally end up among the revs. (check) A minimal sketch of that step (assuming GNU sed) is below.
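
# Prepend the "rev_id" header line (GNU sed syntax):
sed -i '1i rev_id' datasets/plwiki.sampled_revisions.20k_2015.tsv

# "rev_id" should now appear exactly once, i.e. only as the header:
grep -c "rev_id" datasets/plwiki.sampled_revisions.20k_2015.tsv    # expect 1

With that in place, running label_reverted: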

cat datasets/plwiki.sampled_revisions.20k_2015.tsv | \
    ./utility label_reverted \
        --host https://pl.wikipedia.org \
        --revert-radius 3 \
        --verbose > datasets/plwiki.rev_reverted.20k_2015.tsv

It's labeling them.

Now I'm extracting features:

cat datasets/plwiki.rev_reverted.20k_2015.tsv | \
        revscoring extract_features \
                editquality.feature_lists.plwiki.reverted \
                --host https://pl.wikipedia.org \
                --include-revid \
                --verbose > \
        datasets/plwiki.features_reverted.20k_2015.tsv

OK. I ran the tuning reports and it turned out RF is the best. Strange. Everything I touch turns into RF :D
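
For the record, the tuning command itself isn't pasted here; in the editquality setup at the time it looked roughly like the following. The params-config path, the target statistic, and the exact flags are from memory and may not match the real invocation:

# rough reconstruction -- the config path, statistic, and flags are assumptions
cat datasets/plwiki.features_reverted.20k_2015.tsv | \
    revscoring tune \
        config/classifiers.params.yaml \
        editquality.feature_lists.plwiki.reverted \
        roc_auc \
        --label-type=bool \
        --debug > tuning_reports/plwiki.reverted.md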

Running with best settings:

>         revscoring train_test \
>                 revscoring.scorer_models.RF \
>                 editquality.feature_lists.plwiki.reverted \
>                 --version 0.1.0 \
>                 -p 'max_features="log2"' \
>                 -p 'criterion="entropy"' \
>                 -p 'min_samples_leaf=7' \
>                 -p 'n_estimators=640' \
>                 -s 'pr' -s 'roc' \
>                 -s 'recall_at_fpr(max_fpr=0.10)' \
>                 -s 'filter_rate_at_recall(min_recall=0.90)' \
>                 -s 'filter_rate_at_recall(min_recall=0.75)' \
>                 --balance-sample-weight \
>                 --center --scale \
>                 --label-type=bool > \
>         models/plwiki.reverted.rf.model
2016-02-23 22:21:47,424 INFO:revscoring.utilities.train_test -- Training model...
2016-02-23 22:22:08,411 INFO:revscoring.utilities.train_test -- Testing model...
ScikitLearnClassifier
 - type: RF
 - params: oob_score=false, scale=true, center=true, warm_start=false, criterion="entropy", random_state=null, max_leaf_nodes=null, class_weight=null, n_jobs=1, n_estimators=640, min_samples_leaf=7, min_weight_fraction_leaf=0.0, verbose=0, balanced_sample_weight=true, min_samples_split=2, max_depth=null, max_features="log2", bootstrap=true
 - version: 0.1.0
 - trained: 2016-02-23T22:22:08.411624

         ~False    ~True
-----  --------  -------
False      3723      155
True         50       66

Accuracy: 0.9486730095142714

PR-AUC: 0.327
Filter rate @ 0.9 recall: threshold=0.08, filter_rate=0.736, recall=0.905
Recall @ 0.1 false-positive rate: threshold=0.918, recall=0.017, fpr=0.0
Filter rate @ 0.75 recall: threshold=0.235, filter_rate=0.903, recall=0.75
ROC-AUC: 0.912
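
The accuracy figure checks out against the confusion matrix:

awk 'BEGIN { print (3723 + 66) / (3723 + 155 + 50 + 66) }'    # 0.948673

Since the test sample is about 97% non-reverted, accuracy by itself says little; PR-AUC, ROC-AUC, and the filter-rate-at-recall figures above are the more meaningful numbers.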

Look "Recall @ 0.1" Wooot! Amir (talk) 22:37, 23 February 2016 (UTC)[]

Size of the model:

(3.4)ladsgroup@ores-compute:~/editquality/models$ ls -Ssh | grep plwiki.reverted.rf.model
 17M plwiki.reverted.rf.model
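
To double-check the model file itself, one could score a single revision with revscoring's score utility. This is just a sketch, not a command from the log: the revision ID is a placeholder and the argument order may differ by revscoring version.

# hypothetical check -- substitute a real plwiki rev_id
revscoring score models/plwiki.reverted.rf.model \
    --host https://pl.wikipedia.org \
    12345678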

OK. We are good to go! Amir (talk) 22:51, 23 February 2016 (UTC)
