Objective Revision Evaluation Service/draftquality

This model was trained on the comments that admins leave when they delete pages (see mw:Manual:Logging_table). It is useful for supporting new page curation workflows. See en:WP:CSD for the list of speedy deletion criteria on English Wikipedia. For the English model, we used G3 "vandalism", G10 "attack", and G11 "spam".

This model is trained to identify the most problematic new pages: articles that must be deleted immediately. Other issues with new articles, such as a lack of notability, are less immediately concerning, and this model does not make that type of judgement.
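
As an illustration of the labelling idea described above, the following is a minimal sketch of how a deletion log comment could be mapped to one of the three labels. It is not the code used to build the training set: the names `CSD_LABELS`, `CSD_CODE` and `label_from_deletion_comment` are hypothetical, and the regular expression assumes that the comment mentions the criterion code (for example "G11") as a separate token.

```python
import re
from typing import Optional

# Hypothetical mapping from CSD codes to labels, based on the three
# criteria named in the text (G3, G10, G11).
CSD_LABELS = {"G3": "vandalism", "G10": "attack", "G11": "spam"}

# Assumption: the code appears as a whole token such as "G3" or "G11".
CSD_CODE = re.compile(r"\bG(\d{1,2})\b", re.IGNORECASE)


def label_from_deletion_comment(comment: str) -> Optional[str]:
    """Return the label for the first recognised CSD code, or None."""
    for match in CSD_CODE.finditer(comment):
        code = "G" + match.group(1)
        if code in CSD_LABELS:
            return CSD_LABELS[code]
    # No G3/G10/G11 code found: this sketch assigns no label.
    return None


if __name__ == "__main__":
    print(label_from_deletion_comment("[[WP:CSD#G11|G11]]: Unambiguous advertising"))
    print(label_from_deletion_comment("[[WP:CSD#A7|A7]]: No indication of importance"))
```

A comment citing any other criterion (for example A7, which concerns importance) returns `None`, in line with the note above that the model does not judge less urgent problems.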

Contexts (Wikis)

English Wikipedia (enwiki)

https://ores.wmflabs.org/v2/scores/enwiki/draftquality/?model_info

ScikitLearnClassifier
 - type: GradientBoosting
 - params: min_samples_leaf=1, max_depth=7, warm_start=false, n_estimators=700, min_samples_split=2, min_weight_fraction_leaf=0.0, center=false, balanced_sample=false, max_features="log2", subsample=1.0, scale=false, verbose=0, max_leaf_nodes=null, random_state=null, balanced_sample_weight=false, loss="deviance", init=null, presort="auto", learning_rate=0.01
 - version: 0.0.1
 - trained: 2017-01-24T17:12:29.354134

Table:
	             ~OK    ~attack    ~spam    ~vandalism
	---------  -----  ---------  -------  ------------
	OK         24848          5     1081           323
	attack        82        250      499          1228
	spam         607         17    15929          1146
	vandalism    647        197     1814          3845

Accuracy: 0.854
ROC-AUC:
	-----------  -----
	'OK'         0.983
	'attack'     0.93
	'spam'       0.97
	'vandalism'  0.923
	-----------  -----

F1:
	---------  -----
	spam       0.86
	attack     0.198
	vandalism  0.589
	OK         0.948
	---------  -----
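
The accuracy and F1 figures above can be checked against the confusion table. The sketch below assumes that each row holds the counts for one actual label and each "~" column the counts for one predicted label. Both statistics come out the same if that orientation is reversed. The function names are illustrative and are not part of ORES. ROC-AUC cannot be recovered this way, because it depends on the predicted probabilities rather than on the counts.

```python
LABELS = ["OK", "attack", "spam", "vandalism"]

# Counts copied from the table: one row per label, one column per "~" label.
TABLE = [
    [24848, 5, 1081, 323],
    [82, 250, 499, 1228],
    [607, 17, 15929, 1146],
    [647, 197, 1814, 3845],
]


def accuracy(table):
    """Fraction of observations on the diagonal of the table."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return correct / total


def f1_scores(table, labels):
    """Per-label F1, computed as 2*TP / (row total + column total)."""
    scores = {}
    for i, label in enumerate(labels):
        true_pos = table[i][i]
        row_total = sum(table[i])                    # TP + misses
        col_total = sum(row[i] for row in table)     # TP + false alarms
        scores[label] = 2 * true_pos / (row_total + col_total)
    return scores


if __name__ == "__main__":
    print("Accuracy: %.3f" % accuracy(TABLE))
    for label, score in f1_scores(TABLE, LABELS).items():
        print("%-10s %.3f" % (label, score))
```

The values computed this way fall within about 0.001 of the reported figures. The very low F1 for "attack" follows directly from the table: most attack pages are counted under ~vandalism or ~spam rather than ~attack.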