Research:New page reviewer impact analysis/Number of new page patrollers
How has the implementation of the New Page Reviewer Right impacted the number of people doing new page reviews?
The new page reviewer right restricted the ability to review new pages to users who hold the right. Here we use a simple metric, the number of users performing new page patrol each month, to compare the periods before and after the user right was implemented. The workflow used to obtain this metric is described below.
Getting data
The number of reviews performed by each user per month was obtained by running the SQL query below on Quarry.
USE enwiki_p;
SELECT EXTRACT(YEAR FROM DATE_FORMAT(log_timestamp, '%Y%m%d%H%i%s')) AS `year`,
       EXTRACT(MONTH FROM DATE_FORMAT(log_timestamp, '%Y%m%d%H%i%s')) AS `month`,
       log_user,
       COUNT(*) AS reviews_performed
FROM logging_logindex
WHERE log_type = 'pagetriage-curation'
  AND log_timestamp BETWEEN 20151101000000 AND 20170801000000
GROUP BY `year`, `month`, log_user
ORDER BY `year` ASC,
         `month` ASC;
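To make the grouping in the query concrete, the following minimal Python sketch mirrors it on a few made-up log rows. The sample rows, the user IDs, and the helper name `reviews_per_user_month` are hypothetical; the only thing assumed about the real data is that `log_timestamp` is a 14-digit `YYYYMMDDHHMMSS` value, as the format string in the query implies.

```python
from collections import Counter
from datetime import datetime


def reviews_per_user_month(rows):
    """Count log rows per (year, month, log_user), like the GROUP BY in the query."""
    counts = Counter()
    for log_timestamp, log_user in rows:
        ts = datetime.strptime(log_timestamp, "%Y%m%d%H%M%S")
        counts[(ts.year, ts.month, log_user)] += 1
    # Sort by year and month (then by user, to keep the order stable)
    return sorted(counts.items())


# Hypothetical sample rows: (log_timestamp, log_user)
sample_rows = [
    ("20161030120000", 101),
    ("20161101080000", 101),
    ("20161115093000", 101),
    ("20161120170000", 202),
]

result = reviews_per_user_month(sample_rows)
for (year, month, user), reviews in result:
    print(year, month, user, reviews)
```

Each output row corresponds to one user in one month, which is why the parsing step can later count rows per month to get the number of active reviewers.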
Parsing dataset
The downloaded dataset was then parsed with the Python script below to generate the graph and the rows of the dataset table:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Quarry export: one row per (year, month, log_user) with a reviews_performed count
review_usersset = 'quarry-20824-users-doing-page-reviews-run196795.tsv'
col = 'log_user'
df = pd.read_csv(review_usersset, delimiter='\t')

# get total years to iterate on
years = df['year'].unique()
review_users = np.array([])
avg_reviews = np.array([])
for y in years:
    df_tmp = df[df['year'] == y]
    # Get unique months in the year
    months = df_tmp['month'].unique()
    for m in months:
        # Filter with a mask built from df_tmp itself, not from the full frame
        df_month = df_tmp[df_tmp['month'] == m]
        # Add per month review users to the array (each row is one user in that month)
        review_users = np.append(review_users, df_month[col].count())
        # Add per month average reviews to the array
        avg_reviews = np.append(avg_reviews, df_month['reviews_performed'].mean())

# Generate year-months for x-axis (first day of each month, starting November 2015)
months = pd.date_range('2015-11', periods=review_users.shape[0], freq='MS')

# Write the rows of the wiki table
with open('reviewers_parser.wiki', 'w') as f:
    for i, m in enumerate(months):
        f.write('|-\n|{:%Y-%m}\n|{}\n|{}\n'.format(m, review_users[i], avg_reviews[i]))

plt.figure()
plt.plot(months, review_users, label="users doing review")
plt.plot(months, avg_reviews, label="mean review per user that month")
plt.ylabel('Users reviewing / Mean reviews per user')
plt.xlabel('Months')

# Mark the month in which the NPP user right was implemented
rollout = pd.Timestamp('2016-11-01')
plt.axvline(rollout, color='b', linestyle='dashed', linewidth=2, label="NPP right implementation")
plt.legend()
plt.text(rollout, plt.gca().get_ylim()[1] + 5, 'NPP user right implementation', ha='center', va='center')
plt.show()
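The per-month loop in the script can also be expressed as a single pandas `groupby`. The sketch below is an alternative formulation, not the script that produced the published figures. It runs on a tiny made-up frame that uses the same column names as the Quarry export; the output column names `reviewers` and `mean_reviews` are chosen here purely for illustration.

```python
import pandas as pd

# Hypothetical rows shaped like the Quarry export (one row per user per month)
df = pd.DataFrame({
    "year": [2016, 2016, 2016, 2017],
    "month": [11, 11, 12, 1],
    "log_user": [101, 202, 101, 303],
    "reviews_performed": [10, 30, 50, 7],
})

# Group once by (year, month); groupby sorts the keys chronologically here
monthly = (
    df.groupby(["year", "month"])
      .agg(
          reviewers=("log_user", "nunique"),           # distinct users that month
          mean_reviews=("reviews_performed", "mean"),  # average reviews per user
      )
      .reset_index()
)
print(monthly)
```

Grouping on both keys at once avoids the nested year and month loops and does not depend on the rows arriving in sorted order.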
Results
The number of users doing New Page Patrol decreased over the period covered, as shown by the plot.
Some useful observations can be made:
- The number of users performing new page patrol fell from 232 in November 2015 to 170 in July 2017. The decline was not steady: monthly counts stayed between 204 and 241 until October 2016.
- The number of users doing NPP dropped sharply just after the NPP right was implemented, from 276 in November 2016 to 176 in December 2016.
- The average number of reviews per user per month varied between roughly 50 and 107 before November 2016, then rose to between roughly 91 and 137 afterwards. This suggests that users holding the New Page Patrol right have been doing more work per person than before.
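These observations can be checked against the monthly figures listed in the Dataset section. The sketch below is a rough check rather than part of the original analysis: it copies those figures (averages rounded to two decimals), treats November 2016 as the implementation month as the plotting script does, and leaves that month out of both groups, which is an assumption made here. The means of the average reviews are unweighted means of monthly values.

```python
from statistics import mean

# (year-month, number of reviewers, average reviews per user), from the Dataset table
dataset = [
    ("2015-11", 232, 88.85), ("2015-12", 216, 87.40), ("2016-01", 224, 93.00),
    ("2016-02", 205, 89.83), ("2016-03", 204, 99.60), ("2016-04", 220, 107.11),
    ("2016-05", 236, 95.03), ("2016-06", 212, 51.32), ("2016-07", 205, 63.77),
    ("2016-08", 241, 50.02), ("2016-09", 240, 57.41), ("2016-10", 225, 98.72),
    ("2016-11", 276, 56.97), ("2016-12", 176, 90.58), ("2017-01", 151, 111.52),
    ("2017-02", 179, 121.51), ("2017-03", 155, 109.15), ("2017-04", 151, 137.21),
    ("2017-05", 171, 126.40), ("2017-06", 162, 121.44), ("2017-07", 170, 126.56),
]

ROLLOUT = "2016-11"  # implementation month; excluded from both groups (assumption)

# "YYYY-MM" strings compare chronologically, so plain string comparison works
before = [row for row in dataset if row[0] < ROLLOUT]
after = [row for row in dataset if row[0] > ROLLOUT]

reviewers_before = mean(row[1] for row in before)
reviewers_after = mean(row[1] for row in after)
reviews_before = mean(row[2] for row in before)
reviews_after = mean(row[2] for row in after)

print("mean reviewers per month:", round(reviewers_before, 1), "->", round(reviewers_after, 1))
print("mean reviews per user:", round(reviews_before, 1), "->", round(reviews_after, 1))
```

With these figures the mean number of monthly reviewers falls from about 222 to about 164, while the mean of the monthly reviews per user rises from about 82 to about 118.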
Dataset
Year-Month | # of reviewers | Average reviews per user |
---|---|---|
2015-11 | 232.0 | 88.84913793103448 |
2015-12 | 216.0 | 87.39814814814815 |
2016-01 | 224.0 | 93.00446428571429 |
2016-02 | 205.0 | 89.82926829268293 |
2016-03 | 204.0 | 99.6029411764706 |
2016-04 | 220.0 | 107.11363636363636 |
2016-05 | 236.0 | 95.02542372881356 |
2016-06 | 212.0 | 51.320754716981135 |
2016-07 | 205.0 | 63.765853658536585 |
2016-08 | 241.0 | 50.024896265560166 |
2016-09 | 240.0 | 57.40833333333333 |
2016-10 | 225.0 | 98.72444444444444 |
2016-11 | 276.0 | 56.97463768115942 |
2016-12 | 176.0 | 90.57954545454545 |
2017-01 | 151.0 | 111.51655629139073 |
2017-02 | 179.0 | 121.50837988826815 |
2017-03 | 155.0 | 109.14838709677419 |
2017-04 | 151.0 | 137.20529801324503 |
2017-05 | 171.0 | 126.39766081871345 |
2017-06 | 162.0 | 121.44444444444444 |
2017-07 | 170.0 | 126.55882352941177 |