Research talk:Autoconfirmed article creation trial/Work log/2018-02-16

Friday, February 16, 2018

Today I'll work on H13, then work on checking our AfC dataset before completing H22, and possibly H17.

H13: The size of the backlog of articles in the New Page Patrol queue will remain stable.

There is no historical data available on the size of the New Page Patrol queue, so we've gathered our own dataset since late August 2017. The dataset contains a count of the number of non-redirect main namespace pages in the NPP queue, recorded four times per day. We decided to focus on a daily measurement of the size, and therefore only keep the entry that's recorded shortly after midnight UTC on any given day. Plotting this data from September 1 to November 15, which covers the first two months of ACTRIAL, gives the following graph (note that the Y-axis is truncated and the lowest shown value is 12,000 articles):
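The daily filtering step described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function name `daily_snapshots`, the record layout, the snapshot times, and the sample counts are all hypothetical. It assumes each record is a (UTC timestamp, queue size) pair and that "shortly after midnight" means the earliest snapshot of each UTC day.

```python
from datetime import datetime, timezone


def daily_snapshots(records):
    """Keep, for each UTC date, the earliest snapshot recorded that day.

    records: iterable of (timestamp, queue_size) pairs, where timestamp is
    a timezone-aware datetime in UTC. Returns a dict of date -> queue_size,
    ordered by date.
    """
    earliest = {}
    for ts, size in records:
        day = ts.date()
        # Replace the stored entry only if this snapshot is earlier in the day.
        if day not in earliest or ts < earliest[day][0]:
            earliest[day] = (ts, size)
    return {day: pair[1] for day, pair in sorted(earliest.items())}


# Made-up sample: four snapshots per day (times and counts are illustrative only).
sample = [
    (datetime(2017, 9, 1, 12, 5, tzinfo=timezone.utc), 103),
    (datetime(2017, 9, 1, 0, 5, tzinfo=timezone.utc), 100),
    (datetime(2017, 9, 1, 6, 5, tzinfo=timezone.utc), 101),
    (datetime(2017, 9, 1, 18, 5, tzinfo=timezone.utc), 104),
    (datetime(2017, 9, 2, 0, 5, tzinfo=timezone.utc), 98),
    (datetime(2017, 9, 2, 6, 5, tzinfo=timezone.utc), 99),
]

daily = daily_snapshots(sample)
for day, size in daily.items():
    print(day.isoformat(), size)
```

Picking the earliest snapshot per day, rather than matching an exact time, keeps the sketch robust if the recording job runs a few minutes late.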

 

Looking at the graph, one could describe it as having five phases. The first phase lasts the first week of September, where we see a rapid decrease of the queue from 16,464 articles down to 14,315 articles, a reduction of 13.1%. There's then a second phase until the start of ACTRIAL where the size of the queue stabilizes around 14,500. The third phase starts at midnight on September 15, about an hour and a half after ACTRIAL started. At that point the size of the queue is 14,353 articles, and from then on up to the end of September the queue decreases almost every day. At the end of the month it is 12,954 articles, down 9.7% since the start of the trial.

We see a different pattern in October, the fourth phase, where on most days the queue continues to decrease but at a slower rate. There are some days where it increases, and some of those increases are fairly large. The lowest point on the graph is on October 30, where there are 12,371 articles in the queue. That is a drop of 4.5% since the start of the month, less than half the percentage drop that happened during the first two weeks of ACTRIAL.
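The percentage changes quoted for the first, third, and fourth phases can be checked with a few lines of arithmetic. The helper name `pct_drop` is hypothetical. For October, this sketch assumes the start-of-month baseline equals the end-of-September figure of 12,954 articles, which the log does not state explicitly.

```python
def pct_drop(start, end):
    """Percentage decrease from start to end, relative to start."""
    return (start - end) / start * 100


# Queue sizes taken from the text above.
first_week = pct_drop(16464, 14315)   # first week of September
trial_sept = pct_drop(14353, 12954)   # September 15 to end of September
# Assumption: the October baseline is the end-of-September value.
october = pct_drop(12954, 12371)      # start of October to October 30

print(f"First week of September: {first_week:.1f}%")
print(f"First two weeks of ACTRIAL: {trial_sept:.1f}%")
print(f"October: {october:.1f}%")
```

Under that assumption the three figures round to 13.1%, 9.7%, and 4.5%, matching the text.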

Lastly we have the fifth phase from October 30 to the end of the graph, where the queue increases on thirteen out of fifteen days. At the end the queue contains 13,116 unreviewed articles.

In order to formalize these patterns a bit more, we also calculated the slope of the graph for each day, using a seven-day moving window. This means that we look three days back and three days forward from the current day and calculate the average daily increase or decrease of the queue across that seven-day span. We use a seven-day span because Wikipedia's activity tends to follow a weekly cycle, and because it also reduces some of the variance in the result. If the slope is negative, the queue is decreasing; similarly, if it is positive, the queue is increasing. The graph looks like this:
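One way to read the seven-day slope described above is as the average daily change between the value three days back and the value three days forward. The sketch below implements that reading; the function name `centered_slope` and the sample values are hypothetical, and the log does not say whether the actual calculation used this difference or, for example, a least-squares fit over the window.

```python
def centered_slope(sizes, half_window=3):
    """Average daily change over a centered window of 2 * half_window + 1 days.

    sizes: list of daily queue sizes in chronological order.
    Returns a list of the same length; days without a full window
    on both sides get None.
    """
    slopes = []
    for i in range(len(sizes)):
        lo, hi = i - half_window, i + half_window
        if lo < 0 or hi >= len(sizes):
            slopes.append(None)  # not enough data on one side
        else:
            # Net change across the window divided by the days spanned.
            slopes.append((sizes[hi] - sizes[lo]) / (hi - lo))
    return slopes


# Made-up queue sizes: a steady decline of two articles per day.
example = [100, 98, 96, 94, 92, 90, 88]
slopes = centered_slope(example)
print(slopes)
```

With seven days of data only the middle day has a full window, so the sketch returns a single slope of -2.0 articles/day and `None` for the other days.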

 

This graph echoes the previous description of transitions in the graph. We see the initial phase with a large reduction of the queue, where the slope is consistently negative and the average reduction in articles/day is large. Next we see the more stable phase prior to ACTRIAL, followed by the initial two weeks of the trial where the queue again decreases, at times quite rapidly. In October the slope tends to be negative, but we can see that the magnitude is much lower than before and we have days with a positive slope. Lastly we see the beginning of November where the queue grows again.

H13 hypothesizes that the queue will remain stable. That has not been the case during the first two months of ACTRIAL. Instead, we have seen a large reduction of the queue followed by a substantial increase. H13 is therefore not supported.
