Research:Patroller work load
This page in a nutshell: The number of new pages that human editors patrol has been going down since 2007. This suggests that the workload of new page patrollers has also been decreasing.
When a new editor creates an article, how the community treats both the article and its author has a large impact on whether that editor continues to edit. One factor that may strongly influence whether new article authors are bitten is whether those doing new page patrol are overworked. Editors who are overworked and stressed are likely to feel less patient and accommodating towards new editors, so measuring the workload of new page patrollers is important if the movement is going to figure out how to be more open and welcoming to new editors.
- Has the workload for new page patrollers increased over time? (RQ1.16)
New Page Patrol is a vital function of many Wikipedias as the front line of interaction between new authors and community members devoted to policing the quality of the project. It has a variety of detailed, quite complex possible actions for patrolling pages in all namespaces. Below is a simplified flow of possible outcomes and actions in patrolling new articles in the main namespace.
This sprint analyzed all page patrolling activity for each calendar month of Wikipedia history from 2007 (when the patrol logs were started) up to June of 2011. We initially wanted to cover the activity of both new page patrol and vandalfighting, but due to downtime and time constraints we reoriented towards patrolling only. Note: This analysis only measures patrolling that appears in the log. If patrolling outside the queue occurred, it has been discounted.
Page patrolling activity was gathered from the Wikipedia database with the following query:
SELECT * FROM logging WHERE log_type = 'patrol' AND log_action = 'patrol';
Patrolling events were then grouped by the user who performed the action and the day the action took place. All plots and statistics come from this dataset.
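The grouping step can be sketched in Python as follows. This is a minimal illustration, not the code used for the analysis: it assumes each log row has been reduced to a (user, timestamp) pair with the timestamp in MediaWiki's 14-digit YYYYMMDDHHMMSS form, the function names and sample rows are hypothetical, and it assumes the monthly mean is taken over patrollers with at least one logged action in that month.

```python
from collections import Counter, defaultdict


def group_by_user_day(rows):
    """Count patrol actions per (user, day).

    rows: iterable of (user, timestamp) pairs, where timestamp is a
    'YYYYMMDDHHMMSS' string (assumed input shape, not the raw table row).
    """
    counts = Counter()
    for user, timestamp in rows:
        counts[(user, timestamp[:8])] += 1  # first 8 characters are the day
    return counts


def mean_actions_per_user_month(day_counts):
    """Roll user-day counts up to user-months, then average per month.

    The denominator is the number of users with at least one patrol
    action in that month (an assumption of this sketch).
    """
    per_user_month = Counter()
    for (user, day), n in day_counts.items():
        per_user_month[(user, day[:6])] += n  # first 6 characters are the month

    by_month = defaultdict(list)
    for (user, month), n in per_user_month.items():
        by_month[month].append(n)

    return {month: sum(v) / len(v) for month, v in sorted(by_month.items())}


# Hypothetical sample rows, for illustration only.
rows = [
    ("Alice", "20071101120000"),
    ("Alice", "20071101130500"),
    ("Alice", "20071115090000"),
    ("Bob", "20071102080000"),
    ("Bob", "20071201100000"),
]
day_counts = group_by_user_day(rows)
monthly_means = mean_actions_per_user_month(day_counts)
```

With these sample rows, Alice has three actions and Bob one in November 2007, so that month's mean is 2.0 actions per active patroller.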
Removing bots
Error bars
Standard error bars are included in each plot where applicable. In some plots they are too small to be visible due to the high number of observations.
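For reference, the standard error of a mean is the sample standard deviation divided by the square root of the number of observations, which is why the bars shrink as observations accumulate. A minimal sketch follows; the function name and sample values are hypothetical, and the page does not state which estimator was actually used.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(values) / math.sqrt(n)


# Hypothetical per-patroller counts for one month, for illustration only.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
sem = standard_error(counts)
```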
Results and discussion
Patrolling work is distributed across users as a power law (see figure above). This means that a very small proportion of English Wikipedia editors are doing the majority of page patrolling work. The top 50 patrollers plot to the right shows the change in editor trends since page patrol logging began in 2007. From 2008 onward, a small cohort of editors is responsible for the majority of page patrolling that takes place. The increasing activity of these top 50 editors suggests that the workload required of each page patroller is increasing.
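One simple way to quantify this concentration is the share of all patrol actions performed by the N most active patrollers. The sketch below assumes a mapping from user name to total patrol actions; the function name and the example numbers are hypothetical and are not taken from the dataset.

```python
def top_n_share(actions_per_user, n=50):
    """Fraction of all patrol actions performed by the n most active users."""
    totals = sorted(actions_per_user.values(), reverse=True)
    grand_total = sum(totals)
    if grand_total == 0:
        return 0.0
    return sum(totals[:n]) / grand_total


# Hypothetical, heavily skewed counts, for illustration only.
example = {"A": 900, "B": 60, "C": 25, "D": 10, "E": 5}
share = top_n_share(example, n=1)
```

In this made-up example a single user accounts for 90% of the actions, which is the kind of skew a power-law distribution produces.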
Patrolling activity by month vs year
When examining the amount of patrolling that patrollers do in a year, the workload per patroller appears to be decreasing linearly.
Patrollers' monthly activities also appear to be decreasing linearly (β=-1.050 p<0.001). The Mean patrolling actions per user-month plot shows the decreasing trend in the amount of patrolling per editor-month. This trend suggests that editors are doing less patrolling on a monthly basis.
Both of these plots refute the hypothesis that patroller workload is increasing on a per user basis.
Mean patrolling actions per user-month: The average number of patrolled pages per patroller-month is plotted for each month of complete data. A best fit regression line (β=-1.050, p<0.001) is plotted with the data.
Mean patrolling actions per user-year: The average number of patrolled pages per patroller-year is plotted for each year of complete data. The best fit regression line (β=-35.72, p=0.0279) is plotted.
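The β values reported with these plots are slopes of best fit lines. A minimal ordinary least squares sketch is given below, assuming the x-axis is a simple time index (month or year number); the function name and the sample series are hypothetical, and the sketch does not compute the p-values reported above.

```python
def least_squares_fit(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical series: mean actions per patroller falling by one each month.
months = [0, 1, 2, 3, 4]
means = [100.0, 99.0, 98.0, 97.0, 96.0]
slope, intercept = least_squares_fit(months, means)
```

A negative slope, as in this made-up series, corresponds to a decreasing per-patroller workload over time.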
Workload for just the top 50
The result that workload was decreasing was surprising, to say the least, so we refocused our analysis on the workload of the top 50 users. Again we found that the per-patroller workload has been decreasing.
Mean patrolling actions per user-month (top 50): The average number of patrolled pages per patroller-month is plotted for the top 50 patrollers in each month of complete data. A best fit regression line (β=-14.81, p<0.001) is plotted with the data.
Mean patrolling actions per (top 50) user-year: The average number of patrolled pages per top 50 patroller is plotted for the years with complete data. A best fit regression line (β=-1.986, p=0.067) is plotted with the data.
The results of this analysis refute the hypothesis that page patrollers' workload is increasing. In fact, the workload has been decreasing and has been distributed among more patrollers than it used to be. Factors that may have contributed to this effect include the increased involvement of bots and autopatrolled editors, which reduces the number of pages that patrollers must patrol, and a generally decreasing number of articles being created every month.
Future work
- It is quite easy to run these same queries across other languages, so replicating the study across Wikipedias other than English is our next step.
- Are editors more or less specialized (performing only these tasks) than they used to be? Working harder over the years solely on new page patrol or vandalfighting may be leading to higher burnout rates among experienced editors, as well as a more hostile environment for new editors who are dealing with patrollers whose patience has worn thin.