Research:Blocks on the English-language Wikipedia

This page documents a completed research project.


Key Personnel

Project Summary

A vast amount of research has explored how Wikipedia defends itself against ill-intentioned users. Most of it has focused on vandalism (bad-faith edits made by those users) and on the most direct consequence of those edits, reversion. The objective of this research project is to explore the ultimate consequence of poor intentions, namely blocking. After investigating trends in the overall block rate, we take the block logs from January 2006 to September 2013 and use a series of regular expressions to categorise each block into one of six categories:

  1. Spam: blocks for using Wikipedia for advertising purposes;
  2. Disruption: blocks for BLP violations, defamation, personal attacks, threats (legal or otherwise), copyright violations, edit warring and POV-pushing, broadly construed;
  3. Sockpuppetry: the use of multiple accounts in violation of Wikipedia's policies, or long-term, multiple-account abuse of Wikipedia;
  4. Username blocks: blocks for violating Wikipedia's username policies;
  5. Proxy usage: the blocking of proxies;
  6. Misc: blocks for reasons not identified by the regular expressions.
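
The categorisation step above can be sketched as follows. This is a minimal illustration only: the project's actual regular expressions are not reproduced on this page, so every pattern below, the `CATEGORY_PATTERNS` table and the `categorise_block` function are hypothetical stand-ins. The sketch assumes each block reason is tested against the patterns in a fixed order, with the first match winning and anything unmatched falling into Misc.

```python
import re

# Hypothetical patterns: stand-ins for the project's real regular
# expressions, which are not given on this page.
CATEGORY_PATTERNS = [
    ("Spam", re.compile(r"spam|advertis|promotion", re.IGNORECASE)),
    ("Disruption", re.compile(
        r"\bblp\b|defam|personal attack|threat|copyright|copyvio"
        r"|edit[ -]?warr|\bpov\b|disrupt",
        re.IGNORECASE)),
    ("Sockpuppetry", re.compile(
        r"sock|multiple accounts|long[ -]term abuse", re.IGNORECASE)),
    ("Username", re.compile(r"username", re.IGNORECASE)),
    ("Proxy", re.compile(r"prox", re.IGNORECASE)),
]


def categorise_block(reason):
    """Return the first category whose pattern matches the block reason.

    Patterns are tried in list order, so an earlier category wins when a
    reason matches more than one. Unmatched reasons are labelled "Misc".
    """
    for category, pattern in CATEGORY_PATTERNS:
        if pattern.search(reason):
            return category
    return "Misc"


if __name__ == "__main__":
    for reason in ["Edit warring", "{{blocked proxy}}", "Requested by user"]:
        print(reason, "->", categorise_block(reason))
```

Because the first match wins, the order of the table matters: a reason that mentions both spam and a username violation would be counted as Spam here. Whether the original project resolved such overlaps in this way is not stated.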

The resulting data is then examined and compared against factors that could confound the block rate (such as registration rates or AbuseFilter hits) in an attempt to answer three core research questions:

  1. Has there been any noticeable shift in the types of actions that users are blocked for?
  2. Has there been any noticeable shift in the overall block rate?
  3. If either is true: why?

Results

Shifts in the rate and type of user blocks

The first task is to investigate whether there have been any shifts in the rate and type of user blocks. Because the actions of one group inevitably affect the other, the dataset was split into two groups prior to analysis: one consisting of blocks of anonymous users, and one consisting of blocks of registered users. In both cases, data was gathered primarily from the logging table, and consists of all block actions between January 2006 and September 2013, excluding unblocks and modifications of existing blocks.
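
The filtering and splitting described above might look something like the sketch below. It assumes rows from the MediaWiki logging table have already been loaded as dictionaries keyed by the column names `log_type`, `log_action`, `log_title` and `log_timestamp`, that new blocks carry the action `block` while `unblock` and `reblock` mark the excluded cases, and that a blocked target counts as anonymous when its title parses as an IP address or range. The helper names, the IP-based test and the exact date bounds (taken here to cover all of September 2013) are assumptions made for illustration, not the project's documented method.

```python
import ipaddress

# Assumed bounds in MediaWiki's YYYYMMDDHHMMSS timestamp format; the page
# only states "between January 2006 and September 2013".
START = "20060101000000"
END = "20130930235959"


def is_anonymous_target(title):
    """Return True if the blocked target parses as an IP address or range."""
    try:
        ipaddress.ip_network(title, strict=False)
        return True
    except ValueError:
        return False


def split_block_log(rows):
    """Split new-block log rows into (anonymous, registered) lists.

    Rows that are not new blocks (for example unblocks, or reblocks that
    modify an existing block) and rows outside the date range are skipped.
    """
    anonymous, registered = [], []
    for row in rows:
        if row["log_type"] != "block" or row["log_action"] != "block":
            continue
        # Fixed-width numeric timestamps compare correctly as strings.
        if not (START <= row["log_timestamp"] <= END):
            continue
        if is_anonymous_target(row["log_title"]):
            anonymous.append(row)
        else:
            registered.append(row)
    return anonymous, registered
```

Testing the title against `ipaddress.ip_network` with `strict=False` lets both single addresses and CIDR ranges count as anonymous targets; any title that fails to parse is treated as a registered username.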

Overall shifts

Proportionate shifts

Exploring declines

Sudden decline (2008-2009)

Constant decline (2009-2013)

References

External links

Conclusion