Research:Onboarding new Wikipedians/OB6/Contribution quality and type

For our sixth A/B test, we want to know...

  1. What kinds of edits are users in the redirect funnel making, especially compared to users in the GettingStarted copyedit funnel?
  2. What kinds of articles do users in the redirect funnel edit, and how do these compare to the copyediting pages we suggest?



To get a qualitative assessment of the articles and edits to them, we plan to draw a random sample of 250 revisions by editors in our test group. These revisions are all to the "returnto" page and made by editors who accepted the "Edit this page" call to action.

Our qualitative look is similar to what we previously assessed. Our past analysis showed that across tests, users in the GettingStarted "copyedit", "clarify", and "add links" task groups primarily did complete those suggested tasks. We also know that, left on their own, editors who are not offered a suggested task do a wider variety of things. The purpose of this handcoding round is to get a sense of what kinds of edits users might make to their returnto page. We'll code edits based on the following types:

  1. Adding or refactoring content: users making relevant editorial changes to the text or images. Does not include if a user left a Talk-style comment in a page, blanked without explanation, etc.
  2. Copy-editing: fixing of grammar and spelling errors without altering meaning.
  3. Formatting and markup: only fixing the wikitext of a page, adding internal links or templates, or changing its visual format (e.g. italics to bold).
  4. Test edits: users making edits that were not helpful and may be broken, but which were not bad faith.
  5. Vandalism or spam: users making edits that were very likely to be in bad faith, and needed to be reverted as vandalism or external link spamming. Includes blanking, changing random figures without explanation, inserting curses/insults, etc.
  6. Other: please describe

This handcoding analysis is done by three people, each experienced with examining Wikipedia edits, and coding independently. When analysis is completed, we will measure the level of agreement between coders and the average frequency given to each type of edit.
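The agreement and frequency measures described above could be computed along the following lines. This is a minimal sketch, not the analysis code actually used: it assumes Fleiss' kappa as the agreement statistic (the source does not name one), and the category labels and the function names `fleiss_kappa` and `mean_category_frequency` are hypothetical.

```python
from collections import Counter

# Hypothetical short labels for the six edit types listed above.
CATEGORIES = ["content", "copyedit", "formatting", "test", "vandalism", "other"]


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each coded by the same number of coders.

    ratings: list of lists, one inner list of category labels per revision.
    Note: kappa is undefined (division by zero) if every label is identical.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing coder pairs, averaged over items.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(p_items) / n_items

    # Agreement expected by chance, from the overall category proportions.
    p_cats = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p ** 2 for p in p_cats)

    return (p_observed - p_expected) / (1 - p_expected)


def mean_category_frequency(ratings, categories):
    """Frequency of each category, averaged across coders."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter(label for item in ratings for label in item)
    return {cat: totals[cat] / (n_items * n_raters) for cat in categories}
```

With three coders, `ratings` would hold 250 inner lists of three labels each, one per sampled revision.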


We can use some descriptive statistics to get a feel for the kinds of articles being edited by users in the test funnel. Measurements we're potentially interested in include:

  • Average page length
  • Number of page watchers
  • Number of incoming (internal) links
  • Pageviews

We are interested in these metrics for all pages edited by test group users, segmented by funnel (so returnto pages compared to gettingstarted-copyedit pages).



We found that across the board, users editing their "returnto" page were overwhelmingly making substantive edits to the content of articles, either adding new content, refactoring it, or otherwise changing its facts and meaning. These edits made up 47% of our 250-edit sample, nearly half. The two next most common edit types were fixing the markup or formatting of a page without changing the meaning (18%), and vandalism or spam (also 18%). The high level of formatting and markup edits is expected, as all users used the wikitext editor rather than VisualEditor, which was opt-in in Preferences and thus mostly hidden from view. The rate of vandalism was higher than in previous samples we coded, though this should be taken with a grain of salt, since it is unknown how much seasonality affects vandalism rates (for instance, they may go up once school is back in session after the summer).



TL;DR: ~67% of first edits are to the main namespace regardless of condition. Of those main namespace edits, test and control users generally edit articles of similar length. However, test users tended to edit articles with significantly more watchers than control users.

To explore how OB6 affects the types of pages that newcomers make their first edits to, we gathered stats on the pages where OB6 users made their first edits.

Namespace first edit proportion. The proportion of editors' first edits is plotted by namespace and split by experimental condition.

While only 67% of newcomers' first edits were made to articles (namespace zero), only articles make sense when asking questions with regard to page length, watchers and views. So the following analyses examined only those first edits which changed NS0 pages.

Page length (at time of edit)

The length of an article is one of the best predictors of the article’s quality level. Previous work has shown that newcomers tend to be reverted when they edit longer pages[1].

How does the length of pages first edited by newcomers differ by condition and return to status?

Page length density. Probability density functions are presented for the page length of newcomers' first edits based on their view of MediaWiki before registering.
Page length mean. Geometric mean page length of newcomers' first edits based on their view of MediaWiki before registering.

There could be some minor variations related to OB6 vs. control, but they are not large enough for us to be sure that they are real.
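For readers unfamiliar with the geometric mean used in the figures above, here is a minimal sketch of how it could be computed per experimental condition. The function names, the `offset` parameter, and the `(condition, value)` row layout are assumptions made for illustration; the source does not say how zero values were handled.

```python
import math
from collections import defaultdict


def geometric_mean(values, offset=0):
    """Geometric mean computed as exp(mean(log(x + offset))) - offset.

    offset=0 gives the plain geometric mean and requires positive values.
    offset=1 is one common way to allow zeros; this is an assumption here.
    """
    logs = [math.log(v + offset) for v in values]
    return math.exp(sum(logs) / len(logs)) - offset


def geometric_mean_by_group(rows, offset=0):
    """rows: iterable of (condition, value) pairs, e.g. ("test", 5321)."""
    groups = defaultdict(list)
    for condition, value in rows:
        groups[condition].append(value)
    return {cond: geometric_mean(vals, offset) for cond, vals in groups.items()}
```

The geometric mean suits skewed measures such as page length, where a handful of very long pages would otherwise dominate an arithmetic mean.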

# of page watchers (2013-10-31)

Another way to look at the activities of newcomers is by the number of users who have the pages they edit on their watchlists.

Here, we do see some significant differences. Newcomers with OB6 tend to edit articles with more watchers in their first edit (t = -3.998, p < 0.001). This difference is relatively consistent across all return_to states.
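A comparison of this kind could be run roughly as follows. This is a sketch under stated assumptions, not the test that produced the figures above: the source reports only a t statistic and a p-value, so the choice of Welch's unequal-variance t-test, the log(x + 1) transform, and the names `welch_t` and `log_watchers` are all hypothetical.

```python
import math
from statistics import mean, variance


def log_watchers(counts):
    """Log-transform watcher counts; the +1 keeps pages with zero watchers."""
    return [math.log(c + 1) for c in counts]


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    t is negative when sample_a has the smaller mean, so the sign depends
    only on the order in which the two groups are passed in.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances
    se_sq = var_a / n_a + var_b / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_sq)
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df
```

Calling `welch_t(log_watchers(control), log_watchers(test))` gives a negative t when test pages have more watchers. A p-value additionally needs the t distribution; `scipy.stats.ttest_ind(a, b, equal_var=False)` returns both in one call.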

How does the # of watchers of pages first edited by newcomers differ by condition and return to status?

Watchers density. Probability density functions are presented for # of watchers for pages that newcomers first edit based on their view of MediaWiki before registering.
Watchers mean. Geometric mean # of watchers for pages that newcomers first edit based on their view of MediaWiki before registering.