Research:Onboarding new Wikipedians/Qualitative analysis

We want to know what kinds of edits users were making as part of the GettingStarted flow, both for those users who clicked on one of the suggested articles to improve, and for those who did something else after signing up (returned to the article they were on and edited it, created a new article, etc.).

Our hypothesis is that the rate of non-constructive contribution (test edits or vandalism) would remain about the same as it was before the new interface. We are also interested in seeing whether users who chose tasks were actually copyediting, or whether they were making contributions that suggested more advanced knowledge of the topic (adding or changing the content of the article, not simply editing its form), thus demonstrating that they were looking for interest-dependent topics in the onboarding process rather than interest-independent ones.

Round one


Our first round of qualitative analysis focused on users exposed to the version of Special:GettingStarted which suggested that users "fix spelling and grammar" (i.e. copyedit) in articles from a given list of six.



From the list of all revisions generated by new users who went through the account creation process during December 21-26, 2012, we pulled a random sample of 100 new users who edited one of the recommended GettingStarted articles, and a random sample of 100 new users who edited something else. We then looked at the first edit of each user and coded it for type. Two coders went through the list independently to ensure reliability. The charts below represent the average of their scores for each category.
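The sampling and averaging steps described above could be sketched roughly as follows. This is a minimal illustration, not the script used for the study: the record keys (`user`, `timestamp`, `rev_id`), the function names, and the fixed random seed are all assumptions introduced here.

```python
import random
from collections import Counter


def sample_first_edits(revisions, sample_size, seed=0):
    """Pick up to `sample_size` users at random and return each one's first edit.

    `revisions` is a list of dicts with hypothetical keys
    'user', 'timestamp', and 'rev_id'.
    """
    # Keep only the earliest revision seen for each user.
    first_by_user = {}
    for rev in revisions:
        current = first_by_user.get(rev["user"])
        if current is None or rev["timestamp"] < current["timestamp"]:
            first_by_user[rev["user"]] = rev

    # Sample users (not revisions), so each user contributes one edit.
    users = sorted(first_by_user)
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    chosen = rng.sample(users, min(sample_size, len(users)))
    return [first_by_user[user] for user in chosen]


def average_coder_counts(codes_a, codes_b):
    """Average the per-category counts assigned by two independent coders."""
    count_a, count_b = Counter(codes_a), Counter(codes_b)
    categories = sorted(set(count_a) | set(count_b))
    return {c: (count_a[c] + count_b[c]) / 2 for c in categories}
```

Sampling users rather than revisions keeps a prolific new editor from being over-represented, which matches the choice to code only each user's first edit.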


Users who accepted a GettingStarted task


Users who did something else


In July 2012, when looking at the rate of vandalism and test editing from new registered users, we observed about a 10% vandalism rate and a 7% test edit rate in a random sample of 100 new users' first edits (both live and deleted). It appears that serving the GettingStarted page after account creation did not significantly affect these rates, either for users who edited one of the suggested articles or for users who did not.

Of the users who were editing suggested articles, over half were indeed making copyedits (fixing spelling, grammar, punctuation), and only about 15% were making edits that suggested topical interest and/or expertise (changing facts, adding new information or removing old information).

Round two


Our second round of qualitative analysis focuses on the users exposed to the second major revision of Special:GettingStarted, which asks users to choose one of three task types: fixing spelling and grammar (copyediting), adding wiki links, and clarifying pages (those tagged as vague, confusing or unclear).



We gathered 194 revisions tagged as "gettingstarted edits" in RecentChanges, from March 21-27. We did not filter or organize these revisions by the user who performed them, or by the cleanup issue (needs wiki links, copyediting, clarification) the page was tagged for, so that our assessment would not be biased by the user or the pre-selected task type. We did remove three revisions made by test accounts, which WMF staff had created to verify that the RecentChanges tagging worked, leaving 191 revisions.

Two coders (Steven Walling, Maryana Pinchuk) then independently assessed each revision using the following categories:

  • "spelling and grammar"
  • "adding wikilinks"
  • "clarification"
  • "vandalism"
  • "test edit"
  • other

The first three categories correspond to the types of tasks these articles were tagged as needing, and which users were asked to do. Test edits and vandalism are two slightly different types of edits that are both considered harmful and in need of reverting.
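As a rough illustration of how one coder's assessments could be turned into per-category shares, a sketch under stated assumptions follows. The category labels come from the list above; the function names, the `user` key, and the idea of passing in a set of test-account names are hypothetical and not taken from the original analysis.

```python
from collections import Counter

# Category labels as listed above.
CATEGORIES = [
    "spelling and grammar",
    "adding wikilinks",
    "clarification",
    "vandalism",
    "test edit",
    "other",
]


def drop_test_accounts(revisions, test_accounts):
    """Remove revisions made by known test accounts (hypothetical 'user' key)."""
    return [rev for rev in revisions if rev["user"] not in test_accounts]


def category_shares(codes):
    """Return each category's share of the coded revisions, in percent."""
    unknown = set(codes) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected category labels: {sorted(unknown)}")
    if not codes:
        raise ValueError("no coded revisions to summarize")
    counts = Counter(codes)
    total = len(codes)
    # Categories that were never assigned are reported as 0.0.
    return {c: 100 * counts[c] / total for c in CATEGORIES}
```

Running `category_shares` once per coder keeps the two assessments separate, so any difference between the coders stays visible instead of being averaged away.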


Raw data

Out of the 191 edits made via the latest version of Getting Started, we found that the majority (55%) were copyediting tasks. This corresponds roughly to the size of the cohort of users who accepted the copyediting task (cohort data). The next most common edit type was adding internal wiki links (without making other substantial changes), at 17-18%. All other task types were roughly equal in proportion after these two. Notably, the "clarification" type of task was even less common than the small number of vandalism or test edits. Although we presented two additional task types to new editors, those who successfully completed edits via Getting Started clearly preferred fixing the spelling and grammar of an article.

The number of revisions coded as each type