Research:Wikimania unconference experiment

This is a proposed project of the growth team at the Wikimedia Foundation, in conjunction with attendees at the Unconference of Wikimania 2012.

Background

Justification

We believe that clean-up templates were intended to entice readers into becoming editors. However, the reverse may actually be true: a clean-up template may deter readers from editing the article it tags.

We assume the following, which may be relevant to the context of this experiment:

  • Templates are not useful for user engagement.
  • Templates also serve to advertise article credibility.
  • Wikipedians are anti-authoritarian.
  • Templates make it seem like there is an administrative team behind Wikipedia content.
  • Cleanup templates serve multiple purposes.
  • Templates are designed to send a message to readers.
  • Templates are applied with considered thought; they are not automatic or arbitrary.
  • Design is a problem across templates.
  • There are about 400 templates in total; 20-30 primary templates are used in the main article namespace. Most article templates are yellow or orange, and color is used to indicate the severity of the cleanup problem.[1]
  • Template wording/content is also a problem.
  • Some types of articles encourage readers to edit while others do not.


Impact: We anticipate that removing or modifying clean-up templates could improve engagement with an article. However, would we lose credibility with readers by suppressing these informational templates?

Community: This experiment has support from within the community. See the list of co-contributors below.

Workload: The first iteration of this experiment should mostly involve 1) the selection of a set of eligible articles on which to test, 2) development work on clicktracking for account creation (if we decide to use that metric), and 3) the implementation of template suppression for the user test group. The clicktracking effort could be re-used for account UX experiments, among others.


Co-contributors

Research questions

RQ1: Does the inclusion of a cleanup template decrease the number of edits to an article?

RQ2: Does modifying the language of a cleanup template decrease the number of edits to an article?
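One way the comparison behind RQ1 and RQ2 might be made is sketched below. It assumes, purely for illustration, that edits per page view in each bucket can be treated as a proportion and compared with a pooled two-proportion z-test; the proposal itself does not specify an analysis method, and the function name and arguments are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z(edits_a: int, views_a: int,
                     edits_b: int, views_b: int) -> Tuple[float, float]:
    """Compare edits-per-view in group A against group B.

    Returns (z, two_sided_p). Treating each page view as one trial is an
    assumption made for this sketch, not something the proposal states.
    """
    if views_a <= 0 or views_b <= 0:
        raise ValueError("both groups need at least one page view")
    rate_a = edits_a / views_a
    rate_b = edits_b / views_b
    # Pooled rate under the null hypothesis that both groups behave alike.
    pooled = (edits_a + edits_b) / (views_a + views_b)
    variance = pooled * (1 - pooled) * (1 / views_a + 1 / views_b)
    if variance == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (rate_a - rate_b) / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

Here group A could be the control bucket (templates shown) and group B the test bucket (templates suppressed or reworded); a negative z would then indicate fewer edits per view when the template is shown.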

Overall metrics

The metrics in this experiment will be measured on a per-article basis.

page views
The page view data for the article set over the time period (squids emery/locke)
edit attempt count
The number of attempted edits on the article set over the time period (squids emery/locke)
RF: I need to identify the url pattern for this event
edit complete count
The number of completed edits on the article set over the time period (rev table)
edit success rate
The ratio of completed edits to edit attempts (derived from edit attempt count and edit complete count)
created accounts
The number of accounts created by bucketed users over the time period
RF: Can we do this? We would need to use clicktracking to monitor bucketed IPs creating accounts.
size of edit (bytes)
The size in bytes of completed edits on the article set over the time period (rev table)
login clicks
The number of login button clicks from non-registered users on the article set over the time period (squids emery/locke)
RF: I need to identify the url pattern for this event
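As a minimal sketch of how the derived metric could be computed per article, assuming the attempt and completion counts have already been extracted from the logs and the rev table, something like the following might work. The class and field names are hypothetical and do not correspond to any existing schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArticleMetrics:
    """Counts for one article over the observation period (hypothetical fields)."""
    title: str
    page_views: int
    edit_attempts: int
    edits_completed: int

    def edit_success_rate(self) -> Optional[float]:
        """Completed edits divided by attempted edits.

        Returns None when no edits were attempted, so that articles
        without attempts are not reported as a 0% success rate.
        """
        if self.edit_attempts == 0:
            return None
        return self.edits_completed / self.edit_attempts
```

Returning None for articles with no attempts keeps them distinguishable from articles where every attempt was abandoned.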

Technical & feature requirements

  • Access to page view data for article set via squid logs on emery/locke
  • Click tracking extension to determine accounts created from bucketed users
  • Suppression of templates for users bucketed into the test group. JS implementation(?)

Experiment #1

Methods

  1. Bucket users into a test group where clean-up templates are suppressed on a subset of article pages.
  2. Experiment conditions will only be displayed to logged out users.
  3. Look at the transclusions and go from most useful to least useful templates.
  4. TBD whether experiment should include all templates or a set of templates.
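The bucketing in steps 1 and 2 could be sketched as follows. This assumes a stable per-visitor token is available (for example an IP address or a cookie value) and uses a hash so the same visitor always lands in the same group; the function names, the experiment label, and the 50/50 split are all illustrative assumptions rather than decisions made in this proposal.

```python
import hashlib


def assign_bucket(user_token: str,
                  experiment: str = "cleanup-template-suppression",
                  test_fraction: float = 0.5) -> str:
    """Deterministically map a visitor token to 'test' or 'control'."""
    digest = hashlib.sha256(f"{experiment}:{user_token}".encode("utf-8")).digest()
    # Scale the first 8 bytes of the hash to a number in [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2 ** 64
    return "test" if value < test_fraction else "control"


def should_suppress_templates(user_token: str,
                              is_logged_in: bool,
                              article_in_subset: bool,
                              test_fraction: float = 0.5) -> bool:
    """Suppress only for logged-out visitors in the test group on selected articles."""
    if is_logged_in or not article_in_subset:
        return False
    return assign_bucket(user_token, test_fraction=test_fraction) == "test"
```

Including the experiment label in the hash input means a later experiment would bucket the same visitors independently of this one.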

Notes

  • Is there a good reason for not also considering logged-in users?
  • We should try to ensure that different clean-up templates are selected so that they may be tested separately if so desired. This necessitates having a large enough sample of articles for each template type.
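A quick check of whether each template type has enough articles in the candidate set might look like the sketch below. It assumes a mapping from article title to the clean-up templates transcluded on it is already available; the function name and the minimum threshold are hypothetical, and the template names in the usage example are only illustrative.

```python
from collections import Counter
from typing import Dict, List


def templates_below_minimum(article_templates: Dict[str, List[str]],
                            min_articles: int) -> Dict[str, int]:
    """Return the template types that appear on fewer than min_articles articles."""
    counts: Counter = Counter()
    for templates in article_templates.values():
        # Count each template once per article, even if transcluded twice.
        for name in set(templates):
            counts[name] += 1
    return {name: n for name, n in counts.items() if n < min_articles}


sample = {
    "Article A": ["Cleanup", "Unreferenced"],
    "Article B": ["Cleanup"],
    "Article C": ["Cleanup", "Cleanup"],
}
under_sampled = templates_below_minimum(sample, min_articles=2)
```

Any template reported here would need more articles added to the set before it could be tested separately.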

Experiment #2

  • If the templates do not decrease engagement, test changing the template language.
  • If the templates do not significantly decrease engagement, change the template design.
  • If the templates do decrease engagement, experiment with different calls to action that make the same request to clean up articles: microtasks, task suggestions (e.g. "fix this typo"), upvoting of tasks that would improve the article, etc.

Methods

  1. Suppress clean-up templates on a subset of article pages.
  2. Experiment conditions will only be displayed to logged out users.

References