Research:Teahouse long term new editor retention

Tracked in Phabricator:
Task T113389
18:29, 28 September 2015 (UTC)
Duration:  2015-September – 2016-May
This page documents a completed research project.

The impact of the Wikipedia Teahouse on the user experience, editing activities, and short-term retention of new editors has been analyzed previously[1][2]. The findings of those studies suggest that participating in the Teahouse has a positive impact on new editor retention. The current study builds on these findings by comparing the editing activity and tenure of a large sample of Teahouse invitees and a control group over the course of more than 6 months. It also analyzes the circumstances under which a Teahouse invite has the greatest impact on new editor retention.



The Teahouse offers personalized support to new editors, providing an opportunity to learn the ropes of Wikipedia in a safe, friendly, and engaging environment. Outreach is proactive: new editors who are active and appear to be participating in good faith receive a personalized invitation to participate in the forum soon after they register, whether or not they have already encountered difficulties or even formulated a question. The Teahouse finds them; they don't have to find it.

When they arrive at the Teahouse, they find a space that is explicitly designed for new editors, with technical barriers to participation that are lower than elsewhere on Wikipedia, and where every question receives one or more personalized, prompt, and polite answers from experienced Wikipedians.

The Teahouse has been active since February 2012. As of October 2015, over 8,000 new editors had participated.

Limitations of previous research


Previous research on the impact of the Teahouse on new editor retention did not include a true control group: the analysis compared invitees who subsequently visited the Teahouse with a set of invited users who chose not to visit. Due in part to this limitation, the question of whether the Teahouse is an effective mechanism for retaining more good-faith newcomers has never been adequately answered.

In the current study, we attempt to address this gap in our understanding of the Teahouse by comparing the retention of Teahouse invitees (whether or not they subsequently participated in the Teahouse) with a group of editors who could have been invited to the Teahouse during the same time period, but were not.

We also expand on previous work by analyzing the circumstances under which being invited to the Teahouse was most effective. Specifically, we look at the timing of the invite, the invitee's editing activity up until that point, and the messages they had received from other Wikipedia editors at the time the invite was delivered.



We hypothesize (H1) that new editors who receive an invitation to participate in the Teahouse, a welcoming environment that provides valuable support for new editors, will be more likely to continue editing Wikipedia for a longer period of time than other editors who experience similar initial conditions but are not invited to participate--whether or not the invitee subsequently participates in the Teahouse.

We also hypothesize (H2, H3) that the impact of invitation (as measured by length of retention) will vary based on the new editor's experience on Wikipedia up to the point at which they were invited.

H1: People who receive a Teahouse invite are more likely to be retained than control editors who do not.

  • H1a: Invitees are more likely to have made 1 or more edits 3-4 weeks, 1-2 months, and 2-6 months after the invite date than control.
  • H1b: Invitees are more likely to have made 5 or more edits 3-4 weeks, 1-2 months, and 2-6 months after the invite date than control.

H2: There will be an inverse relationship between early, negative experiences and an editor's likelihood of being retained, which will be mitigated by a Teahouse invite.

  • H2a: Teahouse invites will have a greater impact on retention for new editors who have previously had one or more edits reverted.
  • H2b: Teahouse invites will have a greater impact on retention for new editors who have previously received one or more warning messages on their talkpage.

H3: The sooner after registration someone receives a Teahouse invitation, the more likely they are to be retained.





On 71 days between October 20, 2014 and January 12, 2015 we gathered a sample of 14,766 new Wikipedia editors who had registered their accounts within the past 48 hours, had made at least 5 edits, and had not been blocked or received a Teahouse invitation or serious warning on their talkpage.

TODO: Describe second sample collected in Nov 15 - Jan 16

In each daily sample, 50 editors were randomly selected as a control group and invitations were not sent to these users, although they were screened for blocks and talkpage messages. The final control sample contained a total of 3,092 users after screening. The remaining 11,674 users in the sample received an invitation to the Teahouse from HostBot on their talkpage.
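The daily split described above can be sketched as follows. This is a minimal illustration, not the code HostBot actually ran: the function name `split_daily_sample`, the `seed` parameter, and the representation of editors as a list of unique usernames are all assumptions made for the example.

```python
import random


def split_daily_sample(eligible_users, control_size=50, seed=None):
    """Split one day's eligible editors into a control group and invitees.

    eligible_users: list of unique usernames (hypothetical representation).
    control_size:   number of editors withheld from invitation (50 per day
                    in the study description).
    seed:           optional seed so the split can be reproduced (assumption;
                    the source does not say how randomness was handled).
    """
    rng = random.Random(seed)
    # Never ask for more control editors than are available that day.
    k = min(control_size, len(eligible_users))
    control_set = set(rng.sample(eligible_users, k))
    # Preserve the original ordering within each group.
    control = [u for u in eligible_users if u in control_set]
    invited = [u for u in eligible_users if u not in control_set]
    return control, invited
```

Running this once per sampling day and pooling the results would give the two groups; the post-sampling screening for blocks and talkpage messages would then be applied to both.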

Although we did not explicitly exclude editors who had previously created an account on another Wikimedia wiki and were editing English Wikipedia for the first time, a post-hoc check against the logging table shows that these users accounted for only 5-6% of both the experimental and control groups.

To help ensure that we were not inviting users who had been implicated in serious vandalism or disruptive behavior, or who had previously been invited to the Teahouse by other Wikipedia editors, we excluded users from both the experimental and control samples if their talkpage contained any of the following strings: 'uw-vandalism4', 'final warning', '{{sock|', 'uw-unsourced4', 'uw-socksuspect', 'Socksuspectnotice', 'only warning', 'without further warning', 'Uw-socksuspect', 'sockpuppetry', 'Teahouse', 'uw-cluebotwarning4', 'uw-vblock', 'uw-speedy4'
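A talkpage screen of this kind might look like the sketch below. The string list is copied from the paragraph above; the function name `is_excluded` and the choice of case-sensitive substring matching are assumptions (the presence of both 'uw-socksuspect' and 'Uw-socksuspect' in the list suggests case-sensitive matching, but the source does not say so).

```python
# Exclusion strings as listed in the study description.
EXCLUSION_STRINGS = (
    'uw-vandalism4', 'final warning', '{{sock|', 'uw-unsourced4',
    'uw-socksuspect', 'Socksuspectnotice', 'only warning',
    'without further warning', 'Uw-socksuspect', 'sockpuppetry',
    'Teahouse', 'uw-cluebotwarning4', 'uw-vblock', 'uw-speedy4',
)


def is_excluded(talkpage_text):
    """Return True if the talkpage wikitext contains any exclusion string.

    Matching is a plain, case-sensitive substring test (an assumption).
    """
    return any(s in talkpage_text for s in EXCLUSION_STRINGS)
```

For example, `is_excluded("Welcome to Wikipedia!")` is false, while any talkpage that already mentions the Teahouse is screened out.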


Work log


  • univariate: binary survival of invited vs. control editors after 3-4 weeks, 1-2 months, 2-6 months
  • multivariate: predictors of survival (number/type of pre-invite edits, number/type of pre-invite talkpage messages, number of reverts, amount/type of participation in the Teahouse...)
    • needed to validate and contextualize the results of our preliminary analysis: if Teahouse invitees are retained longer at a higher rate, why is that the case?
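The binary survival measure in the univariate analysis can be illustrated with a short sketch. The window boundaries in days (21-28, 30-60, 60-180, treated as half-open intervals) and the names `WINDOWS` and `survived` are assumptions made for this example; the source gives the windows only as "3-4 weeks", "1-2 months", and "2-6 months".

```python
from datetime import datetime, timedelta

# Assumed window boundaries, in days after the reference date
# (start inclusive, end exclusive). A month is taken as 30 days.
WINDOWS = {
    "3-4 weeks": (21, 28),
    "1-2 months": (30, 60),
    "2-6 months": (60, 180),
}


def survived(edit_times, reference_time, window, min_edits=1):
    """Return True if the editor made at least min_edits edits in the window.

    edit_times:     iterable of datetime objects, one per edit.
    reference_time: datetime the windows are measured from (e.g. invite date).
    window:         a key of WINDOWS.
    min_edits:      survival threshold (1 or 5 in the hypotheses).
    """
    start_day, end_day = WINDOWS[window]
    start = reference_time + timedelta(days=start_day)
    end = reference_time + timedelta(days=end_day)
    count = sum(1 for t in edit_times if start <= t < end)
    return count >= min_edits
```

Applying `survived` to every editor in both groups, once per window and threshold, yields the binary outcomes that the between-group comparison is run on.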


  • October 2014 - January 2015: data collected
  • September - October 2015: data analyzed
  • November 2015 - January 2016: second round of data collected
  • March - April 2016: analyze full dataset
  • May 2016: publish findings



Preliminary findings: survival (Oct 14 - Jan 15 dataset)

October 21 research showcase slides
Slides with notes from a presentation at the WMF metrics and activities meeting on December 3, 2015.

The results of a preliminary comparative analysis of editor survival between groups show that Teahouse invitees were significantly more likely than control editors to have made 1 or more edits between 3-4 weeks after registration (χ^2 = 3.9, df = 1, p = 0.04788, two-tailed) and to have made 5 or more edits between 2-6 months after registration (χ^2 = 4.6468, df = 1, p = 0.03111, two-tailed). Editors in the experimental group were more than 10% more likely than those in the control group to have met the minimum survival threshold in each of these survival windows.

Tests of 1+ edits in the 1-2 month and 2-6 month windows, and of 5+ edits in the 1-2 month window, approached significance in the same direction: more editors in the experimental group than in the control group met the minimum survival criteria. Although these results were not significant at the p < 0.05 level, their directionality and p values suggest that the significant results presented above are not a statistical fluke.
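For readers who want to check the reported statistics, the sketch below shows how a chi-squared test on a 2x2 table (group by survived/not survived) can be computed with the Python standard library. The function names are hypothetical, and the statistic is computed without a continuity correction; the source does not state which software or correction was used, so this is not necessarily the exact procedure behind the figures above.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]].

    For example: a = invitees who survived, b = invitees who did not,
    c = control who survived, d = control who did not.
    No continuity correction is applied (an assumption).
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    With df = 1 the statistic is a squared standard normal variable, so
    P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


# p-value for the reported 5+ edits, 2-6 months statistic.
print(p_value_df1(4.6468))
```

Feeding the reported statistic χ^2 = 4.6468 into `p_value_df1` should give a value close to the reported p = 0.03111. The cell counts themselves are not given on this page, so `chi_square_2x2` is included only to show how the statistic would be derived from them.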

See also