## Friday, June 30, 2017

My goal for today is to determine how long the trial will need to run in order to get statistically significant results.

A trial should run for a whole number of weeks because activity follows the work-week/weekend cycle. We'll need to look at all newcomers when measuring outcomes because we don't know who *would have* created articles during the trial period.

Assumptions:

- We'll want to look at measures of productivity and retention
- The most recent couple of weeks will represent the trial period well

My plan is to work from https://analytics.wikimedia.org/dashboards/standard-metrics/#projects=enwiki/metrics=(Beta)%20Monthly%20New%20Editors backwards using standard metrics to get some baseline rates. Then using these baseline rates, I'll be doing a power analysis based on substantial changes in rates.

Oh wait. It looks like that won't work because the data ends in 2016.

OK time to run some queries. After a bunch of digging, I got some stats that the analytics team have prepared.

    hive (wmf)> SELECT metric, value FROM mediawiki_metrics WHERE wiki_db = "enwiki" AND dt = "2017-04-01" AND metric LIKE "monthly%";
    OK
    metric                          value
    monthly_new_editors             47620
    monthly_new_registered_users    149449
    monthly_surviving_new_editors   1626

April has 30 days, so per week we have roughly 149449/(30/7) = 34871 new_registered_users, 47620/(30/7) = 11111 new_editors, and 1626/(30/7) = 379 surviving_new_editors.

OK. So that gives me a useful baseline. It looks like roughly 11111/34871 = 31.9% of registered users will make an edit and 379/11111 = 3.4% of editors will stick around.
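As a quick check on the arithmetic, here is a minimal Python sketch that scales the monthly counts from the Hive output to weekly counts and derives the two baseline rates. The names `MONTHLY`, `per_week`, `edit_rate`, and `survival_rate` are made up for illustration. Each rate divides two counts from the same period, so the weekly scaling cancels out.

```python
# Monthly counts for enwiki, April 2017, copied from the Hive output above.
MONTHLY = {
    "new_registered_users": 149449,
    "new_editors": 47620,
    "surviving_new_editors": 1626,
}
DAYS_IN_APRIL = 30


def per_week(monthly_count, days_in_month=DAYS_IN_APRIL):
    """Scale a monthly count to an average 7-day count."""
    return monthly_count * 7 / days_in_month


weekly = {name: per_week(count) for name, count in MONTHLY.items()}

# Numerator and denominator cover the same month, so each ratio is the
# same whether it is computed from monthly or weekly counts.
edit_rate = MONTHLY["new_editors"] / MONTHLY["new_registered_users"]
survival_rate = MONTHLY["surviving_new_editors"] / MONTHLY["new_editors"]

for name, count in weekly.items():
    print(f"{name}: {count:.0f} per week")
print(f"edit rate: {edit_rate:.1%}")
print(f"survival rate: {survival_rate:.1%}")
```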

Even before I start with the power analysis, I'm pretty sure we're going to see significant differences at 1% change for this number of observations. Next I'll be plotting these values on a fancy chi^2 graph. --Halfak (WMF) (talk) 18:54, 30 June 2017 (UTC)

OK! Time for power analysis plots.

**Survival rate power analysis.**

*P values are plotted for a power analysis of the baseline survival rate (surviving new editors/new editors) in English Wikipedia for three change thresholds (1%, 2%, 3%). Vertical lines represent the number of new editors per week from April 2017.*
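For anyone who wants to reproduce numbers like these, here is a rough Python sketch of one way to get such p-values. It is not necessarily how the plots were generated. It assumes two equal-sized groups (baseline and trial), treats a "1% change" as one absolute percentage point, and feeds the expected counts straight into a Pearson chi^2 test on a 2x2 table with no continuity correction. The function name `chi2_p_value` and its parameters are hypothetical.

```python
import math


def chi2_p_value(n_per_group, baseline_rate, change):
    """P-value of a 2x2 chi-squared test where each group's counts
    equal their expected values under the given rates (assumption)."""
    a_yes = n_per_group * baseline_rate
    b_yes = n_per_group * (baseline_rate + change)
    a_no = n_per_group - a_yes
    b_no = n_per_group - b_yes
    total = 2 * n_per_group
    yes, no = a_yes + b_yes, a_no + b_no
    # Pearson chi-squared statistic for a 2x2 table:
    # N * (ad - bc)^2 / (row1 * row2 * col1 * col2)
    chi2 = total * (a_yes * b_no - a_no * b_yes) ** 2 / (
        n_per_group * n_per_group * yes * no
    )
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


# Baseline survival rate and weekly new-editor count from the April 2017 stats.
survival_rate = 1626 / 47620
weekly_new_editors = round(47620 * 7 / 30)

for change in (0.01, 0.02, 0.03):
    p = chi2_p_value(weekly_new_editors, survival_rate, change)
    print(f"change={change:.0%} n={weekly_new_editors} p={p:.2e}")
```

Under these assumptions, a one-point change at one week's volume of new editors lands below the conventional 0.05 threshold, and larger changes give smaller p-values still.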

These plots show that we should expect enough observations to see significance if the survival rate or edit rate increases or decreases by 1% during the trial period, provided the trial lasts for a week. If we want to run a *controlled* experiment, we'll need two weeks' worth of observations. If we want to be *absolutely sure*, we could run the trial for two weeks and expect to get vanishingly small p-values. --Halfak (WMF) (talk) 21:07, 30 June 2017 (UTC)