Research talk:Anonymous editor acquisition/Signup CTA experiment/Work log/2014-05-30

Friday, May 30th

So, I'm having some weird issues with MySQL now. I have two tables with identical keys on (wiki, token), one an order of magnitude taller than the other, but I can't seem to use those keys. MySQL either refuses to use the index at all (if I'm using InnoDB tables) or will only use the (wiki) part of the index (if I'm using MyISAM). I kicked off a query last night to gather token info together despite this indexing issue, but it ran all night without any success, and I've been troubleshooting all morning. This results in substantial pain.


OK. It looks like I've got it. The key columns were VARCHAR in both tables, but with different lengths (32 vs 191). Once I modified them to be the same length, the index was finally picked up again, and I'm now loading the data. Now, back to my query from yesterday morning.

SELECT
    wiki,
    bucket,
    IF(first_user_id IS NULL,
        "pure anon",
        IF(first_user_registration IS NULL OR first_user_registration <= 20140502000000,
        "old user",
        IF(first_user_registration <= 20140519180800,
        "tracked user",
        "experiment user"))) AS editor_class,
    COUNT(*) AS tokens
FROM staging.token_info
WHERE link_clicks > 0
GROUP BY 1,2,3;
+--------+-----------+-----------------+--------+
| wiki   | bucket    | editor_class    | tokens |
+--------+-----------+-----------------+--------+
| dewiki | control   | experiment user |    512 |
| dewiki | control   | old user        |    575 |
| dewiki | control   | pure anon       |  45844 |
| dewiki | control   | tracked user    |     41 |
| dewiki | post-edit | experiment user |    504 |
| dewiki | post-edit | old user        |    513 |
| dewiki | post-edit | pure anon       |  41525 |
| dewiki | post-edit | tracked user    |     41 |
| dewiki | pre-edit  | experiment user |    789 |
| dewiki | pre-edit  | old user        |    617 |
| dewiki | pre-edit  | pure anon       |  42014 |
| dewiki | pre-edit  | tracked user    |     40 |
| enwiki | control   | experiment user |   7542 |
| enwiki | control   | old user        |   3065 |
| enwiki | control   | pure anon       | 227378 |
| enwiki | control   | tracked user    |    368 |
| enwiki | post-edit | experiment user |   7198 |
| enwiki | post-edit | old user        |   2842 |
| enwiki | post-edit | pure anon       | 206706 |
| enwiki | post-edit | tracked user    |    317 |
| enwiki | pre-edit  | experiment user |  10758 |
| enwiki | pre-edit  | old user        |   3810 |
| enwiki | pre-edit  | pure anon       | 211617 |
| enwiki | pre-edit  | tracked user    |    362 |
| frwiki | control   | experiment user |    752 |
| frwiki | control   | old user        |    340 |
| frwiki | control   | pure anon       |  33794 |
| frwiki | control   | tracked user    |     42 |
| frwiki | post-edit | experiment user |    820 |
| frwiki | post-edit | old user        |    360 |
| frwiki | post-edit | pure anon       |  30435 |
| frwiki | post-edit | tracked user    |     32 |
| frwiki | pre-edit  | experiment user |   1212 |
| frwiki | pre-edit  | old user        |    509 |
| frwiki | pre-edit  | pure anon       |  31372 |
| frwiki | pre-edit  | tracked user    |     35 |
| itwiki | control   | experiment user |    286 |
| itwiki | control   | old user        |    211 |
| itwiki | control   | pure anon       |  20854 |
| itwiki | control   | tracked user    |     18 |
| itwiki | post-edit | experiment user |    345 |
| itwiki | post-edit | old user        |    202 |
| itwiki | post-edit | pure anon       |  18781 |
| itwiki | post-edit | tracked user    |     15 |
| itwiki | pre-edit  | experiment user |    571 |
| itwiki | pre-edit  | old user        |    321 |
| itwiki | pre-edit  | pure anon       |  20410 |
| itwiki | pre-edit  | tracked user    |     30 |
+--------+-----------+-----------------+--------+
48 rows in set, 10518 warnings (3.27 sec)
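
The nested IF() expressions in the query assign each token to exactly one editor class. As a minimal sketch of the same decision logic outside SQL, assuming registration timestamps are available as 14-digit integers (or None when missing), the function below mirrors the query. The function and constant names are illustrative and not part of the original analysis; only the cutoff values and class labels come from the query.

```python
from typing import Optional

# Cutoff values copied verbatim from the query; the constant names are illustrative.
OLD_USER_CUTOFF = 20140502000000
TRACKED_USER_CUTOFF = 20140519180800


def editor_class(first_user_id: Optional[int],
                 first_user_registration: Optional[int]) -> str:
    """Mirror the nested IF() from the query, checking conditions in the same order."""
    if first_user_id is None:
        return "pure anon"
    if first_user_registration is None or first_user_registration <= OLD_USER_CUTOFF:
        return "old user"
    if first_user_registration <= TRACKED_USER_CUTOFF:
        return "tracked user"
    return "experiment user"


# Hypothetical inputs, one per class.
for user_id, registration in [(None, None), (42, None),
                              (42, 20140510000000), (42, 20140525000000)]:
    print(user_id, registration, editor_class(user_id, registration))
```

Because the checks run in order, a token with no associated user is always "pure anon", regardless of any registration value.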

Yay! OK. Let's break this down for itwiki again.

class            control  post-edit  pre-edit
old user             211        202       321
tracked user          18         15        30
pure anon          20854      18781     20410
experiment user      286        345       571
  • Control: 286/(286+20854) = 0.0135
  • Post-edit: 345/(345+18781) = 0.0180
  • Pre-edit: 571/(571+20410) = 0.0272

How about enwiki?

class            control  post-edit  pre-edit
old user            3065       2842      3810
tracked user         368        317       362
pure anon         227378     206706    211617
experiment user     7542       7198     10758
  • Control: 7542/(7542+227378) = 0.0321
  • Post-edit: 7198/(7198+206706) = 0.0337
  • Pre-edit: 10758/(10758+211617) = 0.0484
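
The arithmetic in the bullets above divides the number of tokens that registered during the experiment by that number plus the tokens that stayed anonymous; "old user" and "tracked user" tokens are left out of the denominator. A minimal sketch of that calculation, using the itwiki and enwiki counts from the query output (the function name `registration_rate` is illustrative):

```python
def registration_rate(experiment_users: int, pure_anons: int) -> float:
    """Share of tokens that registered, among registered plus pure-anon tokens."""
    return experiment_users / (experiment_users + pure_anons)


# (experiment user, pure anon) token counts taken from the query output above.
counts = {
    ("itwiki", "control"): (286, 20854),
    ("itwiki", "post-edit"): (345, 18781),
    ("itwiki", "pre-edit"): (571, 20410),
    ("enwiki", "control"): (7542, 227378),
    ("enwiki", "post-edit"): (7198, 206706),
    ("enwiki", "pre-edit"): (10758, 211617),
}

for (wiki, bucket), (registered, anons) in counts.items():
    print(f"{wiki} {bucket}: {registration_rate(registered, anons):.4f}")
```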

--Halfak (WMF) (talk) 15:59, 30 May 2014 (UTC)


Time for a plot.

 
Registration rate. Registration rates are plotted by experimental bucket during the signup CTA experiment.


OK. So it's time to unpack this. It looks like different wikis have different baselines. In English Wikipedia, about 3.2% of anons in the control condition who click either the "edit" or "create account" links go on to register an account. German Wikipedia has the lowest proportion, at a little over 1.1%. Italian and French Wikipedias land in the middle with 1.4% and 2.2% respectively.

Across wikis, the pre- and post-edit conditions produced roughly equivalent changes in registration rates relative to each wiki's baseline. In all cases, the experimental conditions increased the conversion rate, but the pre-edit CTA was substantially more effective in this regard than the post-edit CTA.

wiki    pre-edit delta  post-edit delta  pre-edit proportional delta  post-edit proportional delta
dewiki          0.0074           0.0009                       0.6689                        0.0857
enwiki          0.0163           0.0015                       0.5069                        0.0482
frwiki          0.0154           0.0045                       0.7087                        0.2052
itwiki          0.0137           0.0045                       1.0116                        0.3333

The table above summarizes the differences between the experimental conditions and the control.

delta
$\text{reg\_rate}(\text{exp}) - \text{reg\_rate}(\text{control})$
proportional delta
$\frac{\text{reg\_rate}(\text{exp}) - \text{reg\_rate}(\text{control})}{\text{reg\_rate}(\text{control})}$
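
As a check on the two definitions, the sketch below recomputes the summary table from the token counts in the query output. It takes the proportional delta to be the delta divided by the control rate, which is what the table values imply. The names `reg_rate` and `deltas` are illustrative, not from the original analysis.

```python
# (experiment user, pure anon) token counts per wiki and bucket, from the query output.
counts = {
    "dewiki": {"control": (512, 45844), "pre-edit": (789, 42014), "post-edit": (504, 41525)},
    "enwiki": {"control": (7542, 227378), "pre-edit": (10758, 211617), "post-edit": (7198, 206706)},
    "frwiki": {"control": (752, 33794), "pre-edit": (1212, 31372), "post-edit": (820, 30435)},
    "itwiki": {"control": (286, 20854), "pre-edit": (571, 20410), "post-edit": (345, 18781)},
}


def reg_rate(experiment_users: int, pure_anons: int) -> float:
    """Registration rate among registered plus pure-anon tokens."""
    return experiment_users / (experiment_users + pure_anons)


def deltas(wiki_counts: dict, condition: str) -> tuple:
    """Return (delta, proportional delta) of a condition against the control."""
    control = reg_rate(*wiki_counts["control"])
    exp = reg_rate(*wiki_counts[condition])
    delta = exp - control
    return delta, delta / control


for wiki, wiki_counts in counts.items():
    pre_delta, pre_prop = deltas(wiki_counts, "pre-edit")
    post_delta, post_prop = deltas(wiki_counts, "post-edit")
    print(f"{wiki} {pre_delta:.4f} {post_delta:.4f} {pre_prop:.4f} {post_prop:.4f}")
```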

--Halfak (WMF) (talk) 20:17, 30 May 2014 (UTC)


Filtering users who didn't click edit during the experiment

 
Registration rate. Registration rates are plotted by experimental bucket during the signup CTA experiment.


wiki    Baseline % (est. n)  Pre-edit % (delta)  Post-edit % (delta)  Pre-edit factor % (est. n)  Post-edit factor % (est. n)
dewiki  0.524 (620)          1.339 (+0.815)      0.61 (+0.086)        255.5 (1584)                116.5 (723)
enwiki  1.184 (6874)         3.165 (+1.980)      1.402 (+0.218)       267.2 (18367)               118.4 (8138)
frwiki  0.75 (665)           2.462 (+1.712)      1.163 (+0.412)       328.1 (2182)                155 (1031)
itwiki  0.611 (347)          2.028 (+1.417)      1.016 (+0.405)       332.1 (1151)                166.3 (576)