Research talk:Teahouse long term new editor retention/Work log/2015-10-20
Tuesday, October 20, 2015
Today I'm extending my analysis to include multivariate regressions. I'll be using a logistic regression to look for differences in the proportion of surviving newcomers. I'll be including the pre-invite statistics I worked on last time to control for random variation around the effect of invitation. I expect the effect of invitation to become more prominent after controlling for this variation. I also expect to see some interactions between the invitation condition and either initial investment (# of edits pre-invite) or negative feedback. A positive interaction would suggest that newcomers who are highly invested and get negative feedback gain more "survivalness" from the invite.
Checking for bucketing bias
The following list of Wilcoxon and Chi^2 tests checks for significant differences in the pre-invite predictors between conditions. Scalars are summarized by their quantiles (0%, 25%, 50%, 75% and 100%), logicals by their proportion.
- edits control=5-6-7-11-215 invited=5-6-7-12-241 W=18158354.5 p=0.597
- main_edits control=0-4-6-9-212 invited=0-4-6-9-241 W=18327891 p=0.181
- talk_edits control=0-0-0-0-35 invited=0-0-0-0-37 W=17914856.5 p=0.188
- user_edits control=0-0-0-1-74 invited=0-0-0-1-232 W=17823485.5 p=0.163
- user_talk_edits control=0-0-0-0-24 invited=0-0-0-0-90 W=17958249 p=0.444
- wp_edits control=0-0-0-0-19 invited=0-0-0-0-79 W=18091557 p=0.581
- other_edits control=0-0-0-0-96 invited=0-0-0-0-125 W=18104004.5 p=0.552
- vandal_warning control=0.121 invited=0.123 X-squared=0.153 p=0.696
- spam_warning control=0.027 invited=0.026 X-squared=0.081 p=0.776
- copyright_warning control=0.003 invited=0.003 X-squared=0.304 p=0.582
- general_warning control=0.221 invited=0.214 X-squared=0.57 p=0.45
- block control=0.001 invited=0.002 X-squared=0.729 p=0.393
- welcome control=0 invited=0 X-squared=0 p=1
- csd control=0.035 invited=0.031 X-squared=1.234 p=0.267
- deletion control=0.051 invited=0.047 X-squared=0.696 p=0.404
- afc control=0 invited=0 X-squared=NaN p=NaN
- teahouse control=0 invited=0 X-squared=NaN p=NaN
No significant differences here.
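As a sanity check on the list above, the Chi^2 p-values can be recovered from the reported statistics alone. The sketch below assumes each test compares the two conditions on one logical predictor (a 2x2 table, so 1 degree of freedom). The function name is hypothetical; only the X-squared values come from the list.

```python
import math


def chi2_p_value_1df(x_squared):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    For 1 df the survival function has the closed form erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x_squared / 2.0))


# X-squared values copied from the list above (assumed to be 1-df tests).
reported = {
    "vandal_warning": 0.153,
    "csd": 1.234,
    "deletion": 0.696,
}

p_values = {name: chi2_p_value_1df(x) for name, x in reported.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

The recomputed values should agree with the reported p-values up to rounding, which supports reading each logical test as a 1-df comparison.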
Predicting 1+ edits
Now to build some logistic models that account for these pre-invite predictors.
- 3 to 4 weeks
    Coefficients:
                                    Estimate Std. Error z value Pr(>|z|)
    (Intercept)                     -3.40157    0.28915 -11.764  < 2e-16 ***
    grpinvited                      -0.45260    0.31556  -1.434 0.151491
    log(edits + 1)                   0.36207    0.12951   2.796 0.005181 **
    log(main_edits + 1)              0.09796    0.04883   2.006 0.044832 *
    log(talk_edits + 1)              0.19515    0.07423   2.629 0.008563 **
    log(user_edits + 1)              0.02817    0.04907   0.574 0.565918
    log(user_talk_edits + 1)         0.22418    0.06657   3.368 0.000758 ***
    log(wp_edits + 1)                0.13143    0.09053   1.452 0.146561
    general_warningTRUE             -0.52963    0.19096  -2.773 0.005547 **
    csdTRUE                         -1.52286    0.72156  -2.111 0.034813 *
    deletionTRUE                    -0.41958    0.37541  -1.118 0.263717
    grpinvited:log(edits + 1)        0.23479    0.12669   1.853 0.063855 .
    grpinvited:general_warningTRUE   0.04394    0.21149   0.208 0.835434
    grpinvited:csdTRUE               1.09061    0.75507   1.444 0.148633
    grpinvited:deletionTRUE         -0.21884    0.42406  -0.516 0.605816
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 8869.9 on 14765 degrees of freedom
    Residual deviance: 8564.5 on 14751 degrees of freedom
    AIC: 8594.5
    Number of Fisher Scoring iterations: 6
First, the obvious effects. We see the usual suspects here. The more edits you do -- overall, but especially talking -- the more likely you are to be retained. We also see some substantially negative effects of warning messages and CSD notifications.
We see a negative main effect of invitation here, but the grpinvited:log(edits + 1) interaction counteracts it for editors who had saved about 6 edits or more when the invite was posted (the two terms cancel near log(x + 1) = 1.93, x ≈ 6). For any editor who saved more than 6 edits (highly motivated), it looks like the invite might be improving retention in proportion to how much editing they are doing. But the interaction is not significant at the 0.05 level (marginal at p = 0.064).
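The crossover point described above can be worked out from the two coefficients. A minimal sketch, assuming the model's log() is the natural logarithm (R's default) and ignoring the uncertainty in both estimates; the function name is hypothetical.

```python
import math


def break_even_edits(main_effect, interaction):
    """Pre-invite edit count where main_effect + interaction * log(edits + 1) = 0."""
    log_edits_plus_one = -main_effect / interaction
    return math.exp(log_edits_plus_one) - 1.0


# Coefficients from the 3 to 4 week model (1+ edits).
grpinvited = -0.45260
grpinvited_x_log_edits = 0.23479

edits = break_even_edits(grpinvited, grpinvited_x_log_edits)
print(f"net invite effect turns positive above ~{edits:.1f} pre-invite edits")
```

Both coefficients carry wide standard errors and neither is significant at 0.05, so the threshold is a rough guide rather than a sharp cutoff.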
Counter to my suspicions, I don't think we're seeing solid evidence of an interaction between being invited to the teahouse and surviving despite negative feedback (csd & warning). This could be due to too few observations.
Just for the sake of making sure that my previous analysis wasn't totally off, let's try the model with just the invite as a predictor.
- 3 to 4 weeks (single predictor)
    Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
    (Intercept)    -2.44393    0.06633 -36.843   <2e-16 ***
    grpinvited      0.14830    0.07369   2.012   0.0442 *
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 8869.9 on 14765 degrees of freedom
    Residual deviance: 8865.7 on 14764 degrees of freedom
    AIC: 8869.7
    Number of Fisher Scoring iterations: 5
Sure enough. Getting the invite seems to look significant on its own. OK! Now to try the long-term retention outcomes.
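To put the single-predictor coefficients on a more readable scale, the log-odds can be converted to predicted survival proportions and an odds ratio. A small sketch using only the two estimates from the single-predictor fit; the helper name is hypothetical.

```python
import math


def inv_logit(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Estimates from the single-predictor model (3 to 4 weeks, 1+ edits).
intercept = -2.44393
grpinvited = 0.14830

p_control = inv_logit(intercept)
p_invited = inv_logit(intercept + grpinvited)
odds_ratio = math.exp(grpinvited)

print(f"control: {p_control:.3f}, invited: {p_invited:.3f}, odds ratio: {odds_ratio:.2f}")
```

Under this model, roughly 8.0% of control newcomers and 9.1% of invited newcomers make at least one edit in the 3 to 4 week window.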
- 1 to 2 months
    Coefficients:
                                    Estimate Std. Error z value Pr(>|z|)
    (Intercept)                     -3.38622    0.24595 -13.768  < 2e-16 ***
    grpinvited                       0.06769    0.27003   0.251 0.802051
    log(edits + 1)                   0.50057    0.11032   4.537  5.7e-06 ***
    log(main_edits + 1)              0.14552    0.04364   3.335 0.000854 ***
    log(talk_edits + 1)              0.17592    0.06710   2.622 0.008744 **
    log(user_edits + 1)              0.05333    0.04358   1.224 0.220999
    log(user_talk_edits + 1)         0.11024    0.06179   1.784 0.074370 .
    log(wp_edits + 1)                0.07868    0.08374   0.940 0.347399
    general_warningTRUE             -0.50829    0.15991  -3.179 0.001480 **
    csdTRUE                         -0.95765    0.46844  -2.044 0.040918 *
    deletionTRUE                    -0.76776    0.35457  -2.165 0.030360 *
    grpinvited:log(edits + 1)        0.01426    0.10815   0.132 0.895065
    grpinvited:general_warningTRUE  -0.10018    0.17865  -0.561 0.574961
    grpinvited:csdTRUE               0.57811    0.50508   1.145 0.252384
    grpinvited:deletionTRUE          0.23804    0.39010   0.610 0.541725
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 11229 on 14765 degrees of freedom
    Residual deviance: 10845 on 14751 degrees of freedom
    AIC: 10875
    Number of Fisher Scoring iterations: 5
- 2 to 6 months
    Coefficients:
                                    Estimate Std. Error z value Pr(>|z|)
    (Intercept)                     -3.19216    0.23718 -13.459  < 2e-16 ***
    grpinvited                       0.08069    0.26070   0.310  0.75694
    log(edits + 1)                   0.43819    0.10755   4.074 4.61e-05 ***
    log(main_edits + 1)              0.18349    0.04316   4.252 2.12e-05 ***
    log(talk_edits + 1)              0.21654    0.06491   3.336  0.00085 ***
    log(user_edits + 1)              0.03663    0.04303   0.851  0.39460
    log(user_talk_edits + 1)         0.05446    0.06177   0.882  0.37797
    log(wp_edits + 1)                0.11562    0.08142   1.420  0.15563
    general_warningTRUE             -0.53228    0.15243  -3.492  0.00048 ***
    csdTRUE                         -0.55369    0.37929  -1.460  0.14435
    deletionTRUE                    -0.69683    0.32396  -2.151  0.03148 *
    grpinvited:log(edits + 1)        0.01059    0.10513   0.101  0.91977
    grpinvited:general_warningTRUE  -0.10878    0.17073  -0.637  0.52403
    grpinvited:csdTRUE               0.04731    0.42470   0.111  0.91130
    grpinvited:deletionTRUE          0.14502    0.36032   0.402  0.68732
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 11889 on 14765 degrees of freedom
    Residual deviance: 11496 on 14751 degrees of freedom
    AIC: 11526
    Number of Fisher Scoring iterations: 5
Similar story here, but the effect of the invite isn't even marginally significant. Onto the 5+ measures.
Predicting 5+ edits
Same story as above, except survival only counts when there are 5+ edits in the survival period.
- 3 to 4 weeks
    Coefficients:
                                    Estimate Std. Error z value Pr(>|z|)
    (Intercept)                     -4.65842    0.39320 -11.847  < 2e-16 ***
    grpinvited                      -0.61702    0.42874  -1.439  0.15011
    log(edits + 1)                   0.44489    0.17306   2.571  0.01015 *
    log(main_edits + 1)              0.20667    0.06855   3.015  0.00257 **
    log(talk_edits + 1)              0.16150    0.10187   1.585  0.11289
    log(user_edits + 1)              0.17094    0.06630   2.578  0.00993 **
    log(user_talk_edits + 1)         0.21420    0.08913   2.403  0.01625 *
    log(wp_edits + 1)                0.19019    0.11490   1.655  0.09787 .
    general_warningTRUE             -0.78168    0.30218  -2.587  0.00969 **
    csdTRUE                         -1.42153    1.01750  -1.397  0.16239
    deletionTRUE                    -0.29195    0.52487  -0.556  0.57806
    grpinvited:log(edits + 1)        0.26699    0.16513   1.617  0.10591
    grpinvited:general_warningTRUE   0.22127    0.33150   0.667  0.50447
    grpinvited:csdTRUE               1.26795    1.05635   1.200  0.23001
    grpinvited:deletionTRUE         -0.48490    0.60548  -0.801  0.42322
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 5101.0 on 14765 degrees of freedom
    Residual deviance: 4814.7 on 14751 degrees of freedom
    AIC: 4844.7
    Number of Fisher Scoring iterations: 7
Again, we see a lack of significant independent effect for the invitation. Again, we also see a positive interaction with log(edits + 1), though here it falls just short of marginal significance (p = 0.106). It suggests that the invitation might be more effective for newcomers who save a lot of edits before getting the invitation.
Onto the long-term outcomes:
- 1 to 2 months
Regression with multicollinearity problem
    Coefficients:
                                    Estimate Std. Error z value Pr(>|z|)
    (Intercept)                     -4.07573    0.32035 -12.723  < 2e-16 ***
    grpinvited                      -0.40777    0.34891  -1.169  0.24253
    log(edits + 1)                   0.39092    0.14264   2.741  0.00613 **
    log(main_edits + 1)              0.24877    0.05703   4.362 1.29e-05 ***
    log(talk_edits + 1)              0.20831    0.08312   2.506  0.01221 *
    log(user_edits + 1)              0.16667    0.05522   3.018  0.00254 **
    log(user_talk_edits + 1)         0.18157    0.07557   2.403  0.01627 *
    log(wp_edits + 1)                0.15650    0.09935   1.575  0.11521
    general_warningTRUE             -0.55499    0.22216  -2.498  0.01249 *
    csdTRUE                        -12.75334  135.48352  -0.094  0.92500
    deletionTRUE                    -1.16388    0.59382  -1.960  0.05000 *
    grpinvited:log(edits + 1)        0.21278    0.13657   1.558  0.11923
    grpinvited:general_warningTRUE  -0.07856    0.24671  -0.318  0.75015
    grpinvited:csdTRUE              12.36522  135.48374   0.091  0.92728
    grpinvited:deletionTRUE          0.43928    0.63691   0.690  0.49038
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 7482.1 on 14765 degrees of freedom
    Residual deviance: 7085.4 on 14751 degrees of freedom
    AIC: 7115.4
    Number of Fisher Scoring iterations: 14
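Yikes! Here, it looks like we're seeing too much correlation between getting a 'csd' message and being invited: the two csd estimates are huge, nearly cancel each other, and come with standard errors above 135. Going to need to drop the predictor.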
    Coefficients:
                                    Estimate Std. Error z value Pr(>|z|)
    (Intercept)                     -4.07257    0.31851 -12.786  < 2e-16 ***
    grpinvited                      -0.41388    0.34713  -1.192  0.23315
    log(edits + 1)                   0.39039    0.14205   2.748  0.00599 **
    log(main_edits + 1)              0.24097    0.05671   4.249 2.14e-05 ***
    log(talk_edits + 1)              0.18318    0.08288   2.210  0.02710 *
    log(user_edits + 1)              0.15665    0.05482   2.858  0.00427 **
    log(user_talk_edits + 1)         0.17481    0.07540   2.318  0.02043 *
    log(wp_edits + 1)                0.15835    0.09922   1.596  0.11051
    general_warningTRUE             -0.62899    0.22163  -2.838  0.00454 **
    deletionTRUE                    -1.28957    0.59291  -2.175  0.02963 *
    grpinvited:log(edits + 1)        0.22144    0.13593   1.629  0.10329
    grpinvited:general_warningTRUE  -0.02041    0.24601  -0.083  0.93387
    grpinvited:deletionTRUE          0.52833    0.63556   0.831  0.40581
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 7482.1 on 14765 degrees of freedom
    Residual deviance: 7100.9 on 14753 degrees of freedom
    AIC: 7126.9
    Number of Fisher Scoring iterations: 6
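The trouble with the csd terms in the first 1 to 2 month fit (the one that still includes csd) shows up when the Wald statistics are recomputed by hand. A sketch, assuming the usual z = estimate / standard error with a two-sided normal p-value, which is what the z value and Pr(>|z|) columns appear to report; the function name is hypothetical.

```python
import math


def wald_test(estimate, std_error):
    """Return (z, two-sided p) for a coefficient under a standard normal reference."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Rows copied from the 1 to 2 month (5+ edits) fit that still includes csd.
csd_z, csd_p = wald_test(-12.75334, 135.48352)
warn_z, warn_p = wald_test(-0.55499, 0.22216)

# Net csd shift in log-odds for invited newcomers: main effect + interaction.
csd_net_invited = -12.75334 + 12.36522

print(f"csdTRUE: z = {csd_z:.3f}, p = {csd_p:.3f}")
print(f"general_warningTRUE: z = {warn_z:.3f}, p = {warn_p:.4f}")
print(f"csd shift for invited newcomers: {csd_net_invited:.3f}")
```

A standard error more than ten times the size of the estimate means the data say almost nothing about csd in this model. Estimates this large, together with 14 Fisher scoring iterations, are also what one might expect if almost no control newcomers with a csd notice survived in this window, but the log does not include the cell counts needed to confirm that.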
- 2 to 6 months
    Coefficients:
                                    Estimate Std. Error z value Pr(>|z|)
    (Intercept)                    -4.313717   0.293284 -14.708  < 2e-16 ***
    grpinvited                      0.363524   0.320604   1.134 0.256848
    log(edits + 1)                  0.672387   0.128262   5.242 1.59e-07 ***
    log(main_edits + 1)             0.160990   0.051039   3.154 0.001609 **
    log(talk_edits + 1)             0.249009   0.074702   3.333 0.000858 ***
    log(user_edits + 1)            -0.002232   0.051036  -0.044 0.965120
    log(user_talk_edits + 1)        0.047366   0.073726   0.642 0.520577
    log(wp_edits + 1)               0.166464   0.092988   1.790 0.073429 .
    general_warningTRUE            -0.731064   0.209830  -3.484 0.000494 ***
    deletionTRUE                   -0.977154   0.467432  -2.090 0.036575 *
    grpinvited:log(edits + 1)      -0.089354   0.124897  -0.715 0.474347
    grpinvited:general_warningTRUE -0.033336   0.232662  -0.143 0.886070
    grpinvited:deletionTRUE         0.286314   0.510743   0.561 0.575081
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 8502.0 on 14765 degrees of freedom
    Residual deviance: 8129.1 on 14753 degrees of freedom
    AIC: 8155.1
    Number of Fisher Scoring iterations: 6
Well, the direction and scale of the coefs don't change. We don't see independent significance in the effect of the invitation or its interaction with previous activity.
Again, just to check my sanity, let's try the 2 to 6 month regression with the bucket as the single predictor.
    Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
    (Intercept)    -2.52590    0.06867  -36.78   <2e-16 ***
    grpinvited      0.16681    0.07617    2.19   0.0285 *
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Dispersion parameter for binomial family taken to be 1)
    Null deviance: 8502 on 14765 degrees of freedom
    Residual deviance: 8497 on 14764 degrees of freedom
    AIC: 8501
    Number of Fisher Scoring iterations: 5
Sure enough, there's the significant effect I saw in the simple Chi^2 test. --Halfak (WMF) (talk) 18:54, 20 October 2015 (UTC)