Research:Notifications/Experiment 1

Notifications (formerly known as Echo) allow registered users to receive updates on relevant events. In theory, notifications may be an effective way to boost participation in Wikipedia. Recent research suggests that notifications such as these can increase user activity[1]. However, notifications may also encourage burdensome newcomers to become more burdensome. We ran an experiment to identify the effects that Echo has on Wikipedia's newcomers.

Research questions

In this study, we focused our effort toward answering the following research questions:

RQ 1: How do Notifications affect the quantity of newcomer contributions?
RQ 2: How do Notifications affect the quality/productivity of newcomers' contributions?
RQ 3: How do Notifications affect the burden Wikipedians face in dealing with newcomers?

Methods

In order to test the effects of Notifications, we performed an experiment on the English Wikipedia. For one week (2013-06-11 20:00:00 - 2013-06-18 20:00:00), newly registered user accounts were split (round robin) into two experimental conditions. New users who received an even user ID on registration were given an experience that matched the way that Wikipedia worked before Notifications were deployed (control condition). New users with an odd user ID were given Notifications in its current form (test condition). During this week, 34,941 new accounts were registered. Due to some minor variations in the assignment of user IDs, 17,459 users were placed in the control condition (Pre-Echo) and 17,482 users were placed in the experimental condition (Echo-Current).
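The parity-based assignment described above can be sketched as follows. This is a minimal illustration, not the code that ran in the experiment; the function name `assign_condition` and the sample user IDs are hypothetical.

```python
from collections import Counter


def assign_condition(user_id: int) -> str:
    """Even user IDs go to the control bucket, odd user IDs to the test bucket."""
    return "Pre-Echo" if user_id % 2 == 0 else "Echo-Current"


# Hypothetical run of consecutive user IDs, for illustration only.
sample_ids = range(1000, 1011)
bucket_counts = Counter(assign_condition(uid) for uid in sample_ids)
print(bucket_counts)
```

Because user IDs are not always handed out in a strictly alternating sequence, the two buckets can end up with slightly different sizes, as they did in the experiment.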

We monitored the activity of these users for one week following each user's registration date and generated a set of metrics based on their activity intended to help answer our research questions.

Conditions

Pre-Echo

 
The legacy notification for new messages.

All notification types were disabled (the legacy talk notification was enabled). The current defaults were overridden by setting echo-notify-show-link to false. This hidden preference was also configured to:

  1. suppress the generation of Echo email notifications
  2. reinstate the legacy talk notification
  3. reinstate the legacy email notifications (for users with an authenticated email address).

Echo-Current

Users in the Echo-Current condition were shown the talk page message indicator as documented on this page:

 

notification category   web   email
edit-user-talk          on    on
edit-thank              on    on
mention                 on    on
article-linked          on    on
page-review             on    on
reverted                off   off

Metrics

To answer these research questions, we required a set of metrics for measuring the quality and quantity of newcomers' work as well as their burden on other Wikipedians.

Some of these measures rely heavily on detecting reverted revisions. We opted to use the identity revert method for detection.
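As a rough illustration of identity revert detection, the sketch below treats a revision as a revert when its content checksum matches that of a recent earlier revision, and marks the revisions in between as reverted. The function name, the use of SHA-1, and the 15-revision search window are assumptions made for this example; the 48-hour cutoff used in the definition of a productive edit is not modeled here.

```python
import hashlib
from typing import List, Set


def detect_identity_reverts(texts: List[str], radius: int = 15) -> Set[int]:
    """Return the indexes of revisions that were reverted.

    texts  -- revision contents of one page, in chronological order
    radius -- how many earlier revisions to search (assumed value)
    """
    digests = [hashlib.sha1(t.encode("utf-8")).hexdigest() for t in texts]
    reverted: Set[int] = set()
    for i, digest in enumerate(digests):
        start = max(0, i - radius)
        # Look backward for the most recent revision with identical content.
        for j in range(i - 1, start - 1, -1):
            if digests[j] == digest:
                # Everything between the match and the revert was undone.
                reverted.update(range(j + 1, i))
                break
    return reverted


# Hypothetical page history: the third revision restores the first one.
history = ["Good text", "Good text plus vandalism", "Good text"]
print(detect_identity_reverts(history))
```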

Productive edit
    An article revision that was not reverted within 48 hours
Productive user
    A user who made at least one productive edit during the week-long observation period
Block
    A user who was blocked during the observation period (detected using a query to enwiki.logging)
Edit sessions
    A proxy for the number of times an editor came to Wikipedia and began editing. See Research:Metrics/edit sessions
Time spent editing
    The sum total time covered by edit sessions (with a supplemental 430 seconds added per Research:Metrics/edit sessions)
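The two session-based metrics can be sketched as follows. This example assumes that a session ends after one hour of inactivity and that the supplemental 430 seconds are added once per session; both are assumptions made for illustration, and Research:Metrics/edit sessions remains the reference for the actual definition. The function names and sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List

SESSION_CUTOFF = timedelta(hours=1)        # assumed inactivity threshold
SESSION_PADDING = timedelta(seconds=430)   # supplemental time from the metric definition


def edit_sessions(timestamps: List[datetime]) -> List[List[datetime]]:
    """Group edit timestamps into sessions separated by long gaps."""
    sessions: List[List[datetime]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= SESSION_CUTOFF:
            sessions[-1].append(ts)    # continues the current session
        else:
            sessions.append([ts])      # starts a new session
    return sessions


def time_spent_editing(timestamps: List[datetime]) -> timedelta:
    """Sum session durations, padding each session (assumed to be per session)."""
    total = timedelta()
    for session in edit_sessions(timestamps):
        total += (session[-1] - session[0]) + SESSION_PADDING
    return total


# Hypothetical edits by one newcomer.
sample_edits = [
    datetime(2013, 6, 12, 10, 0),
    datetime(2013, 6, 12, 10, 20),
    datetime(2013, 6, 12, 13, 0),
]
print(len(edit_sessions(sample_edits)), time_spent_editing(sample_edits))
```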

Timeline

2013-06-11 20:00:00
    start split test
2013-06-18 20:00:00
    end split test, bucketing is disabled, prefs for users in Pre-Echo bucket are preserved (coordinate with launch of VE split test).
2013-07-01 20:00:00
    prefs for Pre-Echo users are reverted to new user system defaults

Results

Quantity of work / Productivity

Overall, we found that newcomers with Notifications enabled saved more revisions, came back to edit more often and spent more hours editing. However, it appears that these same users did less productive Wikipedia work.

As Figure 1 suggests, the (geometric) mean number of revisions per user was significantly larger for newcomers with Notifications enabled (t=2.121, p=0.034), but Figure 2 suggests that the average number of productive edits per user may be smaller for users in the Echo-Current (test) condition (t=1.7471, p=0.080). This suggests that, while users with Notifications enabled saved more edits, they made marginally fewer productive edits.

Figure 3 shows that users with Notifications enabled came back to edit significantly more times in their first week than users in the control condition (t=2.098, p=0.036). Our analysis of time spent editing suggests that users in Echo-Current also spent significantly more time editing (t=2.78, p=0.005).
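The comparisons above rest on geometric means and t statistics. The sketch below shows one way such a comparison could be computed; it is not the analysis code used for this study. It assumes a log(x+1) transform to handle users with zero activity, a Welch (unequal variance) t statistic, and a normal approximation for the p-value, which is reasonable with roughly 17,000 users per condition. The function names and the sample counts are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance
from typing import List


def log_transform(counts: List[int]) -> List[float]:
    """Apply log(x + 1) so that zero counts remain defined (assumed transform)."""
    return [math.log(c + 1) for c in counts]


def geometric_mean(counts: List[int]) -> float:
    """Geometric mean of (count + 1), shifted back by 1."""
    return math.exp(mean(log_transform(counts))) - 1


def welch_t(a: List[int], b: List[int]) -> float:
    """Welch t statistic on the log-transformed counts of two conditions."""
    la, lb = log_transform(a), log_transform(b)
    standard_error = math.sqrt(variance(la) / len(la) + variance(lb) / len(lb))
    return (mean(la) - mean(lb)) / standard_error


def two_sided_p(t: float) -> float:
    """Two-sided p-value using a normal approximation (large samples only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Hypothetical revision counts per user, for illustration only.
control_counts = [0, 1, 1, 2, 3]
test_counts = [1, 2, 2, 4, 8]
print(geometric_mean(control_counts), geometric_mean(test_counts))
print(welch_t(test_counts, control_counts))

# Compare with the reported p=0.034 for revisions per user (t=2.121).
print(two_sided_p(2.121))
```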

 
Figure 1. Revisions per user. The geometric mean number of revisions made is plotted by bucket with standard error bars.
Figure 2. Productive edits per user. The geometric mean number of productive edits per editor (article edits, not reverted) is plotted by condition with standard error bars.
Figure 3. Edit sessions per user. The geometric mean number of edit sessions per editor is plotted by condition with standard error bars.
Figure 4. Distribution of revision count. A smoothed histogram is plotted for the number of revisions saved by editors per bucket.
Figure 5. Distribution of productive edit count. A smoothed histogram of the number of productive edits is plotted by experimental condition.
Figure 6. Distribution of edit sessions. A smoothed histogram of the number of sessions initiated by users in their first week of editing is plotted by experimental condition.

In order to get a sense of where the differences between the experimental conditions manifested, we plotted a set of smoothed histograms of the observed metrics. Figure 4 shows that the observed difference in total revisions appears in the higher values: there is a relatively small set of users who made more than 100 edits and who pull the statistic upward. Figure 5 reflects the pattern seen in Figure 4, but it appears that far fewer newcomers made it past the 100 productive edit threshold. Figure 6 suggests that the difference in the number of sessions is more evenly distributed. From 5 sessions upward, users with Notifications enabled are consistently more likely to reach a given number of sessions than users without Notifications.

Quality of work

The data suggest that users with Notifications enabled were less likely than the control to make productive contributions to articles in their first week. Despite the larger number of revisions overall, editors with Notifications enabled did not make significantly more article edits on average (Figure 7). As Figure 8 suggests, newcomers with Notifications enabled were marginally less likely to make at least one productive edit during their first week (χ²=3.65, p=0.056).

However, it appears that the work of a few highly productive editors made up the difference. Figure 9 shows a significant difference in the aggregate (all users' edits grouped together) proportion of article edits that were not reverted (χ²=8.50, p=0.004). In other words, it appears that a smaller proportion of newcomers with Notifications enabled made at least one productive edit, but those who did made enough productive edits to make up for the loss, at least in this trial[2].
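The proportion comparisons in this section can be illustrated with a Pearson chi-squared test on a 2x2 table (condition by outcome). The sketch below is an assumption about the form of the test, not the original analysis code: it uses no continuity correction and one degree of freedom. The bucket sizes come from the Methods section, but the counts of productive users are hypothetical, since the raw counts are not reported here.

```python
import math


def chi_squared_2x2(success_a: int, total_a: int, success_b: int, total_b: int) -> float:
    """Pearson chi-squared statistic for two proportions (no continuity correction)."""
    observed = [
        [success_a, total_a - success_a],
        [success_b, total_b - success_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [success_a + success_b, (total_a + total_b) - (success_a + success_b)]
    grand_total = total_a + total_b
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


def p_value_1df(statistic: float) -> float:
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Bucket sizes are from the Methods section; productive-user counts are HYPOTHETICAL.
control_total, test_total = 17459, 17482
control_productive, test_productive = 3000, 2900
stat = chi_squared_2x2(control_productive, control_total, test_productive, test_total)
print(stat, p_value_1df(stat))

# Compare with the reported p=0.056 for the proportion of productive editors.
print(p_value_1df(3.65))
```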

 
Figure 7. Main revisions per user. The geometric mean number of article revisions per user is plotted by condition with standard error bars.
Figure 8. Proportion of productive editors. The proportion of productive editors (at least one non-reverted article edit) is plotted by condition with (normal approx) standard error bars.
Figure 9. Aggregate proportion of productive article edits. The proportion of aggregated article edits that were not reverted is plotted by experimental condition with (normal approx) standard error bars.

Burden on Wikipedians

In order to measure the burden on Wikipedians imposed by newcomers, we used two metrics: the average number of revisions that were reverted (by others[3]) and the proportion of those newcomers who were blocked in their first week.

Figure 10 shows that newcomers with Notifications enabled made marginally more revisions that were eventually reverted by others (t=1.755, p=0.079). Figure 11 shows that newcomers with Notifications enabled were significantly more likely to be blocked in their first week of editing (χ²=4.104, p=0.043). Both of these results suggest that newcomers with Notifications enabled are more burdensome to Wikipedians.

 
Figure 10. Reverted revisions per user. The geometric mean number of reverted edits (by others) is plotted by condition with standard error bars.
Figure 11. Proportion of blocked newcomers. The proportion of blocked users is plotted by bucket with (normal approx) standard error bars.
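The "(normal approx)" standard error bars mentioned in the proportion figures can be computed as sketched below, assuming they refer to the usual normal approximation to the binomial. The function name and the count of blocked users are hypothetical; only the bucket size is taken from the Methods section.

```python
import math
from typing import Tuple


def proportion_with_se(successes: int, n: int) -> Tuple[float, float]:
    """Return a proportion and its normal-approximation standard error."""
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return p, standard_error


# HYPOTHETICAL number of blocked users in a bucket of 17,482 newcomers.
proportion, se = proportion_with_se(350, 17482)
print(proportion, se)
```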

Summary

Our results suggest that the presence of Notifications effectively increases the amount of activity that new users will engage in (more edits, more edit sessions and more hours spent editing). However, the effect of Notifications on the productivity of new users is unclear. On average, users with Notifications were less likely to make productive contributions to articles, but in our experiment, a few highly productive newcomers made up some of the difference.

Our results also suggest that newcomers with Notifications enabled are more burdensome. They made marginally more edits that were reverted by others and they were significantly more likely to be blocked.

Future work

This analysis opens up a new set of questions about how the presence of Notifications affects newcomer behavior.

  • What are newcomers with Notifications doing to get blocked more often?
  • Where are newcomers with Notifications spending all of their time if not editing articles?
  • Are there specific notification types that predict good or bad behavior?

References

  1. Olson, J. F., Howison, J., & Carley, K. M. (2010). Paying Attention to Each Other in Visible Work Communities: Modeling Bursty Systems of Multiple Activity Streams. 2010 IEEE Second International Conference on Social Computing (SocialCom). doi:10.1109/SocialCom.2010.46
  2. This result is likely due to a few highly prolific newcomers. Aggregate measures like this are less robust than per-user measures.
  3. It turns out that self-reverts are relatively common. Since reverting yourself does not burden others, they were filtered from the dataset.