Research:Global Beta Features usage

Introduction

The goal of this project is to investigate the usefulness of BetaFeatures, a MediaWiki extension deployed on the Wikimedia sites, for socialising new software.

BetaFeatures

A screenshot from the Multimedia Viewer, one of the Beta Features

A recent Wikimedia Foundation software development project has been Beta Features - an extension that allows the easy enabling or disabling of new, experimental features on a per-user basis. The intention is to allow the WMF to 'soft launch' new features, both ones the Foundation is committed to and ones it is experimenting with, so that the engineers and designers can easily get user feedback before making changes or enabling things more widely. The Beta Features are accessible from the "beta" button in the user toolbar, and currently consist of:

The Multimedia Viewer, a new interface for viewing images, videos and audio files;
Typography Refresh, wide-ranging changes to MediaWiki's default typography;
Nearby Pages, which provides information on pages whose geocodes indicate geographic closeness to the page you're viewing;
VisualEditor, a rich-text editor for MediaWiki;
VisualEditor formulae editing, which allows for the creation and modification of mathematical equations in the VisualEditor;
CirrusSearch, a new MediaWiki search engine.

Users can enable these features individually, and also automatically subscribe to receive new Beta Features as they are created and deployed. Each individual feature, as well as the BetaFeatures extension as a whole, has a discussion page on MediaWiki.org where users can leave feedback, and are encouraged to do so.

Socialising new software

The purpose of BetaFeatures is to socialise new software. There is almost no literature that discusses socialisation directly, but many of its individual principles tie back to feminist theory, particularly as applied to Human-Computer Interaction (HCI). Bardzell & Bardzell, in Towards a Feminist HCI Methodology: Social Science, Feminism, and HCI,^[1] argued for applying feminist theory to HCI studies and discussed several methodological positions that could make up such an HCI theory. These positions have equivalents in our socialisation principles: involving participants in the design of a project and the definition of its goals, maintaining an empathetic relationship with participants, and using diverse methods to assess the success or failure of a particular piece of work.

On the empathetic relationship in particular, Wright & McCarthy (2008)^[2] argue that it consists not only of imagined use cases and users (which they term 'empathy through role-play') but also of 'empathy through dialogue' - actively talking to and collaborating with affected groups, which ties back into Bardzell & Bardzell's work. In theory, BetaFeatures enables this kind of engagement and dialogue, because features can be enabled by any logged-in user, regardless of background or experience level, and a feedback page is accessible to those users.

But how does this work in practice? As an example, suppose that a particular BetaFeature, aimed at all users, was only being used by new users: the WMF could deploy it and find that it was useless or, worse, actively detrimental to experienced contributors. Alternatively, the feature might be heavily used by both experienced and new users, but through some fluke everyone who enables it uses Chrome: it is deployed, and breaks for everyone else. The goal of this research, therefore, is to work out whether the BetaFeatures platform is being widely used by all groups of users, and whether those same users are also taking advantage of the feedback mechanisms provided.

There are several anecdotal reasons to suspect that BetaFeatures is not enabling empathy through dialogue for all user groups. Two in particular seem likely to discourage users both new and old. First, the feedback page is an old, wikimarkup talk page, on a wiki that most people do not contribute to and that potentially disconnects less confident users from their sense of identity. Second, the feature descriptions contain 'inside baseball' references. We will explore the usage of BetaFeatures, with the hypothesis that BetaFeatures users do not represent the overall community, and then explore ways of improving the representation of groups that are not currently participating.

Research questions

How representative are BetaFeatures users of our community, in terms of editing experience, tenure and behaviour?
How representative are BetaFeatures users of our community, in terms of software use?
How representative are the providers of feedback about BetaFeatures, by the same metrics?

Dataset

For this research, we have two primary datasets. The first is data gathered by MediaWiki itself about users, their activity and their current preferences. The second is a dedicated EventLogging table that gathers information on how people change their preferences, and when - the PrefUpdate_5563398 table. To answer all of the research questions, we need to use both of these datasets together, because we care not just about people who are currently using BetaFeatures, but people who have ever used BetaFeatures, and that is only stored in the EventLogging table - as opposed to data about those users' other actions, which is exclusively stored in MediaWiki.

To perform analysis on user actions and backgrounds, we need to define what a "BetaFeatures user" is. A user may enable a feature for 5 seconds, decide they don't like it, and reflexively turn it off - what constitutes 'use'? Thanks to the work of Geiger & Halfaker (2013),^[3] we know that 83.4% of editing sessions last for less than 30 minutes. Therefore, even if a user enabled a BetaFeature right at the beginning of a session, if they waited 30 minutes or more to disable it, they probably spent at least one editing session exposed to it. For that reason, we will consider 'BetaFeatures users' to be any user who either currently has a BetaFeature enabled, or who previously had a BetaFeature enabled for at least 30 minutes.
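The definition above can be sketched in code. This is a minimal illustration rather than the project's actual pipeline: the `PrefEvent` record and its field names are hypothetical and do not reflect the real schema of the PrefUpdate_5563398 table, and the function assumes it is given the preference events of a single user. Exactly 30 minutes of exposure is treated as qualifying.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Threshold taken from the text: roughly one editing session.
MIN_EXPOSURE = timedelta(minutes=30)


@dataclass(frozen=True)
class PrefEvent:
    """One preference change. Field names are hypothetical, not the real schema."""
    user: str
    feature: str
    enabled: bool
    timestamp: datetime


def is_betafeatures_user(events, min_exposure=MIN_EXPOSURE):
    """True if a feature is still enabled, or was enabled for at least min_exposure.

    `events` must all belong to the same user; they may be in any order.
    """
    enabled_since = {}  # feature -> time it was switched on
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.enabled:
            # Keep the earliest switch-on time if the feature is already on.
            enabled_since.setdefault(ev.feature, ev.timestamp)
        elif ev.feature in enabled_since:
            started = enabled_since.pop(ev.feature)
            if ev.timestamp - started >= min_exposure:
                return True
    # Anything left in the mapping was never switched off again.
    return bool(enabled_since)
```

A user who switches a feature on and off within a few seconds is therefore excluded, while one who leaves it on, or turns it off after a full session, is counted.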

We also have to decide how to handle possible grouping or sharding of user activity. It is possible to have one account active on multiple wikis under a single name, or multiple accounts with the same name on different wikis; all of them would appear the same from the perspective of the EventLogging data. Scott Hale's paper on multilingual Wikipedians (2013)^[4] suggests that relatively few accounts fall into the latter category. As a result, we can probably treat user name as a distinct identifier across all projects without noticeably biasing the results, and treat all activity by accounts of that name, on any wiki, as the activity of a single user.
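As a rough sketch of that grouping step, the following merges per-wiki records into one entry per user name. The input shape, tuples of (wiki, user name, edit count), is an assumption made for illustration and is not how MediaWiki stores the data; names are matched exactly, with no normalisation.

```python
from collections import defaultdict


def group_by_username(records):
    """Merge per-wiki records into one entry per user name.

    `records` is an iterable of (wiki, user_name, edit_count) tuples;
    this shape is hypothetical.
    """
    merged = defaultdict(lambda: {"wikis": set(), "edit_count": 0})
    for wiki, user_name, edit_count in records:
        entry = merged[user_name]
        entry["wikis"].add(wiki)            # remember where the name is active
        entry["edit_count"] += edit_count   # pool activity across wikis
    return dict(merged)
```

Under this approach, two different people who share a name on different wikis would be merged into one entry, which is the bias the paragraph above argues is small.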

It is also important to compare like with like. BetaFeatures has been around for 6 months or less; users have been around for 12 years. Accordingly, we will look at users active in the last 30 days on any wiki, for both groups.

The datasets are therefore as follows: for 'BetaFeatures users', any user who has been active in the last 30 days and who either currently has a BetaFeature enabled or previously had one enabled for at least 30 minutes; and, for users generally, any user who has been active in the last 30 days.
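Building the two cohorts then amounts to a filter and an intersection. The sketch below assumes two hypothetical inputs: a mapping from user name to the timestamp of that user's most recent activity on any wiki, and a set of names already classified as BetaFeatures users.

```python
from datetime import datetime, timedelta

# Activity window taken from the text.
ACTIVITY_WINDOW = timedelta(days=30)


def build_cohorts(last_activity, betafeatures_names, now):
    """Return (all_active_users, active_betafeatures_users) as sets of names.

    `last_activity` maps user name -> datetime of most recent activity;
    `betafeatures_names` is an iterable of names meeting the usage definition.
    Both inputs are hypothetical shapes chosen for illustration.
    """
    cutoff = now - ACTIVITY_WINDOW
    active = {name for name, ts in last_activity.items() if ts >= cutoff}
    return active, active & set(betafeatures_names)
```

Because the BetaFeatures cohort is computed as a subset of the active cohort, both groups are restricted to the same 30-day window.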

Definitions

BetaFeatures users: users who are currently using at least one BetaFeature, or who had a BetaFeature enabled for at least 30 minutes before disabling it, and who have made an edit in the last 30 days. See the dataset discussion.
Users: any user who has made an edit in the last 30 days.

RQ1: How representative are BetaFeatures users of our community, in terms of editing experience, tenure and behaviour?

As explained in the section on socialisation, it's important to understand how well BetaFeatures users represent the community as a whole. Of primary importance within that is their level of experience and tenure; these fundamentally alter how people are likely to respond to changes. An experienced user may oppose change out of instinctive conservatism - or oppose it perfectly rationally, having been around long enough to see a similar mistake made before. A newcomer may feel free to embrace change due to a lack of attachment to the status quo - or because they're too new to have thought of the long-term ramifications the change will have for other processes. The goal of BetaFeatures and this research is not to ensure one group has any more representation than the other, but to ensure that groups have proportionate representation.

To measure tenure and editing experience, we can look at things in several ways.

For tenure, we can look at the date of registration, or at the date of the user's first edit. The second is probably more reliable, since there are undoubtedly people who registered accounts simply out of curiosity long before they began contributing. For editing experience, simple edit count is a widely used measure of how experienced a user is at contributing. Other metrics are probably worth looking at too, since the goal is not just to distinguish experienced from inexperienced users, but also to look at different types of experience: 'gnome'-like tasks versus heavy content contributions, administrators versus editors, and so on.

As an initial run, we will compare BetaFeatures users to the community overall using:

Tenure, measured from date of first edit;
Edit count;
User rights.
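One way to make this comparison concrete is to summarise each cohort on the three measures and set the summaries side by side. The sketch below is an assumption about how that might be done, not the method used in this study: the `UserProfile` record is hypothetical, and medians are used only because edit counts tend to be heavily skewed.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class UserProfile:
    """Hypothetical per-user record for the three measures listed above."""
    first_edit: date
    edit_count: int
    rights: frozenset


def summarise(profiles, today):
    """Summarise a non-empty cohort by tenure, edit count and user rights."""
    profiles = list(profiles)
    rights = Counter(r for p in profiles for r in p.rights)
    return {
        "n": len(profiles),
        # Tenure is measured from the first edit, not from registration.
        "median_tenure_days": median((today - p.first_edit).days for p in profiles),
        "median_edit_count": median(p.edit_count for p in profiles),
        # Share of the cohort holding each user right.
        "rights_share": {r: c / len(profiles) for r, c in rights.items()},
    }
```

Calling `summarise` once for BetaFeatures users and once for all active users gives two dictionaries that can be compared directly; an empty cohort raises `statistics.StatisticsError`.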

Results

Discussion

RQ2: How representative are BetaFeatures users of our community, in terms of software use?

Results

Discussion

RQ3: How representative are the providers of feedback about BetaFeatures, by the same metrics?

Results

Discussion

Conclusion

Privacy and Subject Protection

Transparency

Notes

[1] Bardzell, Shaowen; Bardzell, Jeffrey (2011). "Towards a Feminist HCI Methodology: Social Science, Feminism, and HCI" (PDF). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM).
[2] Wright, Peter; McCarthy, John (2008). "Empathy and Experience in HCI". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM).
[3] Geiger, R.S.; Halfaker, A. (2013). "Using Edit Sessions to Measure Participation in Wikipedia" (PDF). Proceedings of the 2013 ACM Conference on Computer Supported Cooperative Work (ACM).
[4] Hale, Scott (2013). "Multilinguals and Wikipedia Editing" (PDF). arXiv:1312.0976v1.
