Research:Omnibus Survey

Omnibus surveys are questionnaires that combine smaller questionnaires from multiple sources into fewer but longer questionnaires. Such surveys are commonplace in the market research and marketing industries because distribution and collection costs are shared among multiple researchers.

Why should Wikimedia run omnibus surveys?

Wikimedia and academics both stand to benefit from a better understanding of the Wikimedia editing community. Wikimedia is far from unique as a large online community that has evolved various internal cultural mechanisms and behaviours. It is, however, unusual in being a large online community that subscribes to, and is governed by, the culture and values of the open source community. Wikimedia is also a charity, and is therefore willing to give research access to its online community in ways that most commercial organisations would not consider.

Wikimedia has a strategy of embracing academia through its campus ambassador program, and indeed through the extensive and free use of its data for educational purposes. Running an effective and free research questionnaire program as a service to academia is an act of reciprocity that some partners and potential partners will respect. More and better research about the Wikimedia editing community and its motivations, needs and concerns is likely to help Wikimedia meet the needs of that community, in the general interest of the project.

Why is this an issue? Treating the Wikimedia community as a free and unrestricted resource for researchers to spam would, in the short term, be an attractive option for the first wave of researchers. But after the inevitable "tragedy of the commons", neither the Wikimedia community, which would have raised barriers against research spam, nor the researchers against whom those barriers had been raised would benefit.


As this data would be collected from Wikimedians with the endorsement and IT resources of the Wikimedia Foundation, the presumption would be that, in accordance with Research:WMF support:

  1. Raw data: Data identifying individual Wikimedians will be held securely by the WMF and will not be published or disclosed to the researcher unless the Wikimedian opts in to this.
  2. Non-personal raw data: Datasets containing the individual responses, minus any personally identifying fields, will be available to accredited researchers under a contract that prevents them from forwarding the raw data or using combinations of questions to identify individual Wikimedians, and that requires them to release their resulting research on a Gold open access basis.
  3. Counts: Statistics from the anonymised survey data will be published within 6 months under CC-BY-SA 3.0. This would include counts of responses by answer value and potentially some cross tabulations, provided they don't identify individuals.
  4. Researchers who have contributed questions to the survey may in turn be given an advance copy of the consequent data to prevent them being "scooped", but such exclusivity will be for a maximum of 6 months.
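The published counts described in point 3 could be produced along the following lines. This is only a minimal sketch: the function names, the field names and the `MIN_CELL_SIZE` threshold of 5 are assumptions made for illustration and are not part of this proposal. Cells with fewer respondents than the threshold are withheld so that a published count or cross tabulation is less likely to identify an individual.

```python
from collections import Counter

# Hypothetical threshold: cells with fewer respondents than this are withheld.
MIN_CELL_SIZE = 5


def publishable_counts(responses, question, min_cell=MIN_CELL_SIZE):
    """Count responses per answer value; small cells are replaced by None."""
    counts = Counter(r[question] for r in responses if question in r)
    return {answer: (n if n >= min_cell else None) for answer, n in counts.items()}


def publishable_crosstab(responses, q_row, q_col, min_cell=MIN_CELL_SIZE):
    """Cross-tabulate two questions; small cells are replaced by None."""
    counts = Counter(
        (r[q_row], r[q_col]) for r in responses if q_row in r and q_col in r
    )
    return {cell: (n if n >= min_cell else None) for cell, n in counts.items()}
```

Simple suppression of this kind is not a complete safeguard, since a withheld cell can sometimes be recovered from published totals, so any real release would still need review before publication.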

Advantages for Wikimedians

Researchers often overlap in the questions they ask Wikimedians, and an Omnibus survey gets rid of that redundancy. If twenty questionnaires in a year all ask the gender, age, education and edit count of Wikimedians, why not run one questionnaire with those questions and twenty others, then give each researcher a dataset with the information they are interested in?
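As a rough illustration of giving each researcher only the data they are interested in, the sketch below builds one extract per researcher from a single set of responses. The field names, the `SHARED_QUESTIONS` and `IDENTIFYING_FIELDS` lists and the function name are hypothetical, chosen only to mirror the example of shared demographic questions plus each researcher's own questions.

```python
# Hypothetical field names; the shared demographic questions are asked once.
SHARED_QUESTIONS = ["gender", "age", "education", "edit_count"]

# Hypothetical identifying fields that never leave the survey operator.
IDENTIFYING_FIELDS = {"username", "ip_address"}


def extract_for_researcher(responses, own_questions):
    """Return the shared answers plus one researcher's own questions,
    with identifying fields left out of every row."""
    wanted = [
        q
        for q in SHARED_QUESTIONS + list(own_questions)
        if q not in IDENTIFYING_FIELDS
    ]
    return [{q: r.get(q) for q in wanted} for r in responses]
```

Each researcher's extract is derived from the same responses, so the shared questions are asked of the community once rather than once per study.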

A larger and more sophisticated survey can be expected to use qualifying questions to determine later questions in the survey. This avoids such faux pas as asking 15-year-old editors about their children or 19-year-olds whether they have a PhD, and it also makes the whole survey shorter for almost all respondents. It does, however, require a certain amount of management and reasonably sophisticated survey technology.
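Qualifying questions of this kind are often implemented as skip logic. The following is a minimal sketch of the idea, not a description of any particular survey software: the question ids, the `ask_if` key and the age thresholds are all hypothetical.

```python
# Each question may carry an optional "ask_if" condition that is evaluated
# against the answers given so far. Ids and thresholds are hypothetical.
QUESTIONS = [
    {"id": "age", "text": "How old are you?"},
    {
        "id": "has_children",
        "text": "Do you have children?",
        "ask_if": lambda answers: answers.get("age", 0) >= 18,
    },
    {
        "id": "has_phd",
        "text": "Do you hold a PhD?",
        "ask_if": lambda answers: answers.get("age", 0) >= 22,
    },
]


def run_survey(questions, get_answer):
    """Ask the questions in order, skipping any whose condition is not met."""
    answers = {}
    for question in questions:
        condition = question.get("ask_if")
        if condition is not None and not condition(answers):
            continue  # the respondent does not qualify for this question
        answers[question["id"]] = get_answer(question)
    return answers
```

Here `get_answer` stands in for whatever presents a question to the respondent. With these conditions a 15-year-old respondent is asked only the age question, which is how such logic shortens the survey for most people.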

Privacy. Whoever runs these surveys will have access to data on individual Wikimedians and, at a minimum, the IPs they edit from. With Omnibus surveys that data can be restricted to the operators of that single survey, and the various researchers need only be provided with anonymised data. By contrast, a strategy of allowing multiple researchers to each run their own survey of Wikimedians means that the process would only be as secure as the least secure survey.

The invitation to complete the survey need not carry any external advertising, as it would come from either RCom or the WMF. It is likely that researchers and research institutions would like to have their names associated with particular questions, but would the community be willing to allow this?

Advantages for researchers

This is a viable, proven way of collecting data without irritating the people being surveyed to the point where they block research. A single annual survey could probably be run indefinitely without annoying the community to the point where it requires an opt-in mechanism in user preferences.

Sample sizes are likely to be larger, and the results therefore more robust, if there is a single annual survey rather than multiple ad hoc ones.

Longitudinal studies could be made possible via this mechanism.

Any survey will have skews, but a strategy of not annoying the population you are analysing by over-surveying them will tend to give a less skewed sample, because it does not create an annoyed, opted-out group whom you can no longer reach.


Wikimedia would need to maintain a long-term, expanded research program that would allow researchers to request additional questions. This would involve some overhead in questionnaire design, in liaison with researchers, and possibly in survey software.

Though researchers and academia would benefit collectively and in the medium to long term, those researchers who would otherwise have been able to run surveys before the community was annoyed enough to implement opt-in would lose out. Instead of being the only people able to survey the community before it went opt-in, they would always be on a par with other researchers.

As the Omnibus survey would run annually, or at most every six months, researchers would usually find themselves some months away from getting results.


Any community could of course agree to run additional research surveys as well as the Omnibus survey. Possible reasons for doing so would be if the research was urgently needed by the community or if the research benefited from being done in a different month than the Omnibus survey. But the Omnibus survey would be the default unless a case was made for an additional survey.