Community Safety/Frequently Asked Questions

Community Safety - Quarterly Data from the Wikimedia Movement

Why is the survey being conducted?

It is being conducted in response to an idea proposed in the community-led drafting committee's enforcement outline for the UCoC to strengthen local community self-governance, specifically: "We recommend that the Foundation work to create a system where contributors can safely express whether they feel safe in a particular project or not."

Findings will be shared publicly with the communities for their review.

Why am I seeing this survey?

You are seeing the survey because you were randomly sampled by the QuickSurveys tool. The tool targets a percentage of viewers on a given wiki; the survey is displayed to sampled viewers who are logged in and meet the edit count threshold. QuickSurveys was chosen for the Community Safety Survey to protect the anonymity of respondents. Sampling is not tied to your username but to a session token generated when you visit the site. As a result, if you use a different device or clear your browser cookies, you may come across the survey more than once. The probability of this is low, but it is nevertheless possible.
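The sampling described above can be sketched roughly as follows. This is a minimal illustration and not the QuickSurveys implementation: the function names, the hashing scheme, the `coverage` value, and the `min_edits` threshold are all assumptions made for the example.

```python
import hashlib


def is_sampled(session_token: str, coverage: float) -> bool:
    """Map a session token to a stable number in [0, 1) and compare it to coverage.

    Hypothetical scheme: the same token always lands in the same bucket,
    and no username is involved at any point.
    """
    digest = hashlib.sha256(session_token.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32  # exact float in [0, 1)
    return bucket < coverage


def should_show_survey(session_token: str, logged_in: bool, edit_count: int,
                       coverage: float = 0.1, min_edits: int = 5) -> bool:
    """Show the survey only to sampled, logged-in viewers above the edit threshold.

    coverage and min_edits are placeholder values, not the real configuration.
    """
    return logged_in and edit_count >= min_edits and is_sampled(session_token, coverage)
```

Because the decision depends only on the session token, a new token (from another device or after clearing cookies) is sampled afresh, which is why the survey can appear more than once.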

If you are shown the survey question and decide to answer, please keep in mind that the question asks about your interactions on the specific wiki space on which the question appeared, not your broader interactions across Wikimedia spaces. This is so that we can analyze how safe contributors feel participating in particular wiki spaces.

Why should I answer this survey?

Participating in this survey will help surveyed Wikimedia spaces be aware of how safe their contributors feel engaging in these communities over time. The aim of the survey is to report back to communities so they can assess their own needs for intervention; to measure whether existing projects (including local community initiatives like RfCs changing self-governance practices, support programs by affiliates, the Universal Code of Conduct, IP masking, and others) are working to increase contributors’ sense of safety; and to serve as a potential “warning signal” if there are sudden spikes in respondents feeling unsafe or uncomfortable in a particular community.

What is the survey question?

The survey question is tailored to each surveyed Wikimedia space.

Persian Wikipedia
Question wording in English: In the last 30 days, have you felt unsafe or uncomfortable contributing to Wikipedia (fa.wikipedia.org)?
Response options in English:
- Yes
- No
- I'm not sure
Translation to primary language: Coming soon

Catalan Wikipedia
Question wording in English: In the last 30 days, have you felt unsafe or uncomfortable contributing to Wikipedia (ca.wikipedia.org)?
Response options in English:
- Yes
- No
- I'm not sure
Translation to primary language: Coming soon

English Wikipedia
Question wording in English: In the last 30 days, have you felt unsafe or uncomfortable contributing to Wikipedia (en.wikipedia.org)?
Response options in English:
- Yes
- No
- I'm not sure
Translation to primary language: Not applicable

French Wikipedia
Question wording in English: In the last 30 days, have you felt unsafe or uncomfortable contributing to Wikipedia (fr.wikipedia.org)?
Response options in English:
- Yes
- No
- I'm not sure
Translation to primary language: Coming soon

Portuguese Wikipedia
Question wording in English: In the last 30 days, have you felt unsafe or uncomfortable contributing to Wikipedia (pt.wikipedia.org)?
Response options in English:
- Yes
- No
- I'm not sure
Translation to primary language: Coming soon

Spanish Wikipedia
Question wording in English: In the last 30 days, have you felt unsafe or uncomfortable contributing to Wikipedia (es.wikipedia.org)?
Response options in English:
- Yes
- No
- I'm not sure
Translation to primary language: Coming soon

Why do you include the URL for different Wikipedias?

We want users to answer the question based on the Wikipedia community they were shown the survey on, rather than based on broader interactions on other language Wikipedias. Many newer editors do not know that multiple Wikipedias exist, while more tenured contributors often contribute to multiple language Wikipedias. We wanted to start with one universally applicable question that fits our survey configuration, can be shown on different Wikimedia spaces, and can be translated automatically into the user's interface language. For that reason, we opted to include the URL to specify which language space we mean.

How is this different from the similar Community Insights survey question?

The aim of the Community Insights survey data is to be representative of our broader global movement, regardless of which Wikimedia spaces someone contributes to. The aim of the Community Safety survey is to be representative of the specific Wikimedia space that the survey is distributed on (for example, “English Wikipedia,” “Wikidata,” “Portuguese Wikipedia”). The question wording on both surveys is purposefully similar, so that we can compare the data from these two surveys to triangulate reliability. This pilot is also exploring a new way to invite people to participate, using QuickSurveys instead of mass-messaging via user talk pages or emails.

Have wiki communities been informed about this survey?

We have informed the wiki communities which have been selected to be part of the testing phase of the survey as part of this exploratory project here:

The communities selected for the baselines will be informed of the process once we are able to assess the findings from our testing of the QuickSurveys tool and question on Catalan and Persian Wikipedias. We will update the links below as we post our messages to the communities:

Why were these wiki communities chosen?

  • The selection criteria for the first rounds of the Community Safety survey began with the size of the wikis: a community needed to be large enough to be surveyed 4 times per year without placing an undue burden on volunteers by surveying the same people repeatedly. Based on an estimated response rate of 20%, drawn from prior participation in such efforts, we determined that the 12 largest wikis by monthly unique editor count were viable for the survey.
  • We chose these 5 communities because their contributors live across broad geographic regions, and because they include the largest project by community, English Wikipedia, and the largest by contribution volume, Wikidata. Please note that Wikidata may not work with the current configuration of QuickSurveys and may need to be excluded until we can update the tool; we will know more in January.
  • For the pilots, we wanted to leave room for “mistakes” as we roll out an entirely new survey project using a less-utilized tool (QuickSurveys), as well as room to learn from different contexts and languages and how they may affect the survey design.
  • If this project is successful and useful to communities, affiliates, and the Foundation, we aim to expand this survey to other spaces in the future!

Is it safe to answer this survey?

The survey is designed so that anyone who sees the data, including the researchers, cannot connect a response back to an individual user, or know which individuals were asked the question. The survey is governed by this privacy statement. Raw data from this survey will not be shared publicly, and the survey responses will not be used to identify any user or individual. See the questions below for more on what data is collected, how it is collected, and how long it will be kept.

What data is collected using this survey, and how is it processed?

When a user responds to the survey, the information collected, apart from the survey response itself, includes the title and ID of the page on which the user responds, platform (web/android/ios), user language, edit count “bucket” (that is, one of the ranges "5-99 edits" | "100-999 edits" | "1000+ edits" rather than an individual edit count), IP address, and country code. Information that identifies a specific user, such as username or user ID, is not collected through the survey.
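To illustrate the edit count “bucket”, here is a minimal sketch of how an exact edit count could be reduced to one of the three ranges named above. The function name and the handling of counts below 5 are assumptions for the example; the bucket labels are taken from the text.

```python
from typing import Optional


def edit_count_bucket(edit_count: int) -> Optional[str]:
    """Reduce an exact edit count to a coarse range so the exact value is never stored."""
    if edit_count < 5:
        # Assumption: counts below the lowest listed range are not recorded.
        return None
    if edit_count < 100:
        return "5-99 edits"
    if edit_count < 1000:
        return "100-999 edits"
    return "1000+ edits"
```

For example, a contributor with 250 edits and one with 900 edits would both be recorded as "100-999 edits", so the stored value cannot be used to look up a specific account by its edit count.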

The QuickSurveys tool will be used to administer the survey. QuickSurveys in turn depends on EventLogging, an extension that enables it to collect structured data on how users interact with MediaWiki sites. When the survey is initiated, all the properties listed in the initiation schema are captured; when a user responds, all the properties in the responses schema are captured.

The raw JSON data is imported into HDFS from Kafka, and then further refined into Parquet-backed Hive tables. The data is initially stored in the event database of Hive. In accordance with WMF's Privacy Policy and Data Retention Guidelines, the event data goes through an automatic sanitization process that deletes all potentially sensitive information contained in the Hive event database older than 90 days. The event_sanitized database stores the sanitized data, where it will be stored indefinitely. See the Wikitech Event Data retention page for more information.
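The 90-day sanitization step can be pictured with the sketch below. It is an illustration only and does not reproduce the actual WMF pipeline: the field names, the allowlist approach, and the `sanitize` function are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)

# Hypothetical allowlist: only these fields survive sanitization.
KEEP_FIELDS = {"timestamp", "wiki", "survey_response", "edit_count_bucket"}


def sanitize(events, now):
    """Return copies of events, stripping non-allowlisted fields from old records.

    Records older than RETENTION keep only the allowlisted fields;
    newer records are returned unchanged.
    """
    result = []
    for event in events:
        if now - event["timestamp"] > RETENTION:
            result.append({k: v for k, v in event.items() if k in KEEP_FIELDS})
        else:
            result.append(dict(event))
    return result


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
old_event = {
    "timestamp": now - timedelta(days=120),
    "wiki": "cawiki",
    "survey_response": "No",
    "ip_address": "192.0.2.1",  # documentation-range address, not real data
}
print(sanitize([old_event], now))
```

In this sketch the 120-day-old record loses its `ip_address` field while the survey response itself is kept.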

In order to be able to analyze change over time, the Global Data & Insights team will indefinitely retain the raw datasets from the Community Safety surveys. This dataset will not be stored in queryable databases, and will only be available to specific WMF staff and contractors who have signed a non-disclosure agreement.

How will the results from this survey be used?

After data cleaning and analysis, aggregate responses will be reported in the Reports tab for each community space. We will not publish any statistic unless the responses underlying it are sufficiently numerous.
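One common way to apply a publication rule like the one above is a minimum-count threshold. The sketch below assumes that reading; the threshold value and the function name are hypothetical and are not taken from the project's actual reporting process.

```python
from collections import Counter

MIN_COUNT = 10  # hypothetical threshold, not the project's actual value


def publishable_counts(responses, min_count=MIN_COUNT):
    """Count each answer, replacing counts below min_count with None (suppressed)."""
    counts = Counter(responses)
    return {answer: (n if n >= min_count else None) for answer, n in counts.items()}


sample = ["Yes"] * 12 + ["No"] * 30 + ["I'm not sure"] * 3
print(publishable_counts(sample))
```

Here the three "I'm not sure" responses fall below the threshold, so that count is withheld while the other two are reported.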

We hope that participating communities will use this data to track how safety sentiments change over time and respond accordingly, possibly by further investigating why their results look the way they do, conducting community conversations, and potentially planning space-specific interventions if needed. We believe that communities are the experts in their own experiences and contexts, and are most qualified to assess their situations and how they can be improved.

The data will further be used to help track whether ongoing collaborations and partnerships, such as movement strategy-guided initiatives like the Universal Code of Conduct, IP masking, and the planned incident report system (original community consultation 2019), are actually helping to improve perceptions of safety in participating spaces.

How long is this survey data kept?

In order to be able to analyze change over time, the Global Data & Insights team will indefinitely retain the raw datasets from the Community Safety surveys. This data will not be stored in queryable databases, and will only be available to specific WMF staff and contractors who have signed a non-disclosure agreement. We will update this section as we find out more about which data will be needed to conduct analyses over time and which data can be deleted.

To comply with WMF's Privacy Policy and Data Retention Guidelines, data stored on Hive which is older than 90 days goes through an automatic sanitization process, during which all potentially sensitive information is deleted. Sensitive information includes, but is not limited to, personally identifying information and browsing information. The sanitized data is held indefinitely. See this Wikitech page about event data retention for more information.

How can I dismiss this survey?

Figure 1. How to dismiss this survey.

As shown in Figure 1, you can click the cross mark to dismiss the survey if it appears for you.

Please note that, while dismissing the survey opts you out for this round of data collection, you may be randomly sampled again in future data collections. There is likewise the possibility of being re-sampled in the same round if you clear your browser cookies or switch platforms.

Because QuickSurveys is designed for privacy, sampling users anonymously without tying the survey to their username, we unfortunately cannot provide a permanent opt-out feature.

I have been asked the survey question twice this week. Why did this happen, and how will it impact data quality?

We are sorry that you have been asked to respond to the same survey twice. This can happen if a user is active on multiple devices or browsers, or has cleared their browser cookies. As the survey is not designed to capture identifying information about the user, such as username, we cannot avoid duplication completely. Duplication should not impact data quality, because users who would answer yes and users who would answer no are equally likely to be asked the question more than once, so the proportions of each answer are not skewed.
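The reasoning above can be checked with a small worked calculation. This is a simplified sketch under stated assumptions: answers are reduced to "yes" versus everything else, a person asked twice answers the same way both times, and the function name and parameter values are hypothetical.

```python
def expected_yes_share(p_yes: float, dup_yes: float, dup_no: float) -> float:
    """Expected share of 'yes' among all collected responses.

    p_yes   -- true share of respondents who would answer 'yes'
    dup_yes -- probability that a 'yes' respondent is asked a second time
    dup_no  -- probability that any other respondent is asked a second time
    """
    yes_responses = p_yes * (1 + dup_yes)
    other_responses = (1 - p_yes) * (1 + dup_no)
    return yes_responses / (yes_responses + other_responses)


# Equal duplication rates cancel out, leaving the true share (0.2 up to rounding).
print(expected_yes_share(0.2, 0.05, 0.05))

# Unequal duplication rates would skew the result away from 0.2.
print(expected_yes_share(0.2, 0.5, 0.0))
```

When both groups are duplicated at the same rate, the factor (1 + rate) appears in the numerator and the denominator and cancels, so the measured share matches the true share. Only a duplication rate that differs by answer would bias the estimate.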

What if I was asked the survey question more than once in a year, or on a different Wikimedia space?

Remember that this survey is conducted four times per year in order to measure change over time. If you have been asked the survey question again, but not in the same month, you can consider this to be a separate survey. Please feel welcome to answer the question or opt out, as you would normally. Likewise, if you were asked the survey question on a different wiki space, please consider these to be separate surveys as they are meant to measure feelings of safety while contributing to the specific wiki space, and not a general sense of safety across wiki communities.

The translation of the survey question or responses is wrong. How can I fix it?

We are sorry for the incorrect translation. Please post a message to the talk page with the following information: the language, the text that should be corrected, and what it should be corrected to.

If you have more questions, or would like to suggest improvements to the project, please post a message on the talk page, or email us at surveys(_AT_)wikimedia.org.