Universal Code of Conduct/2021 consultations/Research

Created: 2021-06-10
Contact: Claudia Lo
Collaborators: Claudia Lo, Megan Riley
Duration: September 2020 – June 2021

To support the work of the Universal Code of Conduct phase 2 drafting committee, the Wikimedia Foundation conducted a research project focused on harassment on Wikimedia projects. The project took the form of a survey and a series of interviews.

The survey, distributed primarily to Wikimedia affiliates, focused on respondents' perceptions of, knowledge of, and engagement with existing reporting and enforcement systems. The interviews, conducted with community members who had previously contacted the Trust and Safety team to report harassment-related cases, focused on the experiences of users who had faced severe, sustained harassment.

We found several major obstacles limiting engagement with these systems, chief among them a confusing reporting system and a fear of public backlash. Nevertheless, overall sentiment remains positive: despite these setbacks, community members want to keep engaging with the community and with Wikimedia's enforcement systems.

Research methodology

One of the main research obstacles is the fact that our unstructured reporting system makes it difficult to gather metrics about reports. In addition, we wanted more data about sentiment towards our enforcement systems, rather than information about the frequency and type of reports.

For this research, we settled on a split methodology: a survey, distributed primarily to affiliates, plus targeted semi-structured interviews with community members who had previously contacted the Trust and Safety team about harassment-related issues.

Survey

The survey, titled “Wikimedia Community Reporting System Survey”, was 37 questions long and available in English, Spanish and Hindi. Prior to deployment, the survey was vetted by volunteers from the Wiki LGBT+ affiliate group. The survey was hosted on Qualtrics and administered using an anonymous link to collect as little identifying information as possible. It was open for four weeks, running from April 15 to May 7, 2021.

The survey was sent out primarily via email to a number of affiliates, focusing on groups for LGBT+ editors and a number of women’s groups across the movement. Invitational emails were also sent to the arbitration committees of English, French, Russian and German Wikipedias. Notices for this survey were also posted on the village pumps (or equivalent) on Italian, Spanish, French, German, Polish and Arabic Wikipedias.

Rather than focusing on harassment in and of itself, this survey was designed to focus on our community’s perceptions of existing enforcement systems. Specifically, it targeted a few key themes:

  • Understanding of enforcement systems: do people know ways to report incidents on our projects? Do they know how to use these systems? How did they learn to use these systems initially?
  • Engagement with enforcement systems: do community members routinely use these systems? How common is it to use these systems at all?
  • Perception of enforcement systems: are these systems generally well regarded? Do people think it is worthwhile to engage with them?
  • Privacy and transparency: what portions of the system do people generally believe should be accessible to the public? What information about the whole enforcement apparatus should be available to the public?

In total, the survey received 85 responses. Of these, 68 (80%) were fully completed and 17 (20%) were partial completions. 53 (62%) of survey respondents took it in English, while 31 (36%) took it in Spanish. Only one respondent took the survey in Hindi.

Most of our respondents were located in Europe. A majority identified as women, and the most common age range was between 30 and 44 years old. About a quarter also identified themselves as being part of the LGBT+ community.

Nearly all of our respondents had spent over a year on Wikimedia projects, with about a third reporting over a decade of experience. 39% of respondents currently or previously held administrator rights, while slightly over half (55%) had served as an organizer for Wikimedia events or groups. Most respondents (73%) reported being active on one to three projects.

Although the survey provided an option for under-18s to report their age, we did not collect information from self-reported minors; selecting this option sent respondents directly to the end of the survey.

Compared to the 2018 Community Engagement Insights report, this study's respondents had a much higher proportion of women, a similar median age, and a similar geographic distribution. Based on the Wiki comparison dataset, for our top 100 wikis by size rank, median monthly active administrators make up about 7% of median monthly active editors (9 monthly active admins to 129 monthly active editors), far below the 39% of respondents with administrator experience. This overrepresentation is an expected result of our method of primarily recruiting participants from affiliates, ArbComs, and community members experienced enough to find Village Pumps or similar pages.
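As a quick arithmetic check of that baseline figure, using the median counts quoted above:

\[ \frac{9}{129} \approx 0.0698 \approx 7\% \]

so survey respondents reported administrator experience at more than five times the baseline rate (39% versus about 7%).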

Interviews

To complement the data from the survey, we conducted four semi-structured interviews. All participants had previously contacted Trust and Safety. Of our six initial invitations, four interviews were scheduled. One was conducted with a Wikimedia Foundation staff member, although the focus of that interview was on their experiences as a volunteer before joining the Foundation.

Limitations

Because of our small sample size and non-representative respondent body, this research should be understood as a pilot study and is not broadly generalizable. We know that our survey respondents are generally more experienced and have spent more time on Wikimedia projects than the general community, and that administrators are far more heavily represented among them than in the general community. However, the conclusions raised here are important for several key reasons:

  • These are questions that have not been answered by previous research, or have not previously been asked at all.
  • This research provides a useful initial challenge to assumptions about how enforcement systems are perceived.
  • These findings highlight areas that may be worth prioritizing for future research efforts.
  • Since we know that respondents are, on average, more experienced, we can infer that the general volunteer population is even less familiar with reporting and enforcement systems, and adjust our assumptions accordingly.

Key findings

Overwhelmingly, participants reported that our current enforcement systems are too complex and difficult to understand.

Write-in survey responses noted the existence of loopholes, unclear redirections, the expectation that one may be asked to make reports to the very people one wished to report, and an utter lack of clear instructions on how to report. One of our interviewees noted that it took three years and a chance in-person meeting with a trustee before they even knew about any formal reporting channels. Other frustrations expressed by participants included the fact that only a sliver of problematic behaviour can be reported under their community’s rules; for example, insults directed at a minority group rather than a specific individual are hard to report.

"الأمر كارثي. يجب أن نتحمل ونصمت".

write-in response

The current reporting system opens reporters up to retaliation, backlash, or unwarranted public scrutiny.

Many of our write-in responses specifically named fear of reprisal as a major negative in our current system. Some of them used a specific jargon term, “boomerang”, to refer to this phenomenon, suggesting that this is so common as to warrant a special name for it.

Administrators and Arbitration Committees are also aware of, and suffer from, this flaw. One interviewee pointed out that the “half-transparent” cases handled by the Arbitration Committee (those where public evidence of harassment was supplemented with private evidence) were especially draining to handle. They described how onlookers would speculate on these private details and proceed to scrutinize, or even harass, the supposed reporters and ArbCom members on the basis of this speculation.

“There is no assurance that the community will handle the problem in a respectful manner towards the person who has suffered the aggression, making the reporting process intimidating.”

write-in response

A slight majority of survey respondents have never made a report.

54% of respondents have never made any reports. This includes 40% of the respondents who hold, or previously held, administrator rights.

Six out of ten respondents have deliberately chosen not to report incidents.

Reasons given include a fear of backlash or reprisal, belief that the outcome would be ineffective, and the process of making reports being too confusing or difficult. Write-in answers also indicated that occasionally, the people in charge of receiving reports are the very people that are the subject of complaints.

Two-thirds of non-administrator survey respondents are unsure of, or do not know, how to report problematic behavior.

By contrast, current or former administrators were far more confident in their knowledge of how to report: 83% said they knew how to report such behavior.

“You have to be an expert to know how to use [the reporting system], and if you have that much knowledge, you are bound to be an administrator anyway.”

write-in response

This gap exists even among survey respondents who have made reports in the past: 80% of current or former administrators who had previously made reports understood the process, compared to only 31% of non-administrator respondents who had done so.

Among research participants, there is a general desire for a private on-wiki reporting system.

When survey respondents were asked which pathways should be available for reporting, the most common choice was “another private pathway”; the third most common was “on a separate, private channel, on-wiki”.

It takes too long to resolve cases of harassment.

This was expressed by users making reports as well as the administrators expected to handle them. Reporters note how time-consuming the process of reporting is, as this generally includes having to learn about available reporting options, learning the appropriate report structure, and assembling the necessary supporting evidence for the report. Administrators point to the lack of training to handle interpersonal conflict, the complexity of cases that are severe enough to spur reports, and a general lack of capacity due to dwindling active administrator numbers.

All of these factors combine to make report resolution a very lengthy process, which itself becomes yet another factor that discourages community members from reporting.

Interconnected communities with disconnected enforcement allow community members with a history of harassment to continue such actions and evade consequences.

Without prompting, we routinely heard from respondents about certain communities with a bad reputation for being especially combative or hostile. What they had in common was a lack of guidelines around behavior or reporting and a general “blind eye” attitude towards their community members’ histories of rule-breaking behavior, especially if paired with a long history of contribution.

In at least one case, Wikimedia Commons, multiple interviewees pointed out that users with long histories of abuse, to the point of being banned from other Wikimedia projects, were allowed to engage in similar behaviors on Commons.

Participants were divided as to what the precise role of the Wikimedia Foundation should be in enforcement systems.

While there is broad consensus within the Wikimedia community that the Foundation should be responsible for certain cases involving minors or credible threats of violence, this consensus breaks down when it comes to most other matters.

Survey respondents alternately decried the Foundation’s involvement while also viewing it as a needed route for bypassing local reporting systems handled by the very people they wish to report. Others wanted the Foundation to act as a “backup” option when no global administrators, oversighters or stewards were available. Still others were upset that the burden of handling harassment reports, especially while organizing Wikimedia events, was shifted to volunteers rather than the Foundation.

Survey respondents wanted access to aggregate statistics and case summaries, not necessarily full case details.

Our current systems provide full public visibility of all cases made on-wiki. However, when asked what information the general public should see with regards to reporting on Wikimedia projects, more respondents chose aggregate statistics and summaries over full case details. This was true of both administrators and non-administrators.

Despite all of this, respondents generally still view it as worthwhile to make a report.

While users were much more likely to view the enforcement process as a whole as ambiguously useful at best, survey respondents were still generally positive about the likelihood that local admins, the WMF, and event organizers would address reports. Slightly over half of survey respondents said that it was “definitely” or “probably” worthwhile to make a report. Two of our interviewees also noted that, even though they knew (or believed) that the people they reported to were powerless to act on their reports, they still wanted to make them. This suggests that the act of reporting is itself an action that people wish to perform, regardless of outcome.

Recommendations

Based on these key findings, and supported by the conclusions of previous research on harassment and reporting on Wikimedia projects, we suggest these recommendations for the Universal Code of Conduct phase 2 drafting committee.

Provide an anonymous or private on-wiki reporting system.

The outsized fear of reprisal for reporting makes it all the more astonishing that slightly over half of the survey respondents still believe the system is worthwhile. In order not to corrode this trust, on which the entire system relies, we need to provide a way for reporters to privately report incidents.

Whether this is done anonymously or privately (that is, limiting who can see the identity of the reporter), it should be our absolute priority to provide either a technical solution or a policy one to accommodate this clear and pressing need.

Clarify and streamline the reporting process, for both reporters and administrators.

Our mix of different reporting systems for locally handled, globally handled, and Foundation-handled cases is deeply confusing.

We should provide a means for community members, especially newcomers, to clearly find the appropriate channel and reporting body for the incidents they wish to report. This is especially important for cases of harassment since being the target of sustained harassment already makes it difficult to seek out help in a timely manner.

Clarity, in this sense, means clarifying several factors:

  • Recipient of reports: who will receive and address a report.
  • Pathways for reporting: which pathway is best suited to a given situation, and where to go to use it.
  • Necessary information in a report: how to provide the information needed to make a high-quality report that administrators can act on.
  • Visibility of the report: who will be able to see said report.
  • Process of enforcement: how judgements are reached and how they are enforced.[note 1]

Make it easier to surface incidents of harassment to administrators.

Participants in this research indicate that it is extremely difficult to find where to report, figure out who should receive the report, and finally learn how to structure the report appropriately.

This severely limits the ease of reporting, which may in turn limit opportunities to de-escalate disagreements before they become much harder to address. Simplifying this step would also make reporting a less stressful action overall and improve rates of engagement, which is necessary in a system that relies on community goodwill and trust in local administrators to function.

Provide more flexible and varied outcomes for reporting.

Currently, the outcomes of reports tend to be limited to no administrator action, or some level of escalating restriction on editing. While the Foundation has tried to provide more granular options for administrators to restrict editing, we should look into opportunities to broaden the outcomes of reports.

This could include allowing reporters to have input on the outcome of reports, providing ways for subjects of reports to apologize or make amends, and other outcomes that do not depend on blocks. Alternatively, this could involve greater inter-project coordination to place sanctions on an editor’s behavior across a wider number of projects, or a pathway to escalate reports to authorities beyond local administrators.

Make the reporting process transparent and not just visible.

Our existing local reporting systems are by and large fully publicly visible. Nevertheless, this does not mean that they are transparent.

This visibility is a barrier for would-be reporters: it not only lets them see how complex their reports are expected to be, but also shows them evidence of past backlash and reprisal against other reporters.

For administrators, our completely unstructured reporting systems make it difficult to find archives of reports with the same subject, especially if this happens across projects. It also makes it difficult to address reports as they vary wildly in quality.

Lastly, observers have no access to useful statistics since the current system’s unstructured nature makes it impossible to gather accurate or reliable metrics on reporting, and these reports’ heavy use of jargon makes them hard for observers or laypeople to understand.

Provide better guidelines or specific training for administrators to resolve disagreement while avoiding escalation into full-blown harassment.

Our interviews, and prior research on the topic, point to a link between disagreements over content and escalation into harassment or abuse. However, as prior research has indicated, we have few mechanisms to turn past administrative actions into actual guidelines or precedent for future incidents.[note 2]

One interviewee also noted that their years-long experience of harassment actually started with a minor disagreement over categorization, and part of how this harassment intensified came about when other users were brought in supposedly to provide “consensus” on the categorization disagreement. Another interviewee’s experience of harassment started with an editor unilaterally bringing up a procedural concern using spurious evidence and drawing in other editors to provide their opinions.

In a healthy community, we could expect such procedural or technical concerns to remain separate from the possibility of harassment. To this end, we should make an effort to equip administrators or other trusted community members to defuse tensions and resolve disagreements while avoiding aggressive behavior in the process.

Further research

This study raises a few avenues of research that may be worth further exploration.

Do these findings extend to the general population of the Wikimedia projects? Are there any significant differences based on wiki size?

While our targeted survey and interviews provided important perspectives for the work of the committee, it would be useful to know whether these findings hold true for the general population. Since many of the concerns raised in this research seem linked to the size of a wiki and the capacity of its local administrators, it would be worthwhile to see whether these issues exist across wikis regardless of size, or whether they change accordingly.

What is the state of our cross-wiki or global reporting and enforcement systems?

This research focused largely on the experiences of community members on a handful of wikis, with issues of harassment generally also limited to those same spaces. We did hear about incidents of off-wiki or cross-wiki harassment, but they were not the focus of this research. Therefore, this seems like a logical expansion for this line of research.

How do our existing conflict resolution systems address disagreement? To what extent do they facilitate or mitigate escalation into harassment?

This line of research would look at potential structural issues regarding our consensus-driven policy process, and the ways in which low-level disagreements are usually treated. If the underlying issues driving harassment are structural rather than incidental – that is, if the ways in which users are encouraged to interact makes it easier or more likely for people to engage in aggressive ways – tackling misconduct will require a very different approach than if the issue lies with a population of people choosing to act aggressively.

Do private reporting systems impact rates of reporting? Does this privacy impact rates of enforcement?

While the author of this report would argue for a moral imperative to provide private reporting pathways, it would also be prudent to investigate how implementing such a system might impact reporters, subjects of reports and administrators on Wikimedia projects.

Is the current state of the reporting system truly transparent? Who currently makes use of publicly available reporting information?

Publicly visible reports are the current standard of our reporting system, and this is usually justified in the name of transparency. However, we have conducted minimal research into whether such visibility actually means the system itself is transparent, or into what the community means by “transparency”. We should investigate how visibility and transparency relate in Wikimedia reporting systems.

A second key question is to figure out who this current state of visibility best serves. As this Targets of Harassment project indicates, would-be reporters are badly served by this publicly visible system. Therefore, we ought to figure out who benefits, if anyone, from this system, and how they make use of this publicly visible information.

How are appeals currently handled, and does this system place realistic demands on people seeking appeals?

While we have conducted research on many aspects of enforcement and reporting on Wikimedia projects, we have yet to pay close attention to our unban and appeals process. Investigating these parts of our enforcement process would also be a logical extension of research into enforcement overall.

Notes

  1. n.b. There are strong reasons to want to obscure the details of how administrator actions function. In this case, clarity should not override operational security concerns. Clarity may be achieved by providing an outline of actions rather than full details.
  2. In particular, refer to Reaching the Zone of Online Collaboration.

Further reading

For a full bibliography of prior research on harassment and reporting on Wikimedia projects, directed by the WMF, please see below. Unlinked documents marked “internal report” are currently limited to WMF staff and contractors, and may be available on individual request. All linked documents are in English.

Lee, Han A., and Crupi, Joseph. Reaching the Zone of Online Collaboration: Recommendations on the Development of Anti-Harassment Tools and Behavioral Dispute Resolution Systems for Wikimedia. Harvard Negotiation and Mediation Clinical Program, 12 Dec. 2017, p. 50.

Lo, Claudia. Reporting System Rubrics: A Comparison of Peer-Dependent Reporting Systems. Wikimedia Foundation, Feb. 2019, p. 62.

Lo, Claudia. “Take It to AN/I”: Summary of Existing Research about Reporting Systems. Internal report, Wikimedia Foundation, Nov. 2018, p. 12.

Poore, Sydney. AN/I Survey Summary. Internal report, Wikimedia Foundation, 22 Mar. 2018.

“Wikipedia:Community Health Initiative on English Wikipedia/Administrator Confidence Survey/Results.” Wikipedia, 28 Nov. 2017.

Raish, Michael. Admin Confidence Survey 2019 Preliminary Results. Internal report, Wikimedia Foundation, 20 Aug. 2019, p. 50.

Raish, Michael. Identifying and Classifying Harassment in Arabic Wikipedia: A “Netnography.” Internal report, Wikimedia Foundation, 21 Dec. 2018, p. 23.

Support & Safety Team. Harassment Survey 2015 Results Report.

“User Reporting System/Wikimania 2018 Notes – Meta.” Meta-Wiki, 20 July 2018.