
Talk:Community health initiative/User reporting system consultation 2019

Thoughts on the consultation structure

I personally think that the process will lead to a well-rounded corpus of feedback and information that will help our team design and build the best-possible reporting system for Wikimedia wikis! I'm especially looking forward to the focus groups who will be able to dive deep into certain topics. — Trevor Bolliger, WMF Product Manager 🗨 21:56, 19 February 2019 (UTC)

The issue with enforcing existing rules

Hi, I just wanted to leave a short note to (again) thank you for looking into the issue of harassment. From my point of view, the main issue in my community (German-language Wikipedia) seems to be that existing rules against harassment are simply not enforced because (1) the admins fear backlash from toxic users, and because (2) admins tend to be users who have a very 'laissez-faire' attitude towards harassment in the first place. I am aware that this project may not address these issues directly, but maybe there's room to do so further down the road. Kind regards, --Gnom (talk) Let's make Wikipedia green! 16:55, 10 March 2019 (UTC)

Hello Gnom, thank you for sharing your views. From my perspective (which is informed by my own observations, reading the comments from other contributors for over a decade, and the research collected by the WMF Community health initiative over the past 3 or 4 years), the situation as you describe it is made worse by not having a well-functioning "user reporting system."
In the current system, it takes a whole lot of effort and insider knowledge to effectively present a case. The lack of tracking and the way that cases are archived mean that a report is often viewed as a one-off event or a new situation instead of a continuing pattern of problematic conduct. A system with:
1) a better routing system
2) a user friendly form that walks you through how to make a high quality report
3) clear paths for escalation
4) a method for tracking
would go a long way towards putting a good report in the appropriate channel for action to be taken.
Every day many volunteer administrators, stewards, other functionaries and experienced users protect the wikis from a steady stream of abuse. The system works well in many cases and that needs to be recognized and retained. But improved tools and workflows from a better user reporting system will make it easier for a target to file a report. And it can open the door for more people to help moderate issues, and perhaps bring fresh opinions and approaches, too.
In the next week, I will be working with WMF Anti-Harassment Tools team staff to bring some preliminary ideas for products into the discussion. The success of this project depends on hearing the voices of users from many wikis who are looking at it from a variety of perspectives. I hope that you will stay around to comment and invite other Wikimedians to join the discussion. Cheers, SPoore (WMF) Strategist, Community health initiative (talk) 15:35, 3 April 2019 (UTC)

Formal rules and harassment

I wonder if this will spiral into a dispute over whether harassment should be about following the rules and some very strict reporting system, or about what users feel is harassment. Strict rules could be nice for admins, but they tend to be interpreted by the letter and will not give any real guidance for cases where users methodically push limits over a long time.

One thing I have seen in some cases is conflicts between users where one of the users has backing from a group of users. In those cases the user from the larger group tends to win out, even if the case as such should be rather straightforward. It does not happen very often, but it does happen and it is very detrimental for the community. Usually conflicts in the community are handled by admins, but I wonder if this is wrong. Admins claim they are objective, but my opinion is that they are highly biased. Perhaps we could use an anonymous and random jury, where each member just votes on the outcome. The random jury should be picked from users that have no tendency to appear together with the users involved in the dispute.

A technical solution could be to wrap a conflict/dispute in special tags, which append a short statement about not making any follow-up, and a judgement from the jury about the current verdict. Such a tag could be added by anyone, but then only removed by admins, unless the tags are empty. When they are added the content should only be removed by the persons involved, and only by undoing their own contributions. When a conflict is marked it should not be possible to add new content inside the tags. That makes it possible to backtrack out of the conflict for the involved persons, and stop further escalation. Note that voting on the outcome must either be on content as it is when the tags were inserted, or for each revision intersecting with the content. I believe the former would be the correct one. — Jeblad 18:47, 2 April 2019 (UTC)

The proposed technical solution will break down in a number of cases, not sure if there are any easy workarounds.
The core problem is that harassment can be ongoing for some time, and it can be interspersed with other discussions, and even good dialog with the harassers. So there are a number of possibly bad interactions, which may go either way. A case could then comprise a lot of contributions. — Jeblad 00:48, 3 April 2019 (UTC)
Hello Jeblad, I agree that this consultation will need to consider who handles the different types of reports of harassment. Last year during a session at Wikimania, the Wikimedia Foundation Anti-Harassment Tools team led a round table discussion called 'Building a better harassment reporting system' to determine the various ways that reports about harassment and abuse are taken by volunteers acting in different roles. While to my knowledge no one mentioned a random jury system, there does seem to be a wide variety of ways that "cases" get reported and actioned. I'm interested in learning from more people on more wikis about which methods of handling might be more open for abuse such as you described? SPoore (WMF) Strategist, Community health initiative (talk) 00:54, 3 April 2019 (UTC)
As I replied to my own post (you get really high quality discussions when you are replying to your own posts) the proposed solution will break down. The problem is in how to mark the troublesome discussion, not a random jury system. A random jury system sounds kind of cool but is nothing more than a random selection from known users. It is possible to pick users from a more coherent group, but that has a risk of getting a biased group.
What you probably want is a system where it is easy to report harassment, but also very easy to verify whether the report has any merit. It is extremely common to find users claiming harassment, when they are in fact being asked to follow established rules. That said, it is also very easy to find admins that overstep rules when someone points out obvious mistakes. You can't take any of them at face value; there must be some way to verify whether the claims have any merit. — Jeblad 01:26, 3 April 2019 (UTC)
As a minor note, I have a vague recollection that a random jury system was proposed at a session in Montréal, perhaps [this one]. It was not an in-depth discussion, but it stuck in my mind because I thought it was something worth considering. Off the top of my head, a truly random jury composed of all potential editors won't work—it needs to be a subset of all editors with some form of vetting if nothing more than minimum contributions. One obvious challenge with a jury approach is the potential need for confidentiality.--Sphilbrick (talk) 21:52, 17 June 2019 (UTC)

Thank-like reporting

We have an open system of contributions, with users that are highly opinionated, and how do we stop harassment without starting a blame game? We have a highly volatile community, and somehow we must avoid escalating conflicts.

What if we could report harassment the same way we thank a user for a contribution, but without the public thank, and without the reported person seeing who the reporter is? As long as only one person reports the user, nothing more happens, but at some level a warning goes off and a special user group gets a notification. The same could happen if a user gets several reports over a given time or number of contributions. The reports will still not be visible to ordinary users, only to the specific group assigned the task of handling reports, and possibly to the reported user.

Because the reported users will be warned they will probably behave more responsibly, and because they don't know who reported them they must behave responsibly towards all. This will probably lead to individual self justice. If only logged in autoconfirmed users can report users the spam/troll problem should be fairly small. If the log entries can be deleted, then cleanup can be done if someone goes ballistic.

There are probably several types of reports, like harassment, threats, improper behavior, copyvio, and even excessive use of rollback in edit wars. Perhaps they can be lumped together in a single type of "report", like "thank" is now a single type. — Jeblad 19:35, 3 April 2019 (UTC)
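As a rough illustration of the threshold idea described in this comment, here is a minimal Python sketch. It is not an existing MediaWiki feature: the class name `ReportLog`, the threshold of three distinct reporters, and the one-week window are all assumptions chosen for the example. Counting distinct reporters rather than raw reports is one possible way to keep a single user from triggering the warning alone.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ReportLog:
    """Private log of reports; all numbers here are illustrative assumptions."""
    distinct_reporter_threshold: int = 3      # hypothetical value
    window_seconds: int = 7 * 24 * 3600       # hypothetical one-week window
    _reports: dict = field(default_factory=lambda: defaultdict(list))

    def report(self, reporter: str, reported: str, timestamp: int) -> bool:
        """Record a report and return True if the handling group should be notified.

        The reporter's name is stored only in this private log; nothing is
        shown to the reported user or to ordinary users.
        """
        if reporter == reported:
            return False  # ignore self-reports
        self._reports[reported].append((timestamp, reporter))
        # Collect the distinct reporters whose reports fall inside the window.
        recent_reporters = {
            who
            for (when, who) in self._reports[reported]
            if timestamp - when <= self.window_seconds
        }
        return len(recent_reporters) >= self.distinct_reporter_threshold
```

With these assumed defaults, repeated reports from one account never cross the threshold; only reports from three different accounts within a week would notify the handling group.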

I think this approach has some merit and is worth exploring. I can see some potential for abuse which might be addressed with a throttle but it's worth considering.--Sphilbrick (talk) 21:57, 17 June 2019 (UTC)

Thank-like flagging

Here's an idea, inspired by Jeblad's brilliant suggestion: give people* the ability to flag diffs as a first step. They can then review their own flags and use those to prepare a formal report or a request for action (using whatever processes are already available or are added from this initiative).

Additionally, privileged actors (admins, crats, stewards) would be able to see an overview of all flags for whole-of-wiki or cross-wiki as appropriate. This would allow them to proactively take preventative action, issue warnings, or even open formal cases themselves (depending on established rules or community norms). Short of that, they could become aware of a problem brewing before it blows up? Or the data could be used for analysis of how effectively various anti-harassment programs or administrative avenues are being used? Some kinds of alerting, filtering, and searching systems would be needed to deal with the volume of flagged diffs. Privacy and safety would have to be considered before any public action (“Hey, why did you flag all my edits!? That's it, I'm gonna … !”). Perhaps an administrative direct-message system is needed (on the record but not public).

Many other platforms have the option to "Report" a specific user post, and presumably action is only taken once some threshold is reached. Harassment is serious because it's an ongoing pattern of targeted behaviour: a single case of incivility is not harassment and is not usually actionable (except for serious threats of harm). So you need multiple "reports" or "flags" to establish harassment. (I'm using the term "flag" here to distinguish that there would be further "reporting" processes that may follow.)

Wikiconflicts are often "he-said, she-said" spirals of claim and counter-claim. “Soandso's being mean to me (long list of diffs)!” “Yeah? Well I was provoked and that person's cherry-picking to present a false picture (list of more diffs)!” For an uninvolved party to step in and assist, it's usually up to the aggrieved parties to collect the supporting evidence (at least an initial set) themselves.

Sure, you could make your own user-page with a list of links to diffs, but a) it's not private (a deal-breaker when harassment or stalking is involved), and b) it's not easy (you need to copy-paste the revision ID and know the right syntax). An off-wiki list achieves (a) but not (b).

Flagging could be a simple yes/no, or it could involve selecting categories. “Flag this edit for: []uncivil, []abusive, []vandalism, []interpersonal conflict, …” (select best one or tick all that apply, dependent on how the categories overlap). An optional description box could also help.

With the right categories, a person could even flag their own edit: for example if I post on somebody's talk page ”please stop behaviour x, it's bordering on harassment/stalking”, I might flag that as "interpersonal conflict" or "attempted conflict resolution" to include it in the bigger picture of ongoing interactions.

I think it's important that flags have the two-prong effect (user review of own flags and wider use of flags for analysis or administration). Otherwise they are just private bookmarks. While you're at it, you could add bookmarking of diffs as a separate feature using the same programmatic infrastructure. Think of a building block that can have multiple uses beyond just supporting anti-harassment processes.

Pelagic (talk) 21:50, 8 July 2019 (UTC)

*For "people", read "users"/"customers"/"volunteers"/or however you like to think of them. Pelagic (talk) 21:50, 8 July 2019 (UTC)
The “categories” should have opposites, like uncivil–civil (or perhaps uncivil–polite). People tend to describe abusive behavior in different ways, but if the categories have a negative–positive scale then it can be summarized anyhow.
Logging a category (or reason) should not be a problem, but it should not be a free text field, as that would have to be patrolled and would thus increase the workload. Nor should there be too many categories, as that makes it too cumbersome to use. — Jeblad 10:47, 9 July 2019 (UTC)
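To make the flagging idea in this section concrete, here is a small Python sketch of a flag record with fixed categories on a negative–positive scale. Everything in it is hypothetical: the `Flag` record, the `SCALES` table, and the label pairs (only uncivil–civil comes from the discussion; the second pair is invented for illustration). It shows the two uses discussed above: a user reviewing their own flags, and an aggregate summary for privileged reviewers.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical scales: each pairs a negative label (-1) with its opposite (+1).
SCALES = {
    "civility": {"uncivil": -1, "civil": 1},
    "conduct": {"abusive": -1, "constructive": 1},
}


@dataclass(frozen=True)
class Flag:
    flagger: str       # who placed the flag (kept private)
    revision_id: int   # the flagged diff
    scale: str         # key into SCALES
    label: str         # one of the two labels on that scale


def own_flags(flags, user):
    """Flags a user placed, e.g. to prepare a formal report later."""
    return [f for f in flags if f.flagger == user]


def summarize(flags):
    """Mean score per (revision, scale), from -1.0 to +1.0."""
    scores = defaultdict(list)
    for f in flags:
        scores[(f.revision_id, f.scale)].append(SCALES[f.scale][f.label])
    return {key: sum(vals) / len(vals) for key, vals in scores.items()}
```

Because the labels come from a fixed table rather than free text, the summary needs no patrolling, which matches the workload concern raised above.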

Like/dislike and “minority-/majority-report”

It is worth noting that a thank-like reporting system, that is, a like/dislike reporting system, could be gamed. Typically this is done by a user asking friends to click like or dislike, thus gaming the system in some direction. This can be counteracted by checking the covariance with previous votes, building a kind of confidence into this specific vote, and weighting the vote accordingly. A user given a low confidence will not skew the vote very much, and could end up destroying their own confidence.

A slight variation could be to only use some randomly chosen but highly regarded users as “meta critics” for a short time. They will then form a “gold standard” for other users to be compared against. Comparing a user's like/dislike votes to those users will give a better measure for confidence, and make it harder to game the system.

This will create a majority view and a minority view, which together with like and dislike create four alternative outcomes. If the majority view and the minority view disagree, there might be a case where an “untouchable” abuses an easy target, and such cases should be reviewed carefully. It might also be a case where someone tries to game the system. If meta critics are chosen carefully there should be no problem with “untouchables”, and it seems like they should be chosen from less-to-medium active users. A kind of radical solution could be to ask readers at random how to rate comments. — Jeblad 10:14, 9 July 2019 (UTC)
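The confidence weighting described in this comment could be sketched roughly as follows. This is a simplified stand-in, not the covariance model itself: it uses the plain agreement rate with the “meta critics” as the confidence value. The function names and the neutral prior of 0.5 for users with no overlapping votes are assumptions made for the example.

```python
def confidence(user_votes, gold_votes, prior=0.5):
    """Agreement rate with the gold-standard votes, in the range 0.0 to 1.0.

    Both arguments map an item id to +1 (like) or -1 (dislike).
    `prior` is a hypothetical neutral value for users with no shared items.
    """
    shared = [item for item in user_votes if item in gold_votes]
    if not shared:
        return prior
    agreeing = sum(1 for item in shared if user_votes[item] == gold_votes[item])
    return agreeing / len(shared)


def weighted_score(votes_on_item, confidences, prior=0.5):
    """Confidence-weighted mean of the +1/-1 votes on one item.

    Users with low confidence barely move the result, so a group of
    accounts voting against the gold standard has little influence.
    """
    total_weight = sum(confidences.get(user, prior) for user in votes_on_item)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        vote * confidences.get(user, prior)
        for user, vote in votes_on_item.items()
    )
    return weighted_sum / total_weight
```

A large gap between this weighted score and the unweighted majority would be one possible signal for the careful review mentioned above.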

Note that this leads to a reputation-system, like they use at Slashdot, but with an M1-only moderation. The M2-moderation is done with confidence modelling. For a discussion of alternative approaches, see Jøsang, Audun; Roslan Ismail; Colin Boyd; A Survey of Trust and Reputation Systems for Online Service Provision. Decision Support Systems 43, no. 2 (March 2007): 618–44. (It is under “9.4 discussion fora”.) — Jeblad 21:49, 9 July 2019 (UTC)
The bigger problem with a like/dislike system is that it treats each opinion as equal even though they usually aren't. Jo-Jo Eumerus (talk, contributions) 08:29, 10 July 2019 (UTC)
That can be a problem when you try to model a single scale and have too few respondents, but there are alternate solutions to this. I would say go for the simpler models, as long as they are sufficient. — Jeblad 20:31, 16 August 2019 (UTC)

Approach to decision making

As noted on the project page, there may be difficult decisions to make about which software features to prioritize and how to allocate resources toward other aspects of a user reporting system. When it comes time to make a decision about which features to build, options will be weighed by the following criteria:

  • Which option(s) most aligns with Wikimedia movement values?
  • Which option(s) is most in alignment with Strategic Direction of Knowledge Equity?
  • Which option(s) most aligns with the goal "to build a new harassment reporting system that produces higher quality reports that can be successfully processed and does not further alienate victims of harassment."
  • Which option will result in more accessible user experience, for anyone on any device?
  • Which option will result in a more sustainable product that will be resilient to changing technologies, evolving use cases, and user expectations?
  • Which option(s) do not introduce undue risk for achieving our project goals?

Will these criteria lead to the best decisions about which products to prioritize? Is the meaning of them clear and spelled out in a way that is understandable? SPoore (WMF) Strategist, Community health initiative (talk) 00:22, 3 April 2019 (UTC)

The apparent conflation of "reporting systems" and harassment

I read over the summary and also looked closely at the flow chart. The first thing that struck me is that the overwhelming majority of the reporting systems identified, and the overwhelming majority of reports made by both the formal and informal systems, have no relation to harassment in even its loosest definition. It is very unclear to me why they are even included in the "harassment" rubric (except possibly to say that they are obviously not designed for nor intended to address harassment). There is also a pretty apparent conflation of any type of inappropriate user behaviour with harassment.

I find this just plain wrong.

I do not in any way deny that harassment has occurred on and in relation to participation in Wikimedia projects; indeed, I've been at the receiving end of various types of harassment and discriminatory behaviour on a number of occasions, as recently as last month. Given the various roles I do hold or have held on the project, I have seen some undoubtedly harassing and/or discriminatory behaviour on more occasions than I can count; I've also done my best to help users who have unintentionally opened themselves up to potential harassment (most commonly from off-wiki parties). I *do* understand the problem. I just don't think that conflating every type of problem and every type of problem resolution system with this particular, real, serious problem will lead to useful, community supported outcomes. Comparing what happens at the 3-revert noticeboard with harassment management just doesn't cut it, and it's kind of embarrassing that you paid a lot of money for that report. Risker (talk) 21:33, 4 April 2019 (UTC)

Hello Risker,
Thank you for reading over the materials and leaving feedback. I’m sorry to hear that you have recently experienced a serious episode of harassment. Let me know if there are any additional actions that the Foundation’s Trust & Safety team can take to assist you with dealing with the situation.
As well, I appreciate the work that you do to help less experienced contributors mitigate harassment. The Wikimedia movement depends on volunteers like you to help our communities deal with harassment. The User reporting system project will need to draw on the experience of functionaries and members of community governance groups – including stewards, admins, checkusers, oversighters, and Arbitration Committees – in order to make well informed product decisions. As a current and former member of several of these groups, I value your insight.
As you know, Wikimedia wikis differ from most websites because most user dispute reports are handled by volunteers instead of being channeled directly to an in-house team of employees as is the more common way. The Community health initiative and Trust and Safety team have undertaken several internal and external research projects to learn how Trust and Safety type issues are managed on the Foundation platform (wikis) and to identify potential areas for improvement. As you note, the results are available for review on the User reporting system consultation page. We’ve made them available to the Wikimedia movement to inform decision making by all stakeholders for the user reporting system project and beyond. WMF Researcher Claudia Lo is the key contact for this work.
From the current research and preliminary community consultations I anticipate that routing and escalation paths for the reporting system will be key considerations when defining the scope of the features to build. Related to this, I understand you to be saying that combining different types/levels of user conduct reports in the user reporting system could diminish the effectiveness of the system. (I hope that is not too much of an oversimplification.) This is an important consideration and will need to be evaluated along with 1) ease of access to the system for new users, 2) the potential to overwhelm volunteers with frivolous or abusive reports, 3) better tracking and archiving of reports to more effectively identify long term abuse.
In the next few weeks, after notifications of the consultation have gone out across the broader global movement, preliminary ideas for products will be added to the consultation. I'll ping you then and hope that you will return to offer your thoughts about the more concrete ideas. SPoore (WMF) Strategist, Community health initiative (talk) 21:52, 5 April 2019 (UTC)

ArbCom as a private reporting system

On enwiki, currently, the Arbitration Committee is the only body that can take reports related to off-wiki information or other things too private to be posted on-wiki. We routinely do take those reports, and in many cases, act upon them. Given the call for more private reporting, it was odd this wasn't a focus of the reporting system summary. ~ Rob13Talk 15:58, 7 April 2019 (UTC)

Hi Rob13, thank you for pointing out the significant role that the Arbitration Committee on English Wikipedia has in managing cases of off-wiki harassment or other sensitive information. The work of a dedicated body like the Arbitration Committee, designated to receive such information, might be enhanced if there were clearer routing to it. This is something we'll want to discuss further when the essential components of the reporting system are discussed in a few weeks. SPoore (WMF) Strategist, Community health initiative (talk) 16:21, 8 April 2019 (UTC)

"Formal private reporting systems" and ArbCom

I have two questions about the enwiki reporting system report, from my perspective as a member of enwiki's Arbitration Committee (ArbCom):

  1. How was it decided that emails to ArbCom are a "very severe" and "hard to find" reporting system? Users are advised to contact ArbCom on several relevant help pages, such as en:Wikipedia:Harassment#Dealing with harassment and en:Wikipedia:Dispute resolution#Sensitive issues and functionary actions.
  2. The major recommendation of the report appears to be the creation of "new formal private reporting systems". How will these be made compatible with the longstanding consensus on enwiki that "matters unsuitable for public discussion" should be referred to ArbCom?

Thanks. @SPoore (WMF) and CLo (WMF): Joe Roe (talk) 16:23, 7 April 2019 (UTC)

Hello Joe Roe, thank you for your questions. I want to assure you that no decisions have been made that are incompatible with pre-existing governance on local wikis. Syncing with Arbitration Committees on English Wikipedia and other Foundation wikis is essential to the success of any user reporting system. We'll continue to reach out to ArbComs as the consultation progresses to ensure that they participate at crucial times in discussions. @CLo (WMF): is starting a project to better understand the expectations of new(er) users when they want to make a report. She can elaborate more on it and also answer in more detail about the characterization of emails to ArbCom. SPoore (WMF) Strategist, Community health initiative (talk) 16:40, 8 April 2019 (UTC)
Thanks SPoore (WMF). I hope CLo (WMF) will be able to answer my questions more directly. It is disappointing to see that the development of a new "private reporting system" was announced to the New York Times before ArbCom, the body responsible for private reports on the English Wikipedia, was consulted about it. I can't help but feel this undermines your assurance that the new system will respect the existing community consensus on enwiki. Joe Roe (talk) 19:59, 8 April 2019 (UTC)
Hello Joe Roe, I'll go over your two points in order:
  1. From my own study of policy pages, looking at answers on the Village Pump as well as the Teahouse, and other places where we could reasonably expect a newer editor to go in search of help, I found that ArbCom is relatively hard to find. Additionally, there seems to be a fair bit of social weight to opening up an ArbCom case, and it's not presented as something to take lightly. On top of all of this, since reports for other common disputes such as vandalism are almost entirely public processes, switching over to a very closed reporting method such as ArbCom represents a significant shift in how one might be expected to make a report. We occasionally see things like users not realizing that certain requests for administrator action, such as asking for the removal of sensitive information, should be done privately via email despite this being stated at the top of the same noticeboards they're posting to, and I did not think that we should expect this to necessarily be different for harassment cases.
  2. The definition of "formal private reporting system" I was working with for that particular report doesn't mean a new closed system of reports. Rather, we know from talking to administrators that there's already an informal way to report disagreements privately, and that's emailing or PMing administrators directly. Of course, for this to happen, the user making these informal reports has to already be familiar with the administrators and be comfortable enough with the community to message them individually, which is a significant barrier for newcomers. A "formal" private reporting system–that is, one clearly accessible, visible, and purpose-made to receive and direct reports to the appropriate parties, could be a useful tool for tackling issues of harassment and user misconduct.
I hope that this addresses your concerns. Thank you for your questions!—User:CLo (WMF) (talk) 21:34, 8 April 2019 (UTC)
I have to credit that this is one of the more remarkable pieces of magic I've seen. Wikipedia used to have an autonomous community with WMF watching out for legal threats we couldn't handle ourselves. Now "autonomous" has been replaced by "informal" -- we now have an informal community, waiting to be upgraded and replaced by formal administration! Not since some bankers invented the concept of "identity theft" to transform their losses to fraudsters into a supposed failing of whoever had his name written on the phony form, who should ever after be vigilant lest some banker be defrauded, have I seen a magic to rival this. Wnt (talk) 10:47, 17 June 2019 (UTC) I have changed my mind on this entire issue; see below. Wnt (talk) 15:30, 1 July 2019 (UTC)

Dealing with harassment leads to further harassment

I think that the problem in the headline needs to be addressed in any new reporting system we are working on. This happens on two levels:

  • A user publicly complains that they are harassed. Other community members accuse them of being harmful and of seeking revenge for harassment. This goes along the lines: "This user published my personal data to harass me [evidence]. Please block them" - "You are not a nice person. You should not make requests like that just because you are looking for revenge. Go back to writing articles instead of trying to block useful editors"
  • An administrator reacts to a harassment report by warning or blocking a user. Other community members accuse them of being too strict, as the community will lose some contributions because of that block. This goes something like this: "User A has again violated [rule] and harassed user B despite multiple warnings, they are blocked for a long period" - "You admins hate user A because they are telling the truth about you admins and your friends. You are again harassing them with all these groundless blocks. Just leave them alone and let them contribute, as you admins write bad articles while this user writes good ones."

Both of these patterns mean that (a) users are less willing to report harassment, as they are afraid of being harassed for reporting legitimate cases of harassment, (b) admins are less willing to react to harassment reports as they risk being harassed themselves. On the other side, reporting cannot be completely private either for transparency reasons: our communities have public block logs and warnings on talk pages.

I can think of some kind of ticket-like system like Phabricator, with tickets used to report harassment that are both public (i.e. visible to anyone, or at least to registered users) and restricted (i.e. cannot be edited by anyone except administrators and ticket author). This is an early idea, and I do not insist on any specific setup — NickK (talk) 03:07, 8 April 2019 (UTC)
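A minimal Python sketch of the "public but restricted" ticket idea might look like the following. It does not use Phabricator's actual API or policy model; the `Ticket` class, the function names, and the way administrators are passed in as a set are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    author: str
    description: str
    comments: list = field(default_factory=list)


def can_view(ticket, user):
    """Public: anyone may read the ticket (one option is registered users only)."""
    return True


def can_edit(ticket, user, admins):
    """Restricted: only the ticket author and administrators may write to it."""
    return user == ticket.author or user in admins


def add_comment(ticket, user, text, admins):
    """Append a comment, refusing users who are neither author nor admin."""
    if not can_edit(ticket, user, admins):
        raise PermissionError(f"{user} may not edit this ticket")
    ticket.comments.append((user, text))
```

Keeping read access open while restricting write access is what would prevent uninvolved users from piling onto a report while it stays visible for transparency.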

Hello NickK, thank you for providing this input with good examples, and also a suggested solution. Please continue to think about this and share more details as they come to you. SPoore (WMF) Strategist, Community health initiative (talk) 20:17, 8 April 2019 (UTC)
This appears to be true in the Fram case under the WMF-run system, more than in the usual harassment cases on Wikipedia I think. I don't actually know that the specific person harassed was the one who filed the winning complaint. Bear in mind also that the dialog above sounds forced because it is -- usually there is some underlying issue, a disagreement of philosophy that leads to many editors honestly believing that the accused did not commit any kind of wikicrime. The harassment policy and enforcement on Wikipedia is very vague, which is what fuels this kind of problem, but having these things entirely secret is guaranteed to make it worse. Wnt (talk) 12:12, 17 June 2019 (UTC)

Tools are only a portion of the solution

Hi all, I am glad to see this focus on harassment is happening. I am encouraged to see that it is supported by research.

I am, however, concerned we are failing to address culture in these plans. Let me explain. First, I am concerned the solutions developed are addressing harassment once it has occurred. I do not see a focus on prevention. Second, many people are posting about cultural and behavioral concerns, but I do not feel the response is one of open listening. Third, while I love what you all are doing, we need solutions for the people who are experiencing these situations. I know you all mention “better tools” was a response from the Community Engagement survey, but the WMF staff made “better tools” an optional answer to the survey questions. The community response wasn’t an unbiased response. It was an optional answer to a question, both designed by the WMF. For example, if people had to rank items, and that was on the list of limited options, yes, it will get chosen. Perhaps if you had “Improve policies and culture around bad behavior” that would have received a lot of response too. I think reflecting on the bias and the balance of the information used to support this project, as well as the feedback on this discussion page, will help guide further advances from the Anti-Harassment team.

Again, I am delighted you are developing software solutions, but I am wondering how you are addressing the culture? Others on this discussion page have brought this up before me, but I feel the response was not one of listening, but of defense of why this project is a valid solution, and not addressing the gaps this project does not fill. This is why I am reframing the question about culture again here, so hopefully it can be addressed. Thank you for your response to this. Best, Jackiekoerner (talk) 15:48, 8 April 2019 (UTC)

Hello Jackiekoerner, I'm glad that you brought this issue to my attention. It is easy to get tunnel vision when working on a project and look at a topic too narrowly. It is true that I was replying almost entirely in a way that redirected the comments back toward the reporting system. I definitely could have done a better job responding to questions and comments about the culture of the Wikimedia movement and the Foundation's past and future work to make a safer environment.
I want to assure you that the Community health initiative and Trust & Safety are focused on preventative and supportive measures, too. Past work includes Training modules about Online Harassment, Friendly Space Policy enforcement at Events, educational pamphlets for event organizers about safety at events, and supporting local wikis as they create policy about user conduct.
Going forward, the user reporting system project is just one aspect of the Wikimedia Foundation's plans to grow a thriving community. The Foundation's Medium-term plan 2019 lists "Thriving Movement" as a priority. In particular, Outcome 6 and Metric 7 are indicative of the focus on addressing cultural issues, such as a universal code of conduct.
While I agree that it is important to not lose sight of the broader issues, it is also important to focus on the work at hand. As the User reporting system consultation moves forward and most of the discussion on the page will be with the Product Manager about software features, I can direct the ideas and comments about prevention and support to other pages where we will be discussing policy, support, and training.
But I hope that contributors with a broad range of experience, including targets of harassment, will stay engaged with the User reporting system project because it is essential to learn from them and their allies in order to build a product that meets their needs. Again, thank you for the nudge towards discussing the broader topic and the work that you are doing with the Working Group. SPoore (WMF) Strategist, Community health initiative (talk) 22:57, 8 April 2019 (UTC)

global vs. local

Hi. Please include in your research the fact that reports can affect users who are either active on just one wiki or acting on a couple of wikis. Reports should reach the right bunch of people who have the tools to deal with this. Best, —DerHexer (Talk) 14:52, 16 April 2019 (UTC)

Thank you, DerHexer, that is indeed an important factor. There are several aspects to this point that need to be delineated. I'm going to add several that I think of immediately, and would appreciate you and others improving them or adding in others. SPoore (WMF) Strategist, Community health initiative (talk) 16:04, 17 April 2019 (UTC)
  1. There are cross-wiki workflows done by people who have specialized tools that work globally (on all wikis). These people have access to particular user rights (tech) and belong to a particular group (social). To have action taken through these workflows, a group needs to see the report or otherwise get a request for action from another person or group with the authority to act.
  2. There are workflows that happen on local wikis that might or might not be the same on other local wikis. The tools on local wikis are generally the same as other wikis but some local types of customization might be in place.
  3. There might be different workflows on different local wikis.
  4. There might be different policies on different local wikis. These policies might identify specific people or groups who have the responsibility to take reports. These same people or groups may or may not have the authority, responsibility or access to tools to take action on reports.
@CLo (WMF):, you did some work related to crosswiki workflows for the stewards. It would be good to learn how that work influences our thinking about the workflows for the User reporting system as it connects to the larger wikimedia ecosystem. SPoore (WMF) Strategist, Community health initiative (talk)
Hello DerHexer, that's definitely an issue we're thinking about. We don't have a complete report (of the kind that Sydney has linked on the main page) on these cross-wiki workflows, but we are aware that they exist and are critical to the way projects are run. For example, we're aware that steward workflows and the work of the small wiki monitoring team are highly dependent on what I'd call cross-wiki reports. To elaborate, I interpret this as the process where one user (or several) notices suspicious behaviour on a specific wiki, raises the issue with local admins, and this process is repeated over time across several wikis; those admins or users then escalate to stewards or another appropriate cross-wiki group to handle the issue. It's a complicated topic, to put it mildly. I think that the prompts that have been raised are pretty much in line with what I am considering, but just to throw a few more questions into the ring:
  • How do we handle different permissions across wikis when deciding on the chain of routing for a report?
  • How do asynchronous communications attached to these hypothetical reports need to account for this complex "handing-off"?
...and I am certain there are more discussions to be had on this subject. —User:CLo (WMF) (talk) 22:27, 17 April 2019 (UTC)
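The routing question raised above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a description of any existing or planned WMF tool: the `Report` class, the `LOCAL_HANDLERS` table, the `GLOBAL_HANDLER` name, and the rule "more than one wiki means a cross-wiki group" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    reported_user: str
    wikis: tuple  # wikis where the reported behaviour was observed


# Hypothetical table of wikis that have their own group able to act locally.
LOCAL_HANDLERS = {
    "dewiki": "dewiki administrators",
    "enwiki": "enwiki administrators",
}
# Hypothetical cross-wiki group with global tools.
GLOBAL_HANDLER = "stewards"


def route(report):
    """Return the ordered chain of groups that should see the report."""
    distinct_wikis = set(report.wikis)
    if not distinct_wikis:
        raise ValueError("a report must name at least one wiki")
    if len(distinct_wikis) > 1:
        # Behaviour spans several wikis: send it straight to the global group.
        return [GLOBAL_HANDLER]
    (wiki,) = distinct_wikis
    local = LOCAL_HANDLERS.get(wiki)
    if local is None:
        # No local group with the needed tools: fall back to the global group.
        return [GLOBAL_HANDLER]
    # Local group first, with the global group as the escalation step.
    return [local, GLOBAL_HANDLER]
```

A real system would also have to account for differing local policies and permissions, which this sketch deliberately ignores.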

Need for change of attitude towards those reporting harassment

I've just read the blog post about this consultation, and I'm concerned by the attitudes to people reporting harassment and the lack of understanding of this kind of situation that it reveals. First of all, people being harassed are told they should confront the harasser directly on their talk page and ask them to stop. This is extremely unlikely to be productive; harassment is not, in general, accidental behaviour which harassers will stop when asked to do so. Engaging with a harasser is more likely to lead to increased harassment, and should not be the recommended first step.

Second, people reporting harassment are told they should be further engaging with their harassers by notifying them on their talk page if they start a report on the ANI. Again, this should not be a part of the process. If the situation is such that the alleged harasser is deemed to have a right of reply, this should be sought by admins, not by the person who has reported them. Cases where an editor is clearly targeting another editor, whether by leaving offensive comments on their talk page or through strategically reverting their edits for spurious reasons/flagging their pages for deletion/etc, should not be adjudicated via a public discussion which anyone, including the harasser, can join in. There is clearly a much greater need for easy, private ways of reporting, which can then be taken public if there is a need for further discussion or to publish the results of the admins' investigation. Regarding harassment situations as parallel to innocuous disputes between editors demonstrates a fundamental misunderstanding of how harassers operate, particularly how they use community rules to create plausible deniability that their actions are 'really' harassment. Whatever solution(s) are developed, they need to prioritise protecting victims of harassment from further harassment and retaliation.

Eritha (talk) 15:53, 17 May 2019 (UTC)

Update

Hello,

A quick update to note that there are adjustments being made to the timeline for designing and developing the User reporting system. The plans will be updated in the next few weeks. SPoore (WMF) Strategist, Community health initiative (talk) 18:39, 13 June 2019 (UTC)

How do you stop systems from being abused?

There is the potential for anti-harassment systems to be abused through misleading or false reports - particularly if they are dealt with secretly without input from the accused or the community. Considering how many attempts are made to use existing editor behavior systems such as ANI or Arbcom to win content disputes, this is a real danger. Nigel Ish (talk) 16:57, 15 June 2019 (UTC)

  • The answer, of course, is they don't. Due process doesn't seem to be a concern of any importance here. The whole process is a farce; WMF will do whatever they want to do, regardless of community response. CoffeeCrumbs (talk) 09:58, 17 June 2019 (UTC)

How to handle privacy vs "fair trial"

@SPoore (WMF): Will this be an "absolute privacy" set-up?

While T&S can investigate any claims, it's impossible for them to know some information unless they ask the accused editor on specific points. However, in the major recent example, the WMF has indicated that they can't ask about a specific incident because it would either directly or indirectly violate the accuser's privacy.

This makes absolute privacy conflict with a full fair trial (because even neutral participants risk not having all the facts) - the former isn't more important than the latter, so how is the system going to handle this?

The WMF can amend their ToS (within the constraints of US law) - an absolute provision of anonymity, under all circumstances, is not required.

Presumably it will need to notify users of it that some information may have to be provided to the accused in order to enable full consideration, though this would be minimised as far as was practical? Nosebagbear (talk) 17:06, 15 June 2019 (UTC)

Simple, people don't get a fair trial. Or at least that is what I am gathering from the WMF's proposed policy. The WMF doesn't care about the community members who built the project. Afootpluto (talk) 18:07, 15 June 2019 (UTC)
  • @CLo (WMF): - I was hoping you might give some more constructive response, much as you did for Barkeep. I don't share the "the WMF is evil" comment given above, but I would like to know how you plan on balancing the scales from the current one-way set-up. This has some effect on how a reporting system would need to be arranged, so it isn't a "system first, rules later" discussion point. Nosebagbear (talk) 12:56, 29 June 2019 (UTC)
Hi Nosebagbear, I saw your question but as I read it, the question is more about internal policies of Trust and Safety than it is about the user reporting system project. I'm really only comfortable talking about the user reporting system project here, so I'm only replying in depth to questions specifically about it. I would also like to try and keep this page focused on the project itself, if possible, bearing in mind that our timeline has been delayed as detailed below by SPoore (WMF). I hope both my reply here, and my clarification below, can clear up some questions around scope and scale. Thank you! —User:CLo (WMF) (talk) 16:47, 1 July 2019 (UTC)
Well, the reporting system is a way for user A to tell C that B is misbehaving. So there is definitively the chance that C will only see A's side of the story if A's identity has to be kept private. On enWikipedia we've often had problems with invalid or even bad-faith reports. Jo-Jo Eumerus (talk, contributions) 07:37, 2 July 2019 (UTC)
@Jo-Jo Eumerus: - I agree with you, but I think you mean "C will only see A's side of the story"; seeing B's as well is the goal. Nosebagbear (talk)
Yeah, that was a mix-up. Changed. Jo-Jo Eumerus (talk, contributions) 16:16, 2 July 2019 (UTC)
  • @CLo (WMF): - I can understand the desire to do that, but can you agree that the nature/format/usage of a reporting system has to have some link to the rules under which reports will be made, accepted/rejected and interpreted? Different rules would make me want to give a different answer to the consultation - how should I know what to say without all the information on which to base it? Nosebagbear (talk) 15:47, 2 July 2019 (UTC)
Ah, thank you for rephrasing your question Nosebagbear! I'm reading your question more as a suggestion right now. Part of what we want to know with this consultation is what you believe a reporting system should include. Please correct me if I am wrong, or mischaracterizing, but I would summarize your points so far as:
  • A private reporting system must allow the individuals handling the reports (likely volunteer administrators or another trusted group) access to enough information to make a sound judgement, which may include potentially identifying information with regards to the identity of the user making the report.
  • A private reporting system should clearly explain to would-be reporters any existing rules or guidelines regarding what information is needed for a report to be useful, what kinds of actions/behaviours are valid targets of a report, and who will be handling the report.
I would greatly appreciate examples of either, if you're happy to share them with us (privately or otherwise). Potential suggestions based off those points could be something like: the proposed reporting system could allow users to pick the type of report they want to make (we know that it's likely people will wind up using it to report e.g. vandalism even if we intend for it to address harassment), and based on the selection, it should help users structure their reports and provide proper diffs and other clarifying context. With regards to public knowledge of reporter identities, that is a conversation we want to leave up to individual projects to decide. An important point to keep in mind is that we already know that private reports are being made through email etc. to administrators. We strongly believe that access to private reports, where administrators know the identity of the reporter but this information is not immediately public, should be more easily accessible. I can definitely see how you might want to introduce compromises; for example, a completely anonymous report might simply be considered unactionable. But again, that's not a decision from the Anti-Harassment Tools team, just me repeating ideas I've heard from admins.
As an aside that you might find interesting, we've found that people don't always make good-faith reports in order to get some kind of administrative response. Sometimes, users want to report just to create a proper trail of documentation without actually wanting action, for many (valid!) reasons. And, I'll just tag in @Jo-Jo Eumerus: here, since I find their point quite interesting in this context! We are aware that some reports are simply badly submitted or formatted, especially since current reporting systems are functionally unstructured and require a fair bit of knowledge to write. On top of that, misleading reports get filed, to try and smear an editor's reputation, or false reports are made as a way of harassing the admins who have to deal with report notification spam. One potential answer is that taking these reports to a private venue helps mitigate the public impact and reputation hit that could result from misleading reports, but of course this unacceptably increases administrator workloads. What suggestions do you have for ways to handle these misuses of a reporting system? Some of the suggestions we've heard include:
  • Batched report "tickets", to stop notification spam
  • A way for administrators to flag reports as dubious, or not worth investigating
  • A way to limit access to the reporting system for users who are abusing the system, or to flag their reports.
Of course, don't take these suggestions as endorsements or statements of intent, since they definitely have their pros and cons. The project is still in very early stages, and we have not settled on any technical implementations, which means I can't say for certain how or if these will be implemented. Some of these insights come from earlier work I did, comparing reporting systems on Reddit and Facebook Groups, which you can find on the main page of this consultation. I would greatly appreciate your opinions on potential solutions or ways to mitigate these problems, as well as your own suggestions, or other issues you foresee. —User:CLo (WMF) (talk) 18:04, 2 July 2019 (UTC)
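The suggestions listed above (batched tickets, a "dubious" flag for administrators, and limiting access for reporters who abuse the system) can be sketched roughly as follows. This is an illustrative sketch only, not a design from the Anti-Harassment Tools team: the names `Report`, `Ticket`, `batch_reports`, and `restricted` are hypothetical, and batching by reporter is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    reporter: str
    reported_user: str
    summary: str


@dataclass
class Ticket:
    reporter: str
    reports: list = field(default_factory=list)
    flagged_dubious: bool = False  # set by an administrator after review


def batch_reports(reports, restricted=frozenset()):
    """Group reports into one ticket per reporter.

    Administrators would then be notified once per ticket rather than once
    per report. Reports from reporters in `restricted` (users whose access
    to the system has been limited) are skipped.
    """
    tickets = {}
    for report in reports:
        if report.reporter in restricted:
            continue
        ticket = tickets.setdefault(report.reporter, Ticket(report.reporter))
        ticket.reports.append(report)
    return list(tickets.values())
```

Whether restricted reporters should be skipped outright or merely have their tickets flagged is exactly the kind of trade-off the discussion leaves open.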
One additional problem I see is that investigating reports has never been solely the job of administrators. I'll need to think about a solution to the bad-faith report problem more - or someone else who reads this can answer. Jo-Jo Eumerus (talk, contributions) 19:15, 2 July 2019 (UTC)

WMF and Local Interplay

A few thoughts/questions I have:

  • Projects have widely varying levels of capacity to handle issues of harassment. Is there one answer for a tool the project builds or do there need to be a few, depending on the level of local capacity on this issue? How can this system be used to build/increase local capacity in this area over time?
  • On English Wiki the barriers to reporting harassment are high - the easiest places to find are public, while the private places are hard to find and still might direct back to a public forum. The disadvantages of this seem apparent and create troubling situations. Are there any advantages to this higher barrier?
  • In lowering the barrier to user reporting, is there capacity from the combination of volunteer and foundation time in this area to handle an increase in reports? Does this vary based on the project in question? If not, how can that capacity be generated?
  • How can this initiative make clearer patterns of harassment where nothing on its own would be troubling but reoccurring low to medium harassment by one editor of multiple other editors is happening?
  • This document makes clear that the foundation will make final decisions in this initiative. What processes/procedures will it implement to get local buy-in when an attempt for consensus seems to not be on the table?

I know I'm supposed to be answering questions but I am hoping that the formulation of these questions in and of itself provides some helpful feedback. Best, Barkeep49 (talk) 17:45, 15 June 2019 (UTC)

Hi Barkeep49, thank you for your questions! I hope some of my answers to your more research-oriented questions will provide a little clarity – they're certainly questions that are on the team's mind as we think about potential prototypes.
  • With regards to different projects' capacities for handling issues of harassment, a related design issue we are aware of is the fact that differently-sized projects have different pathways and workflows to handle harassment. For example, a very small project might actually be better off, currently, handling harassment via informal channels (by which I mean, channels that were not purpose-built for handling harassment cases), such as email, or the use of general non-specialized noticeboards. On very small projects, the additional administrative overhead of building a formal (that is, custom built and designed to handle harassment) system might not be worth the benefits it brings. However a larger project might benefit from a more formal system, with the advantages of better documentation, and a structured pathway, since they may have the administrators necessary to handle the increased bureaucracy that such a system brings. It may well be that we must acknowledge that any formal, purpose-built pathway for harassment cannot handle every possible case.
  • Barriers to reporting are definitely something I've been looking at! One of the motivations behind our push for this tool is the knowledge that private reporting, which is sometimes necessary for safety reasons, is extremely difficult to access for newcomers. As it stands, this means that only users who are already familiar with the community are in a position to make private reports, via the medium of emails, IRC PMs, or other pathways. However, through interviews with current and former admins, we are also aware that this high barrier can be beneficial in that it is a barrier against abuse of the system. That is to say, if a bad-faith actor wanted to use private reports as a vector for harassing administrators, the difficulty of finding the current private reporting channels means that it deters those who aren't as motivated for whatever reason. One of our challenges is to come up with a system that makes it easier for good-faith actors to make reports, while providing solutions for the volunteers receiving these reports to deal with malicious uses of the reporting system. We've heard some suggestions, such as batched tickets, or some way to filter or tag users, but settling on a specific technical solution feels premature at this stage.
  • Questions of administrator training, support, and retention are definitely on our minds, though I believe they're a little outside the scope of this project :) It is a topic of great interest to me, and is something that we do aim to cover with regards to research projects in the future.
  • Based on interviews with administrators, my review of old AN/I cases, as well as some work I've been doing looking into past SPI cases, it seems that the final judgement call on what is considered harassing behaviour is the result of noticing those patterns. Oftentimes the clearest indicator that harassment is occurring is the very fact that there is a recurrent pattern of uncivil, threatening, or otherwise aggressive behaviour that constantly just falls short of immediately-sanctionable actions. One of the potential advantages of creating a purpose-built harassment reporting system could be a better way for administrators to track such occurrences over time. Even if that single diff or incident is not actionable on its own, being tied to a history of similar boundary-pushing behaviour might be. My research indicates that this kind of communication and record-keeping already happens, just that right now it tends to exist purely in the memories of long-time administrators and is not always written down, which means it's knowledge susceptible to being lost to admin burnout or attrition.
As for your final question on buy-in, I am probably not the right person to answer this. But I would like to reiterate that we are aware of the importance of gaining the community's trust for whatever structure we build. Speaking for myself, my hope is that the reporting system ends up being used to supplement existing community processes.
P.S. please forgive me for the indenting not working on bullet points. Still trying to figure that part out. —User:CLo (WMF) (talk) 22:10, 24 June 2019 (UTC)
Thanks for your thoughtful and detailed response. To indent bullets you can do ":*Comment" or as many : as you need to properly indent. Best, Barkeep49 (talk) 01:10, 25 June 2019 (UTC)
Ah, thank you for the tip! —User:CLo (WMF) (talk) 17:03, 25 June 2019 (UTC)
To expand the previous tip, indent styles "stack up" from left to right. If you want to add some sort of indentation, copy whatever was used for the previous line then add a symbol (: * or #) to the right end of the group. If you want to back out to less indentation, copy what was used on the previous line and drop one or more symbols from the right end of the group. Most often it's just colons and you simply add one, but occasionally big messy community discussions can generate an ugly random string of : * and # symbols. However, it works out surprisingly easily in practice. Just copy the list and add/delete on the right. The messy part takes care of itself. Alsee (talk) 21:28, 9 August 2019 (UTC)
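The "copy the previous markers, then add or drop on the right" rule described above can be written down as a tiny sketch. The function names here are made up for illustration; only the marker characters (: * and #) come from the discussion.

```python
INDENT_MARKERS = ":*#"


def leading_markers(line):
    """Return the run of indent markers at the start of a wikitext line."""
    count = 0
    while count < len(line) and line[count] in INDENT_MARKERS:
        count += 1
    return line[:count]


def reply_prefix(previous_prefix, symbol=":"):
    """Indent one level deeper: keep the previous markers, add one on the right."""
    if len(symbol) != 1 or symbol not in INDENT_MARKERS:
        raise ValueError("indent symbol must be one of ':', '*' or '#'")
    return previous_prefix + symbol


def outdent_prefix(previous_prefix, levels=1):
    """Back out: drop `levels` markers from the right end of the group."""
    if levels < 0:
        raise ValueError("levels must not be negative")
    return previous_prefix[:max(len(previous_prefix) - levels, 0)]


# Replying to a line that starts with ":*" means starting the reply with ":*:".
previous = leading_markers(":*Comment")
print(reply_prefix(previous) + "Reply")
```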

Minimum design requirements

  • The system must respect the privacy of the harassed.
  • The complainant must substantiate their complaint with evidence.
  • The accused must be able to reply to the complaint (suitably anonymized).
  • Investigators must be able to ask the reporter and respondent questions.
  • The process must be accountable to the community.
  • The decision must be appealable to the community (e.g. Arbcom). All materials provided must be available for inspection by the body hearing the appeal.
  • Reporters must be sanctionable for filing harassing or vexatious complaints, forum shopping or other conduct. Such sanctions must include the withdrawal of access to the reporting system.

Anything that does not satisfy all of these points is not fit for purpose. It should be possible to post a suitably anonymized summary of the case on a public board (and yes, that involves the ability to name the complainant publicly if they are sanctioned).

Other questions:

  • What about email harassment? Per point #2, evidence is required. We cannot verify a case of email harassment without copies of the emails concerned.
  • Likewise, what about mute lists? We want to see evidence that the accused has gone out of their way to harass the complainant. Part of that is circumventing mute lists.
  • Low volume but quality reports are necessary to make a difference. Volunteers manning the reporting system don't want whinges that can be taken care of at ANI, whinges by editors who lost a content dispute, complaints from those who cannot deal with legitimate criticism of their edits, unblock requests, etc.
  • What about IPs? Secondly, there should be a barrier to entry to deter abuse or denial of service. MER-C (talk) 17:56, 15 June 2019 (UTC)

Endorse the standpoint given by MER-C. Starship.paint (talk) 09:09, 17 June 2019 (UTC)

I think that's not workable -- how do you "anonymize" a complaint of harassment when the moment the underlying issue is hinted at, the identity is obvious? Also, "mute lists" are bullshit. Actions against "vexatious complaints" are a paved road to Hell. And your proposals focus a bit too much on the rights of the person accused rather than the rights of the community. Bear in mind that, as in ordinary law, the principal right we are trying to uphold is our right as citizens to know we are not being pulled out and prosecuted unreasonably one by one.
I would say that better design considerations are that--
  • Harassment must be very explicitly defined. Issue by issue, in detail. A lot of this is common sense -- we know that pinging an adversary in an edit war three times can happen, but pinging him 300 times is something else, and if he says "stop pinging me" you shouldn't do it without some good reason. So if it's common sense -- write it down. Assure people the same rules and standards are being used on everyone, and at least then we'll be less convinced it's because of personal relationships and dislikes.
  • Jurors should be unconnected to the persons and issues at hand. This is to prevent the sort of COI being alleged by some community members responding to the Fram case.
  • Sanctions must be justified in regard to particular actions. Saying "we've looked at a long term pattern of editing and the totality is..." isn't going to cut it. No one can look at the totality for any prolific editor - it's too much! Investigators look and they find stuff and they miss stuff. If you don't sit between the seraphim, the only thing you can do is judge specific edits/messages or groups of them.
  • Public edits that bring sanctions, whether individually or as a pattern, must be listed and published to the community. There should be a database for us to look at and say "don't be like that" if we are not to be like that in the future.
  • Most importantly, this stuff should be left to local administrators and general community members. If you actually need to override their process -- a dubious proposition -- then you need to tell them what they are doing that is so desperately deficient so they can get some reform proposals underway immediately. I mean, I know there are deficiencies - my top two points there. But it is up to helpful editors to decide when editing becomes unhelpful. Wnt (talk) 11:17, 17 June 2019 (UTC)
  • Harassment can't be defined precisely. What you can do once and say "sorry, I made a mistake" you may not be able to do hundreds of times. And while pinging a single user 10 times when it's relevant is clearly appropriate, doing it 5 inappropriate times could be harassment.
  • Another critical issue is to allow reporting without risk of it triggering further harassment. If you report certain users publicly, or even if you report them privately and the handling committee publicizes it, you run the risk of further harassment. 37.26.149.129 15:48, 24 June 2019 (UTC)

Accountability

I am strongly opposed to the proposed reporting system to the extent it imposes civility standards without local community involvement, review, or approval, or results in secret trials by secret and unaccountable judges without the right of representation, defense or appeal, on secret evidence submitted by secret accusers.

I object to the Foundation imposing any non-legally necessary sanctions within the purview of established local conduct policy and community processes.

T&S shouldn't be imposing temporary or local sanctions or modifying advanced permissions. They should have only the one tool that they have ever needed for their legitimate purpose of ToU enforcement of serious, legally necessary sanctions: the global permanent ban.

Finally, I agree with Jimbo that all bans are appealable to him in his capacity as an individual founder,[1] as long as he maintains engagement with the movement as a whole and outsiders. (His level of engagement, by the way, sets a bar to which the rest of the Foundation should aspire, but which it has not yet come close to meeting.) I ask the Community Health team to explicitly acknowledge this avenue of appeals. EllenCT (talk) 18:23, 15 June 2019 (UTC)

I concur with all EllenCT has written above about the accountability of the report system, and strongly oppose this proposal, in particular while there is the possibility of it resulting in secret trials by secret and unaccountable judges without the right of representation, defense or appeal, on secret evidence submitted by secret accusers. A system like this, especially if it gets to be entirely managed by WMF, without proper accountability, seems very prone to abuse, e.g., for vendettas, pet revenge, and imposing an editorial line/POV approved by the ones deciding over the complaints. It looks like a free pass for corruption and abuse, with a great potential of increasing harassment and bringing it to a whole new level, not reducing it.--- Darwin Ahoy! 18:35, 15 June 2019 (UTC)

I'd like to quote from Anti-Harassment Tools Team Design Researcher Claudia Lo's November 2018 "Reporting systems on English Wikipedia" written for the Community Health Initiative:

the Wikimedia community highly prizes transparency. For reporting systems, this is interpreted as publicly-viewable processes, outcomes, and the identities of the involved users. Transparency in this case is not just a design consideration put into place to achieve a certain kind of efficiency or mode of operation, but a value to be strived for in the way the entire system operates.... whatever changes we recommend, it must adhere to these values even as we change key features, otherwise it will not be trustworthy.

EllenCT (talk) 19:40, 15 June 2019 (UTC)

  • Excellent quote - properly notes the issue and really the underlying truth that's caused the problems. Until very recently for en-wiki, and slightly longer for de-wiki, T&S actions have always been taken to be correct. As such, they were granted more leeway in their privacy and decision-making so long as we believed it was reasonable. The loss of that (heavily due to the lack of transparency) makes every single step to change 10x harder. Without transparency, no trust. Without trust, no change. Nosebagbear (talk) 19:47, 15 June 2019 (UTC)
Ellen is making a helluva lot of sense to me as well. Shearonink (talk) 01:04, 17 June 2019 (UTC)
Endorsing all prior comments in this section. Starship.paint (talk) 09:10, 17 June 2019 (UTC)
I, too, endorse all prior comments in this section. T&S needs to recognize that without trust of them by the community, trust that office actions will be fair and will not be a takeover of community dispute resolution procedures, these kinds of plans will fail. --Tryptofish (talk) 21:06, 17 June 2019 (UTC)
I agree with several sentiments above. But more specifically, I get the impression that there has been a presumption that private reporting of harassment where sexuality is an issue will get a better outcome. After all, many of us remember the utter debacle of the ArbCom case against Fae, where an organized off-site campaign with significant, obvious anti-gay sentiments was able to organize a bullying campaign that led to Fae being sanctioned for a tiny technical offense of trying to preserve privacy and, worse, for a few brief complaints about anti-gay harassment. However -- how can anyone know the dictator of T&S won't make the same kind of bogus decision in the future? Can WMF promise us that no managers will ever be hired for the position who are of Pakistani or Kenyan nationality or any other country where homosexuality remains illegal? That no one will be hired who subscribes to fundamentalistic belief systems? I understand, of course, that such workers would likely be required to uphold anti-discriminatory policy whether they believe in it or not ... but how can we, or anyone, check that will actually happen? Community discussion is a place where a lot of things never get resolved, but there are merits to that. Wnt (talk) 10:41, 17 June 2019 (UTC)
  • Accountability, or the lack thereof, is the primary reason I oppose WMF getting more involved. Dennis Brown (talk) 14:16, 25 June 2019 (UTC)
  • Accountability is also lacking in individual projects. The policy en:wp:Adminacct states "Administrators are expected to respond promptly and civilly to queries about their Wikipedia-related conduct and administrative actions and to justify them when needed." In practice, if an admin is asked to justify an action, and chooses to not answer, there is no action taken, although this is a violation of the administrator conduct policy. In such cases the reason to not answer is usually that there is no justification for the admin action. This lack of accountability leads to the acceptance of wrong actions.
The lack of accountability in sanctioning is a reality in current practices. The WMF can't do worse than that, but it can do much better, by developing an easy-to-use reporting system, that presents the evidence, the violated rules, the reasoning, and the decisions in a structured form (similar, but more compact than an arbitration case), that's easy to evaluate for the public. In cases involving private evidence, such evidence would be redacted.
This reporting system - if well designed -, is the way to improve accountability and transparency throughout the projects. — Aron M (talk) 23:26, 16 August 2019 (UTC)

Project update and timeline

Hello all,

Thank you for your thoughts and ideas so far. Other members of the Anti-Harassment Tools team and I will answer your specific questions and comments.

A few general comments:

  • This project was put on hold about a month ago because a) the Anti-Harassment Tools team's Project Manager left the Foundation b) the Foundation is undergoing internal reorganization in ways that might affect the timeline and staff working on this project c) the Foundation's new Medium Term Plan had to be completed in order to know if URS fit into the scope of work in the next annual plan. These are related to the internal workings of the Foundation and I would not normally mention them to the community because it usually isn't relevant or of any interest. But in this instance I want to let you know the reason that the timeline for this consultation changed and that you will have plenty of time to comment about the User reporting system.
  • Normally, the Anti-Harassment Tools team starts a project page on English and German Wikipedia and any other local wikis where we are doing a pilot project or direct communication with the local wiki seems important. I haven't begun the local project pages yet because we are doing a phased consultation and that part comes in the next phase. Thank you to a few communities that already made a local page. :-) People are welcome to comment on Meta or their local page.
I want to assure you that the Foundation understands that the User reporting system can't achieve its goals without in-depth consultation with local communities. We know that this cannot be a one-size-fits-all project. That is the reason we have been doing research for over a year about what is working and what is not working in our current system. I encourage you to look at the research and stay engaged with us.

Again thank you for participating in this discussion. I'm sorry if the change to the timeline caused any worry or confusion. SPoore (WMF) Strategist, Community health initiative (talk) 22:43, 17 June 2019 (UTC)

Thanks for the update, and for your work on this. – Ajraddatz (talk) 20:29, 21 June 2019 (UTC)

@SPoore (WMF) Since you mentioned the German Wikipedia: The Anti-Harassment Tools team should consider presenting their project and asking for local feedback in the Kurier. Kurier and its discussion page are the German equivalent of both Signpost and Village Pump. If you want to address the community, it is the way to go. Other pages which may sound appropriate, like WP:Projektdiskussion, are backwaters with very few contributors. ---<(kmk)>- (talk) 00:09, 25 June 2019 (UTC)

Encouragement

As I tried to explain during the initial drafting of the 2030 Strategy session back in 2017, I believe that both harassment and false claims of harassment are major problems for en.wp. (My own experience of being prosecuted repeatedly by a banned user and their supporters is worth study. I would be happy to discuss this further -- privately -- with T&S researchers. I have looked at extensive evidence from ArbCom cases, AE findings & RfC/U in order to try to understand how I was wrongly prosecuted in 2016-2017 and wrongly labeled as a harasser/hound). I would appreciate that you add my name to the list of those being periodically informed about requests for participation in these consultations.

Looking around today, I remembered that you had surveyed a small sample of users (admins only?) concerning AN/I, but have not yet conducted inquiry into WP:AE (arbitration enforcement). I would be happy to help provide data concerning how this latter noticeboard has, in some cases, been used as a streamlined procedure for suppressing legitimate inquiry.

Initial inquiry into harassment needs to be conducted with protection for the reporter, as otherwise it is too easy for any worthwhile inquiry to be derailed by established editors simply smearing the reporter. I have recent examples of this as well as examples dating back to 2013. The recent vociferous resistance to change on en.wp should probably be understood as being potentially motivated both by:

  • "good faith concerns" about potential abuse of this protection of the victim (concerns with opacity)
  • "less savory concerns" about the challenge posed to a system allowing deeply embedded actors to nip healthy inquiry into systemic problems in the bud

Thank you for following up on the widely-recognized community health concerns and attempting to address them. SashiRolls (talk) 00:39, 18 June 2019 (UTC)

Agree. I've reflected on the issues of current dispute resolution practices: above in #Accountability,
and in more detail in Talk: Community health Recommendation - Rules and regulations, decision making processes and leadership.
There's a proposed workflow for a reporting tool, that aims to be easy-to-use, and transparent in #Factual, evidence based reporting tool - draft, proposal. — Aron M (talk) 23:43, 16 August 2019 (UTC)

I recommend delay

I recommend that the Foundation delay implementing this user reporting system until there is a clearer understanding of what matters will prompt Foundation intervention as compared to being referred to a project's own disciplinary procedures, such as Requests for arbitration on the English Wikipedia. There is a good deal of confusion as to this issue in regard to a recent ban implemented by the Foundation of a then-administrator on the English Wikipedia. --Metropolitan90 (talk) 05:58, 19 June 2019 (UTC)

Ya think? A wee bit o' drama there? I'd also like to see "contact local arbcom first and only escalate to T&S if they aren't willing/able to handle it, except in extreme cases (pedophiles, real world stalkers, that sort of thing)". Beeblebrox (talk) 23:37, 23 June 2019 (UTC)

Don't put this on large wikis with community processes like ArbCom that can handle this themselves (azwiki yes, enwiki no)

Fram. Enough said. — pythoncoder  (talk | contribs) 01:34, 25 June 2019 (UTC)

I don't think that's all that needs to be said. There's probably room (and intention) to integrate this with existing local wiki reporting systems, completely separate from the T&S ban system. – Ajraddatz (talk) 02:06, 25 June 2019 (UTC)
@Pythoncoder: az.wiki would argue that their processes are just as valid as en.wiki. To a certain extent, that's true. –MJLTalk 03:29, 25 June 2019 (UTC)
Yeah, even as an enwiki member I am not sure about an exception (to say nothing that we'd need a consensus anyhow). For example, who gets to decide who is exempted and who is not? If a project that has gotten an exemption develops a governance issue to the point that local processes can no longer handle problems, what then? Jo-Jo Eumerus (talk, contributions) 07:21, 25 June 2019 (UTC)
  • The Fram situation demonstrates exactly why the local Arbcom should be ones to handle local issues that aren't legal in nature, don't involve minors, or do not require emergency action. The lack of accountability to the greater community is always going to cause problems and drama, just as this has. Dennis Brown (talk) 14:14, 25 June 2019 (UTC)
    • I agree. Where WMF can help is where local communities need the help. If a community has the resources to deal with problems internally, and if WMF is concerned that the community is doing something wrong, the proper course of action is to discuss that problem openly (not the specific confidential information about individual users, but rather the policies and procedures). What happened in the Fram controversy is a case study in how not to deal with the editing community, and the amount of disruption that grew out of WMF's actions is staggering. --Tryptofish (talk) 20:58, 26 June 2019 (UTC)
      • I am not so sure about this. IMO the Fram drama stems less from the WMF interfering in local projects and more from the fact that they applied the same procedures to Fram that one would apply to paedophiles and criminals even though the situations do not look the same. We've seen the same established-editors-treated-like-vandals issues on the enwiki noticeboards before. Jo-Jo Eumerus (talk, contributions) 07:29, 27 June 2019 (UTC)
        • I don't disagree with any of that. It's very important that WMF communicate with the editing communities, so that we are all on the same page about what is or is not harassment. A great many editors at en-wiki (including me) are convinced that what Fram appears to have done should not have been treated as something that rises to the level of an Office Action, based on what we have historically understood Office Actions to be for. If WMF is convinced that the standards for civility need to be raised, that's fine, but there needs to be clear communication with the editing community about what the new standards are. No one should be left guessing until they find themselves at the wrong end of an Office Action. --Tryptofish (talk) 23:47, 27 June 2019 (UTC)
The fram situation is exactly why this needs to be forced onto en:wiki. They've shown clearly that they're not the wikipedia anyone can edit, they're the wikipedia that abusive people can keep editing while they push newbies off the project. The community processes on en:wiki routinely protect abusers and bite newbies. DanBCDanBC (talk) 10:15, 28 June 2019 (UTC)
Except they haven't shown it, even if you put aside we don't know the Fram evidence (the evidence we've been able to self-determine doesn't warrant the block), then you'd have to demonstrate why dialogue about en-wiki's issues wasn't the way (and isn't) to progress before implementing this could be justified. Nosebagbear (talk) 11:47, 28 June 2019 (UTC)
The one thing the FRAM situation has shown is that any attempt to unilaterally impose a system on enwiki without the support of its community is going to go extremely poorly. – Teratix 13:34, 3 July 2019 (UTC)

Clarification on reporting system

Hello all,

Thank you for your responses so far – there are many valid concerns and important points raised, and the Anti-Harassment Tools team is keeping an eye on this page as it develops. However, I'd like to clarify one point that's been raised in several responses: our intent with the development of a new reporting system is not to supplant existing methods of reporting, but to complement them by making it easier for users to reach existing reporting spaces, among the other goals described in the main article. I'd like to apologize if our initial consultation page did not adequately describe this aspect of the reporting system! To reiterate, we hope that this reporting system can provide accurate information to the appropriate channel for action to be taken. That is to say, we want to make it easier to know where the appropriate reporting spaces are, whether that space is a public noticeboard, an on-wiki private channel (such as ArbCom, if it exists on that project), or escalating extreme cases to our Trust and Safety team. Once there, we aim to make it easier for reporters to understand what information they need to provide, in order to allow administrators to have the full context and knowledge of the incident (or incidents) to ensure a fair assessment of the situation.

Making it easier to create a high-quality report, and making current reporting spaces easier to find, are not the sole functions of the user reporting system but they will hopefully play an important part. Thank you for your ongoing feedback, and I hope that this message clears up some of the goals and scope of the user reporting system project. —User:CLo (WMF) (talk) 21:36, 28 June 2019 (UTC)

Community health initiative based on a homophobic and racist tool

Community health initiative seems to base some of its positions on the use of the Detox tool. This tool has been shewn to produce homophobic and racist results, as well as failing to cope with the "Scunthorpe Effect". The tool has now been deleted because of its terrible failings. I did raise this on the talk page of the Initiative, but that has had no response. DuncanHill (talk) 11:28, 29 June 2019 (UTC)

@DuncanHill: I can clarify -- nobody at the WMF is using the Detox tool. Our Research team collaborated with Jigsaw on training Detox in 2016-2017, and found some promising-looking initial results. In 2017, the Anti-Harassment Tools team tried out using Detox to detect harassment on Wikipedia, and we found the same kinds of flaws that you have. The tool is inaccurate and doesn't take context into account, leading to false positives (flagging the word "gay" as aggressive even in a neutral or positive context) and false negatives (missing more nuanced uses of language, like sarcasm). As far as we know, the model hasn't really improved. I believe there's a team at Jigsaw who are still investigating how to use Detox to study conversations at Wikipedia, but nobody at the WMF is using Detox to identify harassers on Wikimedia projects. I'm glad that you brought it up; I just edited that page to remove the outdated passages that mention Detox. Other surveys, studies and reports are sufficient to establish that harassment is a serious problem on our platform. -- DannyH (WMF) (talk) 19:07, 29 June 2019 (UTC)
@DannyH (WMF): Many thanks for getting back to me on that. Could you update the Detox page accordingly, and note there that these flaws were known about two years ago? DuncanHill (talk) 19:12, 29 June 2019 (UTC)
@DuncanHill: Yeah, that page describes the 2016-2017 research I was talking about. I'll ask the Research team if they want to update the page. DannyH (WMF) (talk) 19:22, 29 June 2019 (UTC)
@DannyH (WMF): Thanks again. DuncanHill (talk) 19:24, 29 June 2019 (UTC)

Apology

I apologize to User:JEissfeldt (WMF) and the Trust and Safety team for leaping to the wrong conclusion about their entry into Wikipedia administration. I wanted to defend a Wikipedia of my dreams, a Wikipedia from the 2000s that no longer exists, and imagined that "the community" was agitating for principles that I believe in but they do not. I became aware of this when, joining foolishly in a brainstorming session of ways to disrupt Wikipedia in protest, I found out that at some point they decided that it is no longer acceptable to link to images of violence, even for purposes of talk page discussion. I found this out by getting an indefinite block on Wikipedia. I don't know how general this prohibition is now (I don't see where it is written out), but for example on w:Muath Al-Kasasbeh the admin who blocked me removed and revdeleted a Fox News reference, saying "No more ISIS snuff!" This brings me back to reality about the system I had been trying to defend. In all honesty, I don't think I have ever changed a policy in any way despite megabytes of whining. I don't know of a policy to change that would affect my present situation, even if I still had the motivation to do so. "Wikipedia is not an experiment in democracy" - that's not just truth, but policy. The question is -- who is the rightful sovereign?

Wikipedians have built articles that we see mirrored over and over again in web searches, that shape opinions of even the independent authors. If editors can be blocked indefinitely for citing a single offensive source, articles will not be able to convey the entire range of viewpoints on a subject. Wikipedia will be a key checkpoint for deciding the world's permissible range of research. So we can have one of two alternatives for how to administer the site: leave it in the hands of volunteer administrators to choose what viewpoints are permissible, or put it into the hands of professional employees of the organization that makes Wikipedia possible. Neither one is safe now. But I never really had any reason to believe any allegations against your team, except that you were showing restraint in keeping confidential what you said you would. There was never really any basis in policy for trying to fight your power, nor anything to be gained by it. I can only hope that you will try to preserve what you can of the classic principles of the Wikimedia brand.

I don't know if I will ever be able to contribute again to Wikipedia, but even if I am, my enthusiasm and my involvement in its politics is basically over. It will be up to you to see what you can make of the site. Again, my apologies for giving you needless grief. Wnt (talk) 16:18, 1 July 2019 (UTC)

User:Wnt, I deeply respect your work and what you have said above. You are assuming that a system without transparency will work because you assume good faith of the employees and contractors implementing it. That assumption of good faith must be earned by a rigorous conflict of interest standards and full disclosure of any interests. For example, suppose the person accused of being a harasser is the mother-in-law of a T&S staff member or the person filing the harassment complaint is the spouse of a WMF Board member, what should be disclosed and how should recusals be handled in an auditable way? Hlevy2 (talk) 20:08, 19 August 2019 (UTC)

Harassment translation page in Italian Wikipedia

WikiDonne was one of the user groups questioned for the new user reporting system. We are now discussing the draft of the translated page about harassment / molestie on Italian Wikipedia. This is something new, but very important in order to have a healthy community. Communities are different: certain things recur and certain behaviors differ culturally, but I think it is important that many things (threats, for example) remain as they are. --Camelia (talk) 09:47, 15 July 2019 (UTC)

Factual, evidence based reporting tool - draft, proposal

@SPoore (WMF): Below is a draft workflow for making a high quality report. Is this tool already being designed, or maybe even in the development phase? Please forward the following draft to the developers, thank you! — Aron M (talk) 23:41, 29 September 2019 (UTC)

Although this consultation is over, this topic came up in the current Community Health discussions, and I found this is the best place to post it.

I think the community acceptance of a Code of Conduct, and of user reporting, strongly depends on the transparency and accountability of that system. Well structured reports that focus on the evidence will provide the transparency and make evaluation easier. This is the draft of a tool that helps users create consistent, well structured reports more easily than is currently possible.

Collecting edits

  1. The tool helps users collect the edits for evidence by selecting the relevant text on pages (queries the server for the edit(s) that added that text), or selecting the edits in the history as explained in Thank-like_flagging.
  2. The user has an option to select the relevant (violated) rule from (in order of importance): the CoC/ToU, or a policy/guideline page, or other pages. The CoC, the ToU and a few recently used policies should be offered directly, others searched by title. The user selects the text of the rule, similarly to the previous step, only this time there's no need to look up the edit; the report can refer to the permalink of the page.
  3. The visibility of the evidence is selected: public / non-public (visible to evaluators and accused) / private (not visible to accused). Implementation detail: such evidence is presented with a distinctive background color.
  4. Additional evidence can be added in the form of a URL and a citation copied by hand. The tool makes an archive of the linked webpage.
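
To illustrate step 3, here is a minimal sketch of the three visibility levels. All names (Visibility, Role, Evidence, can_view) are hypothetical and not part of any existing tool. It also assumes that the reporter and the evaluators can always see the evidence, which the draft does not state explicitly.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    PUBLIC = "public"          # visible to everyone
    NON_PUBLIC = "non-public"  # visible to evaluators and accused
    PRIVATE = "private"        # not visible to accused


class Role(Enum):
    REPORTER = "reporter"
    EVALUATOR = "evaluator"
    ACCUSED = "accused"
    ANYONE = "anyone"


# Assumption: reporter and evaluators always have access to the evidence.
_ALLOWED = {
    Visibility.PUBLIC: {Role.REPORTER, Role.EVALUATOR, Role.ACCUSED, Role.ANYONE},
    Visibility.NON_PUBLIC: {Role.REPORTER, Role.EVALUATOR, Role.ACCUSED},
    Visibility.PRIVATE: {Role.REPORTER, Role.EVALUATOR},
}


@dataclass(frozen=True)
class Evidence:
    quoted_text: str   # the selected text
    source: str        # hypothetical reference to an edit or an archived URL
    rule: str          # the rule the reporter considers violated
    visibility: Visibility


def can_view(evidence: Evidence, role: Role) -> bool:
    """Return True if a viewer with this role may see the evidence."""
    return role in _ALLOWED[evidence.visibility]


def visible_evidence(items: list[Evidence], role: Role) -> list[Evidence]:
    """Filter a report's evidence down to what this role may see."""
    return [item for item in items if can_view(item, role)]
```

Keeping the rules in one table makes it easy to audit who can see what, which seems to matter for the accountability concerns raised on this page.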

Finalizing the report

  1. The collected edits are listed on a special page. The cited (selected) text and the relevant rule are visible in the list. Implementation detail: the citations should be truncated at ca. 100 characters, and expanded/collapsed when clicked.
  2. The citations can be modified, which returns to step 1, and the relevant rule (step 2).
  3. The owners of the collected edits are listed. The order of the list can be edited, and each user's role can be selected as accused / involved / etc. (custom). Additional users can be added and removed.
  4. There is a textbox for a short summary on top and for each edit for optional details. There's a bigger wiki textbox (Visual Editor) for a longer explanation at the bottom.
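
The truncation mentioned in step 1 could look roughly like the sketch below. The function name, the default limit, and the choice to break at a word boundary are my assumptions; the draft only says "ca. 100 characters".

```python
def truncate_citation(text: str, limit: int = 100, ellipsis: str = "…") -> str:
    """Shorten a cited passage for the list view; the full text stays available."""
    # Collapse line breaks and repeated spaces so the list stays compact.
    collapsed = " ".join(text.split())
    if len(collapsed) <= limit:
        return collapsed

    cut = collapsed[:limit]
    # Prefer to break at the last space, unless that would discard
    # more than half of the allowed length.
    head, sep, _ = cut.rpartition(" ")
    if sep and len(head) >= limit // 2:
        cut = head
    return cut + ellipsis
```

The expand/collapse behaviour would live in the user interface; this only produces the collapsed form.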

Submitting

  1. Either all reports go to the <user report handling group>, or the user can select where to send the report: to said group (WMF, I assume) / a local noticeboard / ArbCom / T&S / ...
  2. The visibility of the full report is selected: public / private (only the reporter and the recipient group can access it) / specific individuals or groups can access it (for marginal cases only). There is a short textbox for the reason of non-public reporting.
  3. The report is constantly saved as the user works on it (in the browser or server-side). The report is only accessible to the reporter, until submitted.
  4. Once submitted, it's accessible with a wiki link to the authorized users.
  5. The wiki link is generated by a per-project setting, similar to WP:User reports/<username>/<date>-<index> for the first accused (index optional). A redirect is generated for additional accused users.
  6. Involved parties are notified, if the report is visible to them.
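
As a sketch of step 5, the page title and the redirects for additional accused users could be generated as follows. The function names are hypothetical, and the ISO date format is an assumption, since the draft only gives the pattern WP:User reports/<username>/<date>-<index>.

```python
from datetime import date
from typing import Optional

DEFAULT_PREFIX = "WP:User reports"  # per-project setting in the draft


def report_title(username: str, filed_on: date, index: Optional[int] = None,
                 prefix: str = DEFAULT_PREFIX) -> str:
    """Build a title like '<prefix>/<username>/<date>-<index>' (index optional)."""
    title = f"{prefix}/{username}/{filed_on.isoformat()}"
    if index is not None:
        title += f"-{index}"
    return title


def report_pages(accused: list[str], filed_on: date, index: Optional[int] = None,
                 prefix: str = DEFAULT_PREFIX) -> tuple[str, dict[str, str]]:
    """Return the main report title and a map of redirect title -> main title."""
    if not accused:
        raise ValueError("a report needs at least one accused user")
    main = report_title(accused[0], filed_on, index, prefix)
    # Each additional accused user gets a redirect pointing at the main page.
    redirects = {
        report_title(user, filed_on, index, prefix): main
        for user in accused[1:]
    }
    return main, redirects
```

For example, report_pages(["UserA", "UserB"], date(2019, 8, 15), 1) would give the main page under UserA and one redirect under UserB.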

Evaluation

  1. Evaluation takes place on a partially or fully visible page. Visibility can be changed with the agreement of the reporter and evaluators. The accused can request such a change.
  2. The reporter, the accused and other involved users have their own thread to comment, similar to the arbitration process. This can be implemented with structured talk (if the copy-paste issue is fixed). Only the last comments should be shown below the report, limited to a fixed number of words, and 10 or so comments. The full, unlimited thread can be viewed by clicking a button.
  3. Evaluators add their findings in their own wikitext sections.
  4. Private evidence should be discussed in a dedicated thread, that's not visible publicly, unless all evidence is private.
  5. The report can be extended / amended by the reporter. Modifications must be marked clearly in the report by the tool, and the previous text be reachable with a click. The workflow of this is to be drafted.
  6. In a private case the accused is notified and invited to defend themself on a subpage of the report at a time chosen by the evaluators. Questions are presented on this page, the accused can reply in their own talk thread(s).
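
The preview described in step 2 could be sketched like this. I am assuming here that the word limit applies to each comment separately; the draft could also be read as a limit on the whole preview. The function name and the default of 50 words are made up for illustration.

```python
def preview_thread(comments: list[str], max_comments: int = 10,
                   max_words: int = 50) -> list[str]:
    """Return the most recent comments, each shortened to max_words words."""
    # Guard against max_comments == 0, because comments[-0:] is the whole list.
    recent = comments[-max_comments:] if max_comments > 0 else []
    preview = []
    for comment in recent:
        words = comment.split()
        if len(words) > max_words:
            # Mark shortened comments so readers know to open the full thread.
            preview.append(" ".join(words[:max_words]) + " [...]")
        else:
            preview.append(comment)
    return preview
```

The full thread would remain stored unchanged; only the view below the report is limited.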

Evaluation on classic noticeboards

  1. The report page is transcluded on the noticeboard. If this raises technical issues, then simply posted.
  2. There are no structured talk threads. The tool should generate a section for the reporter, one section for each accused, a section for findings by administrators, and a section for unstructured comments. In very simple cases these sections might be removed.
  3. On noticeboards only standard editing procedures are available.


This tool can be used for administrative actions as well. The administrator can prepare a report as described above, also selecting the actions to be taken. Upon submitting the report the user(s) are notified with the report transcluded to their talk page. This will properly inform the user of the exact problems, and the justification of the action taken. The user(s) should retain access to their own talk thread in the report for appeals, regardless of the admin actions taken.
— Aron M (talk) 21:41, 15 August 2019 (UTC)

  • It's a nice proposed system (and would be great to have, but for current usage), but even assuming it worked well 2 things come to mind: 1) I don't think a smoother system will make undesired (or worse) enacted super-rules any easier to tolerate; 2) A good chunk of the 400,000+ words on WP:FRAMBAN were against how a user (who hadn't broken the big 4) could be expected to defend themselves fairly without complete access to at least the evidence content (sans name etc) - the presence of the "private" option would be a massive no just on its own. Nosebagbear (talk) 22:02, 15 August 2019 (UTC)

Feedback about the process

I clicked the "Give feedback about the process" button, and landed here. My feedback is, that I would have contributed, had I known about this in a timely manner, but I only just found out about it now (from a June Signpost article, linked from a September Signpost; but I forget how I landed there, since I'm not a subscriber; possibly one of the Framgate fallout pages). This is not the only community discussion I've missed; possibly because I hang out on en-wiki more than meta. Mathglot (talk) 21:30, 1 October 2019 (UTC)

Tough to find meta discussions is an ongoing issue (the talk page consultations were a clear exception) - I and a few others attempt to add the key ones to wp:cent, but I'm sure we miss a fair few. From a WMF side I'd love them to add banners for all AC-users on all projects (even with loads of languages it would only be 20 words - you could do the biggest 50 easily), and locally (which for me is en-wiki) there was an idea back about a meta-newsletter (like tech's), which would be great. Perhaps an idea I should try and find some like-minded souls to help with Nosebagbear (talk) 20:56, 2 October 2019 (UTC)