WikiLoop/DoubleCheck/RfC:Levels for WikiLoop DoubleCheck Reviewers

Overview

Reviewing Wikipedia edits is an important activity for ensuring content quality, but it can also be abused or misused. Existing major review and counter-vandalism tools (e.g. STiki or Huggle) handle this tradeoff by restricting access to a small group of trusted and privileged users. WikiLoop DoubleCheck, an open source web app that makes reviewing edits easier, wants to explore an alternative model: just as everyone can edit Wikipedia, DoubleCheck would like to allow everyone to review Wikipedia, while having mechanisms to address concerns about reviewer competence and trustworthiness.

On this page we propose a design that gives reviewers at different levels different restrictions or powers.

Ideally we hope to complete the discussion by 2020-08-29 23:59, unless the community suggests a large-scale change to this proposal.

Design summary

We propose the following ladder of levels, with their restrictions or additional powers, for WikiLoop DoubleCheck.

A DoubleCheck reviewer's credibility level (their position on the ladder) is determined by their Wikipedia permissions, their number of DoubleCheck judgements, or the endorsements they have received, whichever is higher.

For example,

  • if a Wikipedian is a Wikipedia admin, but hasn't conducted any reviews on WikiLoop DoubleCheck, their level is 5, the top level.
  • if a Wikipedian is an auto-confirmed user, but has conducted 10K or more reviews on WikiLoop DoubleCheck, their level is 4.
  • if a Wikipedian is an extended-confirmed user, but has received public endorsements from two L4 reviewers, their level is also 4.
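
The "whichever is higher" rule can be illustrated with a minimal sketch. Everything named here is hypothetical: the group labels, the `Reviewer` type, and the contribution bars (the upper "such as" values from this proposal, which are still under discussion). Treating a block as overriding everything else is also an assumption; the proposal does not spell that case out.

```python
from dataclasses import dataclass

# Hypothetical labels for the Wikipedia user levels in this proposal.
PERMISSION_LEVEL = {
    "blocked": 0,
    "anonymous": 1,
    "autoconfirmed": 2,
    "extendedconfirmed": 3,
    "rollback": 4,
    "admin": 5,
}

# Placeholder bars only: the real numbers are to be decided in discussion.
CONTRIBUTION_BARS = [(100_000, 5), (10_000, 4), (1_000, 3), (100, 2), (10, 1)]


@dataclass
class Reviewer:
    wiki_group: str               # key into PERMISSION_LEVEL
    contributions: int = 0        # number of DoubleCheck judgements
    endorser_levels: tuple = ()   # levels of reviewers who endorsed this one


def level_from_contributions(count: int) -> int:
    """Highest level whose contribution bar has been reached."""
    for bar, level in CONTRIBUTION_BARS:
        if count >= bar:
            return level
    return 0


def level_from_endorsements(endorser_levels) -> int:
    """Two Level 4 endorsements or one Level 5 endorsement give Level 4."""
    from_l5 = sum(1 for lvl in endorser_levels if lvl == 5)
    from_l4 = sum(1 for lvl in endorser_levels if lvl == 4)
    return 4 if from_l5 >= 1 or from_l4 >= 2 else 0


def credibility_level(reviewer: Reviewer) -> int:
    """Take the highest of the three routes to a level."""
    if reviewer.wiki_group == "blocked":
        return 0  # assumption: a block overrides contributions and endorsements
    return max(
        PERMISSION_LEVEL[reviewer.wiki_group],
        level_from_contributions(reviewer.contributions),
        level_from_endorsements(reviewer.endorser_levels),
    )
```

With these placeholder bars, the three examples above come out as levels 5, 4 and 4 respectively.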

The main rationale is as follows:

  • For reviewers with below-average credibility, we allow them to contribute reviews, but we do not give them any additional power to edit Wikipedia content or discussion pages.
  • For reviewers with average credibility, the tool hopes to make it easier to discover revisions worth reviewing, e.g. based on their topic interests or the probability of vandalism.
  • For more credible reviewers, such as rollback permission holders or admins, we want to give them at least the same power as alternative tools such as STiki and Huggle, and even more productivity in their patrol and administrative work if they choose to use it.

| Either reach Wikipedia User Level | Or has # of DoubleCheck contributions | To be imposed with Restrictions or allow using Powers |
|---|---|---|
| blocked IP or user | 0 | Level 0 |
| anonymous user | such as 5 or 10, to be discuss in Separate Discussion 1 | Level 1 |
| logged in user / auto-confirmed user | such as 50 or 100, to be discuss in Separate Discussion 1 | Level 2 |
| extended confirm user | such as 500 or 1K, to be discuss in Separate Discussion 1 | Level 3 |
| users in {rollback} group | such as 5K or 10K, to be discuss in Separate Discussion 1, or publicly endorsed by two Level 4 reviewers or one Level 5 reviewers | Level 4 |
| users in {admin, bureaucrats, global stewards} group | such as 50K or 100K, to be discuss in Separate Discussion 1 | Level 5 |

Here is a list of our current or planned restrictions / power features for different levels.

Levels #/ Restrictions or Powers: Level 0, Level 1, Level 2, Level 3, Level 4, Level 5

Restrictions

  • R1. Highlighted as contributions from blocked users
  • R2. Speed restricted
  • R3. Can not be endorsed

Powers

  • P1. URL-to-Undo: display a button taking the reviewer to the Wikipedia page before they manually undo bad the revision
  • P2. Generate WikiText based on reviewer-selected reasons for warnings RfPP like Twinkle
  • P3. Publicly endorse other WLDC reviewers
  • P4. Revert or Undo multiple revisions from WLDC like Huggle and STiki, if also having rollback permission in given wiki
  • P5. Issue reviewer-selected talk page warnings or RfPP directly like Twinkle
  • P6. Directly block users or protect page if also having admin permissions in given LANG of wiki
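
P4 and P6 carry two conditions at once: a DoubleCheck level and an on-wiki permission. A minimal sketch of that double check follows. The minimum levels (4 for P4, 5 for P6), the right labels `"rollback"` and `"admin"`, and the function name are all assumptions made for illustration, not the tool's actual implementation.

```python
# Assumed rules: each power needs a minimum DoubleCheck level AND a
# matching permission on the wiki where the action is taken.
POWER_RULES = {
    "P4": {"min_level": 4, "wiki_right": "rollback"},
    "P6": {"min_level": 5, "wiki_right": "admin"},
}


def can_use_power(power: str, level: int, wiki_rights: set) -> bool:
    """True only if both the level bar and the on-wiki right are satisfied."""
    rule = POWER_RULES[power]
    return level >= rule["min_level"] and rule["wiki_right"] in wiki_rights
```

Under this reading, a reviewer who reaches Level 4 purely through contributions or endorsements, without holding rollback on the given wiki, would not be offered P4: `can_use_power("P4", 4, set())` is `False`.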
Comparison 1

STiki requires vandal patrollers to have either 1,000 mainspace edits or the rollback right before they can start using it (see: Wikipedia:STiki#Using STiki), and it supports only EN Wiki.

Comparison 2

Huggle requires users to have rollback permission or to be on a global whitelist before they can start using it.

Request for comments

There are a few specific questions I want to seek community feedback on:

  • (1) Do the criteria that define each level make sense?
  • (2) Should we add or remove restrictions or powers for these levels?
  • (3) Do you have any concern with providing wiki admins features like BLOCK or PROTECT directly in the DoubleCheck interface? (Only those with adminship on a given wiki will be able to use them.)

Please leave your comments below. Feel free to use the templates {{support}}, {{oppose}}, and {{doubtful}}. Xinbenlv (talk) 02:27, 15 August 2020 (UTC)

  •   Support seems fine. --C1K98V (💬 ✒️ 📂) 03:10, 22 August 2020 (UTC)
  •   Support SuperGoose007 (talk) 03:15, 22 August 2020 (UTC)
    • After reading some of the comments,   Oppose. I am concerned that this structure would be too confusing and turn off new users while not convincing users of Huggle and STiki to switch over instead, going against DoubleCheck's premise of a review tool that anyone can edit. SuperGoose007 (talk) 02:02, 25 August 2020 (UTC)
  •   Support WikiLoop Huggle. No changes necessary. Can I Log In (talk) 03:25, 22 August 2020 (UTC)
    •   It is doubtful I find the "Or has # of DoubleCheck contributions" to be absurd. Someone is going to exploit Level 4 by autoclicking for 2 hours, and then autoclick revert; there you go: if a vandal goes unnoticed doing that, then you pretty much have a vandal with access to "huggle". User rights or endorsement, but not # of contribs. Can I Log In (talk) 17:10, 22 August 2020 (UTC)
@Can I Log In:, thanks for the comment. Yes, I expect some of our reviewers will think those bars are too high, and some will think otherwise. Yesterday most of the voices said the bars were "just fine" or "too high". I am happy that you now offer an opposite voice, that the bars are "too low". The proposal tries to resolve the concern about bars being too low by requiring a high number of contributions and also by imposing a speed restriction. That way, one will not be able to auto-click their way to a higher level in 2 hours. The rationale is that, if someone is willing to spend time hacking it, the current Wikipedia autoconfirmed-level model is extremely easy to exploit: open an account, wait for some arbitrary number of days (>4), then suddenly edit and revert their own user page; see this case[1]. Our hope is to (1) allow at least some people to gain the status without needing approval from someone in a permissioned or prestigious group, while also (2) being at least less vulnerable to exploits than Wikipedia and existing widely used tools, e.g. Twinkle. --The preceding unsigned comment was added by Xinbenlv (talk) 21:57, August 22, 2020
  •   Oppose tl/dr: (1) no (2) no (3) nope
I'm leading a team of developers for RedWarn, another counter-vandalism tool. Following a number of bad actors, we've been planning restrictions for the next version of RedWarn at redwarn.wmcloud.org. Our relevant drafted decisions here are:
  • We cannot assume that this is an editor's first and only counter-vandalism tool. Many of Wikipedia's docs encourage the use of Twinkle, so many users will be coming from it. If we use our tool as the sole way of noting a user's counter-vandalism experience, it's unlikely people will pick it up, growing tired of a restricted mode and being unable to do things. A simple way to work this out is to calculate the number of the user's reverts, which can be done by checking their contributions for edits made with Twinkle, the RedWarn user script, and other tools such as undo and rollback.
  • Creating a complex restrictions system will just make everything harder for us to work with and confusing for users. Either you are in a restricted mode, or you're free to go as you like, given your account has the permissions to use features on a Wiki (such as blocking).
My notes here are also that:
  • 10,000 and 100,000 (note: don't know why this 100k unlock is here for blocks, which non-admins can't even do) actions is an INSANELY high bar. This will really end up restricting actually constructive patrollers.
  • The fact that P4 and P5 are only Level 4+ is a huge drawback; editors might as well just use Huggle at that point, as WLDC doesn't actually give them any more incentive to use it at this point in time. In my opinion, this should be given as a Level 3 feature, considering the same can be done with Twinkle and even semi-automatically with RedWarn, both popular tools on the English Wikipedia.
Best wishes, Ed6767 (talk) 03:27, 22 August 2020 (UTC)

@Ed6767: Thank you for your detailed comment; I will take that feedback and digest it. I really appreciate it! I would like to address that the reason the edit bars are high is that they mean "even if you don't have endorsements from others, you can gain trust just by reviewing". We also provide an allowlist-like feature: any two endorsements from L4 (rollbackers) or one endorsement from L5 (admins) will give the user L3 power. This is equivalent to the Huggle/STiki whitelist model, except that to get whitelisted for Huggle/STiki one needs approval from its developer or a given group, whereas WikiLoop DoubleCheck's model recognizes endorsements from already trusted people to establish such an allowlist, and is hence more decentralized and less controlled by WikiLoop DoubleCheck. Xinbenlv (talk) 05:39, 22 August 2020 (UTC)


  •   Support Tipeditor Looks good to me!
  •   Oppose per Ed6767. The requirements are too strict and discourage constructive contributions. Maybe 500 and 5,000 edits would be better? (note: on enwiki extended confirmed is 500 edits, that's the most permission you can get automatically based on edit count.) Buidhe (talk) 04:07, 22 August 2020 (UTC)
  •   Oppose I think the idea of 500 and maybe 2,500 makes a more attainable incentive than 1K and 10K. - AppleBsTime (talk) 04:13, 22 August 2020 (UTC)
  •   Support but doubtful of some "bars". The Level 4 and Level 5 requirement to do things as simple as leaving a message on the talk page (which can be done manually by anybody, and also through Twinkle) seems excessively high. MrConorAE (talk) 04:41, 22 August 2020 (UTC)
@MrConorAE:, @Ed6767:, @Buidhe:, thank you for your early feedback! I previously thought the community generally worried that people would abuse review tools because they are too powerful, which is why STiki, Huggle and many others require rollback permission to even begin using them. I am happily surprised that you would instead like to allow more people to use them; that's exactly what I like! How would you feel if we separated the discussion of "what number of revisions is needed to do certain things" from the rest of the proposal? We could make the general proposal more generic in wording, and open a separate section to discuss the number bars. How do you like it? Xinbenlv (talk) 05:24, 22 August 2020 (UTC)
  •   Weak oppose I feel there is definitely a need for a system to prevent abuse of the tool, but as Ed6767 brought up, the system is overly complex and restricts and deters many users from using the tool. — Yours, Berrely • TalkContribs 08:45, 22 August 2020 (UTC)
  •   Weak oppose With WikiLoop DoubleCheck, users like me who have little time to edit whole articles can contribute by crowdsourcing their time. I worry that restrictions on editors with fewer edits (for example, I made only 300 edits in 11 years) limit users with little time. I hope I am a rare example. --Paperworkorange (talk) 10:32, 22 August 2020 (UTC)
  •   Oppose I primarily use Twinkle for counter vandalism work, while having AWB access for programmatic edits on enwiki. Auto-confirmed users (on enwiki, and L2 as defined here) can use twinkle to do the tasks that are defined for L3 and L4. Rather than to restrict access in such manner, I feel that the access to the tool should remain as open as possible (at least similarly to twinkle), and that any abusive use of the tool in any particular lang wiki to be dealt with as with how existing abusive users have been dealt with so far in there. e.g. would-be users of such tools are slapped with a responsibility notice when visiting the documentation page of the tool that warn users of being potentially blocked for abusive use, and abusive users are being blocked through enwiki's existing administrative system. If individual wiki's community have concerns over new reviewers' competence and trustworthiness, why not offer the ability to have a whitelist and/or blacklist (i.e. like AWB's checkpage)? Robertsky (talk) 11:04, 22 August 2020 (UTC)
  •   Support per MrConorAE. P,TO 19104 (talk) 14:52, 22 August 2020 (UTC)
  •   Support It looks fine to me. Maybe it will be convenient to reduce the "bars", as proposed. Alexcalamaro (talk) 17:24, 22 August 2020 (UTC)
    •   It is doubtful One question: to perform P4 it states "if also having rollback permission", so you'll already be in Level 4. So maybe you will have Level 4 by # of contributions but you wouldn't be able to perform P4. Same for P6 and admin: it states "if also having admin permissions". So Level 5 will not be enough to perform P6. Am I right? Alexcalamaro (talk) 18:07, 22 August 2020 (UTC)
  •   Oppose Restricting autoconfirmed users from being endorsed seems completely arbitrary. Levels 2 and 3 should be collapsed. --Mathnerd314159 (talk) 17:52, 22 August 2020 (UTC)
  • Much simpler permission structure, from scratch:
    • Blocked users: Highlighting contributions from blocked users is helpful, and blocked users should be blocked from the tool. They are unable to revert or post talk-comments anyway, and they would be likely to make poor quality or even malicious reviews.
      • Recently a partial-block system was introduced. It would probably be difficult to try to deal with this inside the tool, and I expect it would be extremely rare for it to be relevant in the tool. You can probably ignore partial blocks, relying on the wiki to report back any attempt to make a blocked undo or blocked talk edit.
    • Admin and rollback: These should be based on the wiki userrights, period. The tool may display the relevant buttons if the user has the userright.
    • Autoconfirmed: Full use of the tool, including links for undo and talk. Undo and talk are not considered restricted actions (unless a user is actively blocked), and I think the concern is people trying to do these things without having any knowledge or familiarity with the native wiki system. Twinkle is usable by autoconfirmed users. (Edit: The rate limit thing is probably a good idea.)
    • Regarding IP users or not-yet-autoconfirmed users: While they are able to identify vandalism, users who don't know policies&guidelines even exist can't really review whether an edit is appropriate. And as noted above they probably shouldn't be using undo and talk through the tool without familiarity with the native system. I think it's simpler and makes more sense to just activate the entire tool at autoconfirmed. Alsee (talk) 03:16, 24 August 2020 (UTC)
  •   Weak oppose Per Ed6767 above, plus some additional comments. What I'd like to see is some global permissioning structure encompassing all anti-vandalism tools. It doesn't make sense for the requirements to be extremely different for each tool. I don't know what it would take to achieve the following but I think it would be great to have a "basic anti-vandal" user right (maybe rollback already satisfies this?) which grants you basic access to all anti-vandalism tools, and then if absolutely necessary, one separate user right per individual tool that gets granted if you somehow demonstrate you've completed a quick training (could be as simple as just watching a 5 or 10 minute tutorial video for the specific tool) or get grandfathered in if you've already used the tool a certain number of times. Paradoxsociety (talk) 17:07, 24 August 2020 (UTC)
  •   Comment I received a request for comments on this proposal but I have never heard of DoubleCheck and have no idea (nor can I find) how to use it. I'd appreciate if someone would leave a message on w:UserTalk:Deisenbe with instructions. Thank you. Deisenbe (talk) 10:46, 30 August 2020 (UTC)

Separate Discussion 1: Number of Contributions needed for each Level

If we allow reviewers to gain power without needing to be on a given allowlist, or without endorsements from anyone, what numbers are appropriate to be set as bars? Xinbenlv (talk) 05:43, 22 August 2020 (UTC)

  • Imo the community should set the starting limit of the bar at min 1K or max 3K edit count, rest no changes needed. --C1K98V (💬 ✒️ 📂) 11:35, 22 August 2020 (UTC)
  • x/0 It's potentially exploitable. See my comment. Can I Log In (talk) 17:18, 22 August 2020 (UTC)
Responded as Special:Diff/20388160 Xinbenlv (talk) 22:00, 22 August 2020 (UTC)
What is even more absurd, even though there is no difference in the functionality, is Level 5. Non-administrator in enwiki, reach X DoubleCheck contributions, and you have P6, but you don't have the user rights to do so. What's the point? Except to endorse a Level 3 "twice" as if you were a Level 4, all I can say is "bruh". Can I Log In (talk) 22:43, 22 August 2020 (UTC)

Questions

  • What is a DoubleCheck judgement?
Judgement refers to the opinion a reviewer gives on a Wikipedia edit (aka revision). It currently has 3 possible values:
1. ShouldRevert (means Damaging)
2. NotSure (means neutral or whether damaging or not is Not immediately obvious), and
3. LooksGood (means Not damaging, or in some tools being called Innocent)
  • What is a DoubleCheck endorsement?
If a Wikipedia editor A thinks Wikipedia editor B is a trusted editor, A can give B an endorsement on WikiLoop DoubleCheck. When A is a highly trusted editor, B's perceived trust level in WikiLoop DoubleCheck gets bumped up as well. This feature has not been implemented yet.
  • What is a DoubleCheck review?
The process of reading a Wikipedia edit and giving a judgement.
  • What is a DoubleCheck contribution?
The DoubleCheck contribution count currently only counts how many judgements a reviewer has given, i.e. how many reviews the reviewer has done. However, in the future we might consider counting other forms of action as contributions too, such as "taking follow-up actions" (issuing warnings, reverting revisions, rolling back multiple revisions) or "reviewing other people's judgements". So we use the more general term contribution. For now, think of it as counting judgements.
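
The relationship between these terms can be sketched in a few lines. This is only an illustration under stated assumptions: the three judgement values come from the answer above, while the `Judgement` enum, the `(reviewer, revision_id, judgement)` tuple shape, and the `count_contributions` helper are hypothetical and not taken from the DoubleCheck codebase.

```python
from enum import Enum


class Judgement(Enum):
    """The three judgement values a reviewer can currently give."""
    SHOULD_REVERT = "ShouldRevert"  # damaging
    NOT_SURE = "NotSure"            # neutral, or not immediately obvious
    LOOKS_GOOD = "LooksGood"        # not damaging


def count_contributions(reviews, reviewer: str) -> int:
    """Count a reviewer's contributions as their number of judgements.

    `reviews` is an iterable of (reviewer, revision_id, judgement) tuples,
    a hypothetical shape chosen for this sketch.
    """
    return sum(1 for name, _revision_id, _judgement in reviews if name == reviewer)


# Made-up sample data: each review is one judgement on one revision.
sample_reviews = [
    ("Alice", 1001, Judgement.SHOULD_REVERT),
    ("Alice", 1002, Judgement.LOOKS_GOOD),
    ("Bob", 1001, Judgement.NOT_SURE),
]
```

If other kinds of action are counted as contributions later, only the counting helper would need to change, which is one reason to keep the more general term.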

Mark D Worthen PsyD (talk) 18:07, 22 August 2020 (UTC)

Thank you @Markworthen: for asking these clarifying questions. It helps me realize these terms are used without explanation. Xinbenlv (talk) 18:26, 22 August 2020 (UTC)

Suggestions

First, I know only one language fluently, therefore please take my suggestions in that light, i.e., I respect others who know two or more languages. Suggestion #1: Ask a colleague who knows your native tongue and English well to proofread the English version for proper grammar, usage, syntax, etc. Only minor improvements are needed, e.g., "P1. URL-to-Undo: display a button taking the reviewer to the Wikipedia page before they manually undo bad the revision" (I am not sure what that sentence means), and "such as 5 or 10, to be discuss in Separate Discussion 1" ("discuss" should be "discussed").

Thank you, I will take your suggestion in the future by asking my native-speaking friends to proofread for me. Some of the difficulty in understanding my writing might also be due to the new terms and concepts I am trying to introduce, and I will try to explain them better. Xinbenlv (talk) 05:11, 24 August 2020 (UTC)

Suggestion #2: Emphasize the either/or nature of the first two columns in the first table, e.g., Either reach Wikipedia User Level Or has # of DoubleCheck contributions. Mark D Worthen PsyD (talk) 18:07, 22 August 2020 (UTC)

Good idea, will do! Xinbenlv (talk) 05:11, 24 August 2020 (UTC)

It's important for the tool to prominently display whether there are unread Notifications or unread Talk messages for the user. If the user's edits are getting reverted, or if someone is leaving messages on their user talk, it's important for the user to be aware so they can defuse any potential conflict. If the user is unaware and they keep making more edits the situation may escalate badly, possibly resulting in anger between editors or even resulting in a block. Alsee (talk) 03:47, 24 August 2020 (UTC)

Thank you, that's a very good idea. It has been filed as a feature request and is tracked as issue #350. We plan on adding these features. Please stay tuned! Xinbenlv (talk) 05:11, 24 August 2020 (UTC)