Community Wishlist Survey 2022/Larger suggestions/Automatic vandalism/spam detection and revert in more wikis

Automatic vandalism/spam detection and revert in more wikis

  • Problem: Vandalism and spam/promotion are a serious problem on Wikipedia. Fighting them is tedious and boring, and sometimes makes the patroller a target of harassment. This is already damaging the image of Wikipedia (e.g. Germany's John Oliver did a full story about how spam and promotion happen and go unnoticed: https://www.youtube.com/watch?v=FNsTaKwyAzI). As pointed out in the story, the number of articles to patrol has doubled in the past decade while the number of volunteers has shrunk to half. WMF hasn't done anything to improve the workflow of patrollers in the past five years (the last things I remember are RCFilters, then PageTriage, but that was only on English Wikipedia). Some wikis have tried drastic measures such as banning all IP editing, which goes against the wiki's guiding principles and harms the wiki in the long term.
  • Proposed solution: Expand ClueBot NG to more wikis or build a similar one.
  • Who would benefit: Editors will be less burdened by the firehose of vandalism, spam, and promotion. Readers will enjoy a higher-quality Wikipedia.
  • More comments: Note that ORES is not designed to be used for automatic reverts. It's good at reducing the pool of edits to review (a little, not much, because ORES relies heavily on whether the user is logged in), but it's still not accurate enough for automatic reverts (I have tried it several times on two different wikis). I also don't mind any other proposal that improves the lives of patrollers.
  • Phabricator tickets:
  • Proposer: Amir (talk) 12:57, 15 January 2022 (UTC)
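
To illustrate the distinction drawn under "More comments" between narrowing the review pool and reverting automatically, here is a minimal Python sketch. Everything in it is hypothetical: the `ScoredEdit` type, the `route` function, and both thresholds are made up for illustration and are not taken from ORES or ClueBot NG.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration. A real deployment
# would have to tune them per wiki against labelled edits.
REVERT_THRESHOLD = 0.98  # auto-revert only when the model is nearly certain
REVIEW_THRESHOLD = 0.50  # below this, the edit is left alone


@dataclass
class ScoredEdit:
    rev_id: int
    damaging_probability: float  # output of some classifier, 0.0 to 1.0


def route(edit: ScoredEdit) -> str:
    """Return 'revert', 'review', or 'ignore' for a scored edit."""
    if edit.damaging_probability >= REVERT_THRESHOLD:
        return "revert"
    if edit.damaging_probability >= REVIEW_THRESHOLD:
        return "review"
    return "ignore"


# Three made-up edits with made-up scores.
edits = [ScoredEdit(1, 0.99), ScoredEdit(2, 0.80), ScoredEdit(3, 0.10)]
decisions = {e.rev_id: route(e) for e in edits}
print(decisions)
```

Under these assumptions, a model can be useful for the "review" band while almost never reaching the "revert" band with acceptable accuracy, which is one way to read the point that a score good enough for triage is not necessarily good enough for automatic reverts.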

Discussion

  • CB NG is old and I don't know how well maintained it is, but regardless it would need a new training set for each language... which I suggested be pulled from ORES, but if ORES isn't good enough, I'm not really sure how well a new bot is going to do. :( --Izno (talk) 00:33, 17 January 2022 (UTC)
    @Izno, this is a rather recent discussion about what you're mentioning, apart from ORES. And this is a discussion started in regard to ORES. ORES and CB look to be, unfortunately, dusty. :/ Klein Muçi (talk) 01:35, 29 January 2022 (UTC)
    As for language training sets: languages are complex, and spam-related words differ in each language. For a global ClueBot NG to happen, we would need to reform SRG into a page like WP:AIV, which is served by a bot and works as a queue (no archiving at all, as on WP:AIV, plus a dedicated page for bot reports). Some discussions on SRG are lengthy, which creates a backlog for stewards. Larger discussions would still exist, on a page like Long-term abuse (shortcut: LTA) or in other sections of this page, and this would narrow the area needing immediate action from stewards. Thingofme (talk) 14:18, 5 February 2022 (UTC)
  • Indeed, if ORES isn't good enough, there's nothing Community Tech can build in a reasonable amount of time that will be better. I'm moving this to our Larger suggestions category instead of archiving so this proposal gets the attention it deserves. Thanks, MusikAnimal (WMF) (talk) 22:03, 24 January 2022 (UTC)
  • Besides the well-known en:User:ClueBot NG on the English Wikipedia, there's also User:PSS 9, which does anti-vandalism work on the Bulgarian Wikipedia. Are there similar bots on other wikis? Uanfala (talk) 22:58, 9 February 2022 (UTC)
Ah, yes, there was also an email a few months ago in wikitech-ambassadors@ from User:Samwalton9 (WMF) who was researching such anti-vandalism bots. Sadly, I never got to reply to it.
As for the proposal itself, I really like the idea, but IMO it might be prohibitively difficult to accomplish, especially as a universal solution for all projects. PSS 9 has shown some surprising effectiveness in its 5 years of operation, but it's mostly an expert system that took literally years to fine-tune to the specific vandalism on bgwiki. It would fail miserably on any other project, and it does sometimes fail spectacularly even on bgwiki, like when 3 years ago it blocked a couple of global sysops (true story).
The bot was developed in response to one group of particularly zestful, resourceful, and cunning vandals, which was a great opportunity to learn about how vandalism works. Simply looking at the content of the edits soon proved to be inadequate, so the bot began correlating different variables and tracking behaviour patterns reaching as far as the choice of usernames. Even speaking of edit content alone, training can be a daunting task—human resourcefulness should never be underestimated.
Despite all the hype, AI can indeed help tremendously: I had disabled PSS 9 several times in the past because of its problems, only to have the community ask each time to have it brought back. But AI also needs a tremendous amount of work, not just in bringing it up to the task, but also in keeping it relevant as the bad actors raise the bar. TL;DR: I'm definitely not saying that the goal isn't worth pursuing, but the huge amount of resources needed must not be underestimated.
-- Luchesar • T/C 14:22, 11 February 2022 (UTC)

Voting