Community Wishlist Survey 2019/Admins and patrollers/Create an integrated anti-spam/vandalism tool

  • Problem: Our current infrastructure for countering vandalism and spam at the cross-wiki level is stuck in 2004. We have a spamblacklist with poor logging done on a per-wiki basis, a global abusefilter which isn't global, and a title blacklist that doesn't log at all.
  • Who would benefit: Stewards
  • Proposed solution: Create an integrated anti-spam/vandalism tool (like Phalanx used on Wikia) that combines the functions of the spam and title blacklists, as well as limited abusefilter functionality, to better respond to ongoing spam and vandalism at the global/cross-wiki level.
  • Phabricator tickets:


Hi Ajraddatz. I wonder if some of the needs of this proposal have been met by the recent improvements made to Recent Changes and Watchlist feeds? -- NKohli (WMF) (talk) 18:37, 30 October 2018 (UTC)

Unfortunately not - what I'm thinking of is cross-wiki in scope and blocks edits before they happen, rather than reacting to edits that have gone through. – Ajraddatz (talk) 18:40, 30 October 2018 (UTC)
  • This seems relatively minor, because the individual projects already have these tools. DGG (talk) 01:09, 4 November 2018 (UTC)
    First of all, it is absolutely worth investing in infrastructure that could work on 700 projects instead of repeating the same action 700 times. But we also have the spam and title blacklist globally. The issue is that they are both old extensions that aren't very functional - see this other proposal for more information. – Ajraddatz (talk) 03:31, 4 November 2018 (UTC)
Hi! We discussed this proposal in our team meeting today. We are not sure how much we will be able to do but we'll try to do our best. It will probably not be a big cross-wiki thing though. Doing cross-wiki projects is difficult with MediaWiki's current architecture. We will scope this project and come up with what we can do if it is in the top 10. Thanks. -- NKohli (WMF) (talk) 17:50, 13 November 2018 (UTC)
Thanks! You're in luck, because I doubt this proposal will be in the top 10. Because of the limitations in using the current extension, only a small handful of people even work with it, and this topic isn't glamorous enough to gather attention from beyond the handful of people who would be impacted by a change. That said, it's still something that is important to have eventually, so I hope this puts it on the map. – Ajraddatz (talk) 18:40, 13 November 2018 (UTC)

I agree anti-abuse tools could use more love. This proposal, on the other hand, could use more details. Are you specifically worried about logging? The difficulties local communities have in interacting with global tools? Are there specific features of Phalanx that you miss? What's wrong with global abuse filters? --Tgr (talk) 04:15, 25 November 2018 (UTC)

Fair point. I'll give a walk-through of the current state and point out the big problems that could be fixed. I'll also preface this by saying that integrating the elements into one tool is more for convenience; the problems are with each specific tool, and could be improved separately instead.
Spam blacklist: I notice a link being spammed by a couple of bots, so I want to add it to the spam blacklist. My workflow involves opening the page, waiting a few seconds for it to load, painfully scrolling down the list and trying to find a good place to add the regex. The page is so large that it lags going through it. Once I find the right place, I need to figure out what the correct regex is - I can speak regex-1, so simple additions are no problem, but I need to ask someone else to add anything more complex. After adding the text and waiting another 10-20 seconds for the page to save, I then need to manually add an entry on the spam blacklist log, since there is no automatic logging of additions. This takes another 10-20 seconds to get the diff number and justify the addition. Once I've made the addition, I have no ability to follow up and see what it is blocking because the logging is done on a per-wiki basis and I don't have the time to check all 700 wikis. Total time: 1-2 minutes, when the rest of my anti-abuse workflow takes 10-20 seconds total. Big areas to improve: 1. change it from a big page to an extension that allows each entry to be handled individually. Imagine account blocking if you had to add the name of the account to a big page with thousands of other names. 2. automatic logging. 3. some system where you can see the impact of the action you just took - subsequent attempted uses of the blocked link.
Title blacklist: I notice an account name has been abused across multiple wikis, so I want to block the name. My workflow is a bit easier because the title blacklist is smaller than the spam blacklist, so it only lags a little bit. I add the appropriate regex to the appropriate section and create a log entry. Total time: 1 minute, still much slower than the rest of my workflow. There is no per-wiki or cross-wiki logging to see what impact my entry has had. Areas to improve: same as spam blacklist, individual entry handling, automatic logging, a way of auditing the actions blocked by the addition.
Global AbuseFilter: some cross-wiki vandal is doing a specific type of vandalism across multiple wikis. I create a filter to prevent such actions (this is already pretty complex, and could use some serious simplification for the less technically-minded among us), but it only applies to the small wikis that the global abusefilter is enabled on. The vandal continues to hit large wikis, forcing me to either contact local admins to get them to duplicate the global filter locally or (and this is what I usually end up doing) ignore the problem because I don't have half an hour to follow up on this. I also don't necessarily need a whole abusefilter: a simple condition that could be done through Phalanx or SpamRegex (old extension) would have sufficed. Areas to improve: make the global abusefilter global, add a lower tier of abuse prevention through the integrated tool.
And of course this is just a start of the laundry list of unaddressed problems with global anti-abuse tools. Global accounts cannot be blocked, requiring me to checkuser almost every account I lock so I can block the underlying IP as well. Abuse from developing countries tends to be on mobile ranges or from other public IP ranges, so there are often situations where I cannot place any IP blocks due to the potential collateral damage - the problem here being no other options to block people other than using IPs and account names. Many stewards also just block anyway, leading to the literal hundreds of unaddressed requests for unblock in our email queue from people caught in massive global rangeblocks. But this area seems like one where existing extensions (spamregex, phalanx) could be used as a starting point to make some easy fixes. – Ajraddatz (talk) 23:24, 25 November 2018 (UTC)
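For illustration only, the three improvements asked for above (individual entry handling, automatic logging, and a way to audit what an entry has blocked) could be sketched roughly as follows. This is a toy in-memory model, not how SpamBlacklist, TitleBlacklist, Phalanx or any other MediaWiki extension actually works; the names `Entry`, `Blocklist`, `add`, `check` and `hits_for`, and the example domain and usernames, are all made up for this sketch.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Entry:
    """One blocked pattern, handled on its own instead of as a line in a big page."""
    entry_id: int
    pattern: str
    reason: str
    added_by: str
    regex: re.Pattern = field(init=False, repr=False)

    def __post_init__(self):
        # Compile once so every later check reuses the same pattern object.
        self.regex = re.compile(self.pattern, re.IGNORECASE)


class Blocklist:
    """Hypothetical cross-wiki blocklist with automatic logging and hit auditing."""

    def __init__(self):
        self.entries = {}   # entry_id -> Entry
        self.log = []       # written automatically whenever an entry is added
        self.hits = []      # blocked attempts, recorded for every wiki in one place
        self._next_id = 1

    def add(self, pattern, reason, added_by):
        """Add a single entry; the log record is created without a manual step."""
        entry = Entry(self._next_id, pattern, reason, added_by)
        self.entries[entry.entry_id] = entry
        self._next_id += 1
        self.log.append((datetime.now(timezone.utc), "add", entry.entry_id, added_by, reason))
        return entry.entry_id

    def check(self, wiki, user, text):
        """Return the id of the first matching entry (the edit is blocked), else None."""
        for entry in self.entries.values():
            if entry.regex.search(text):
                self.hits.append((datetime.now(timezone.utc), entry.entry_id, wiki, user))
                return entry.entry_id
        return None

    def hits_for(self, entry_id):
        """Audit view: which wiki/user pairs tripped a given entry."""
        return [(wiki, user) for _, eid, wiki, user in self.hits if eid == entry_id]


blocklist = Blocklist()
entry_id = blocklist.add(r"spam-example\.invalid", "link spammed by bots", "ExampleSteward")
blocklist.check("enwiki", "Bot1", "buy now at http://spam-example.invalid/x")
blocklist.check("dewiki", "Bot2", "a harmless edit")
print(blocklist.hits_for(entry_id))  # [('enwiki', 'Bot1')]
```

The point of the sketch is only that adding an entry, logging it, and reviewing its hits become one step each, rather than several manual page edits followed by per-wiki log checks.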
  • Comment. This proposal is probably too ambitious for the Community Wishlist. I doubt that this small team of developers can create such a tool within a year. Ruslik (talk) 18:04, 30 November 2018 (UTC)