Grants:Project/Daimona Eaytoy/AbuseFilter overhaul


status: selected
AbuseFilter overhaul
summary: Overhauling, modernizing and empowering a Wikimedia-deployed extension
target: All WMF wikis, approx. 3000 third party wikis[1]
type of grant: tools and software
amount: 24 000 USD (in local currencies, see #Budget details)
advisor: Huji, MusikAnimal
volunteer: Huji
created on: 15:34, 17 February 2020 (UTC)


Project idea

What is the problem you're trying to solve?

What problem are you trying to solve by doing this project? This problem should be small enough that you expect it to be completely or mostly resolved by the end of this project. Remember to review the tutorial for tips on how to answer this question.

The AbuseFilter (aka "Edit filter") [2][3] is an extension used on all WMF wikis to combat several types of harmful editing. It is the primary resource for fighting specific kinds of abuse, like spambots or blatant vandalism. The extension was written and deployed in 2008, and not much progress has been made in the last 10 years. The code is brittle and outdated and has been mostly unmaintained for years. This translates into a high maintenance cost for WMF staff and volunteers, and general discontent among end users. Users have complained many times that the AbuseFilter is buggy and insufficiently helpful, and that not enough developer time is spent on it, and they're not wrong. Yet the AbuseFilter keeps playing a vital role in fighting vandalism, as many users have said on multiple occasions.[4][5][6] As a volunteer, I (Daimona) have already spent some time fixing various issues. However, the technical debt is too high and steadily increasing, and we cannot catch up within our volunteer capacity.

No other volunteer or team is planning to engage in such significant changes. In this regard, it should be noted that the Core Platform Team owns code stewardship for AbuseFilter. This means that they support and guide other people working on this code[7], but they are not the maintainers. Also, working on the AbuseFilter is not within the team's roadmap or initiatives. As a result, the maintenance of AbuseFilter is basically up to volunteers.

What is your solution to this problem?

For the problem you identified in the previous section, briefly describe how you would like to address this problem. We recognize that there are many ways to solve a problem. We’d like to understand why you chose this particular solution, and why you think it is worth pursuing. Remember to review the tutorial for tips on how to answer this question.

We will work on modernizing the code and making it more robust. We will improve the code architecture to make it more maintainable and sustainable. We will address several bugs and empower the extension by adding a long-requested feature (shared variables).

We believe that this is the best approach, because AbuseFilter is of great value to the community and deserves to be improved. Overhauling the code structure will also enable developers to add new features more quickly. This is one of the reasons why we are so focused on technical improvements.

Note that we will have clearly defined goals for this project (described below), and this is not meant to be never-ending maintenance work.

Project goals

What are your goals for this project? Your goals should describe the top two or three benefits that will come out of your project. These should be benefits to the Wikimedia projects or Wikimedia communities. They should not be benefits to you individually. Remember to review the tutorial for tips on how to answer this question.

  1. On the developer side, the main goal is to turn AbuseFilter into an easily maintainable extension. We will reduce the need for maintenance, and make it easier for other contributors to work on the code. This will also remove the need for further grants.
  2. On the user side, the extension will malfunction less often. It will provide a new feature to help users write more powerful abuse filters.

Project impact

How will you know if you have met your goals?

For each of your goals, we’d like you to answer the following questions:

  1. During your project, what will you do to achieve this goal? (These are your outputs.)
  2. Once your project is over, how will it continue to positively impact the Wikimedia community or projects? (These are your outcomes.)

For each of your answers, think about how you will capture this information. Will you capture it with a survey? With a story? Will you measure it with a number? Remember, if you plan to measure a number, you will need to set a numeric target in your proposal (i.e. 45 people, 10 articles, 100 scanned documents). Remember to review the tutorial for tips on how to answer this question.

 
Figure: an approximate diagram of the structure of the AbuseFilter code as of Feb 10th. Every fuchsia arrow represents a dependency that shouldn't be there. Does it look like a big mess? It is.
Technical goal

We will commit to writing clean and well-tested code; we will measure our progress through the following metrics:

  • Test coverage (see current) will be accurately measured, and it will reach 50%. While not perfect, we believe this target is good enough and achievable within this grant. It should also be noted that test coverage is an imperfect metric: 100% coverage does not mean that the application is bug-free. We're also keeping the Pareto principle in mind: roughly 20% of the code causes 80% of the bugs. Hence, we will focus on covering the hottest parts of the code, using metrics such as CRAP.[8][9]
  • All the extension code will be namespaced, according to best practices. This will also give us more accurate metrics.
  • Code coupling will be reduced. Specifically, we will refactor all the static methods in the main AbuseFilter class, hence eliminating all cyclic dependencies involving that class.[10]

As mentioned above, this will leave the code in a healthy state, hence facilitating future maintenance and feature additions.
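To illustrate how a metric like CRAP can point at the "hottest" code, here is a minimal sketch. It assumes the commonly cited formula CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m), where comp is the cyclomatic complexity of a method and cov its test coverage in percent. The method names and numbers below are invented for illustration and are not measurements of the AbuseFilter code; Python is used only for brevity, since the extension itself is written in PHP.

```python
def crap_score(complexity: int, coverage_percent: float) -> float:
    """Return the CRAP score for one method.

    Assumed formula: comp^2 * (1 - cov/100)^3 + comp.
    """
    if complexity < 1:
        raise ValueError("cyclomatic complexity must be at least 1")
    if not 0 <= coverage_percent <= 100:
        raise ValueError("coverage must be between 0 and 100")
    uncovered = 1 - coverage_percent / 100
    return complexity ** 2 * uncovered ** 3 + complexity


# Hypothetical methods: (name, cyclomatic complexity, coverage in percent).
hypothetical_methods = [
    ("hypothetical_parser_method", 30, 20.0),
    ("hypothetical_getter", 1, 0.0),
    ("hypothetical_well_tested_method", 30, 100.0),
]

# Rank methods so that complex, poorly covered code comes first.
ranked = sorted(
    ((name, crap_score(comp, cov)) for name, comp, cov in hypothetical_methods),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score:.1f}")
```

A fully covered method scores exactly its complexity, while an untested trivial getter still scores low, so a ranking of this kind favours writing tests for complex, untested code first.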

User-facing goal
  • 10 important bugs will be resolved
  • A new feature will be implemented: shared variables (T120740)[11]

Do you have any goals around participation or content?

Are any of your goals related to increasing participation within the Wikimedia movement, or increasing/improving the content on Wikimedia projects? If so, we ask that you look through these three metrics, and include any that are relevant to your project. Please set a numeric target against the metrics, if applicable.


The shared metrics do not apply to this project.

Project plan

Activities

Tell us how you'll carry out your project. What will you and other organizers spend your time doing? What will you have done at the end of your project? How will you follow-up with people that are involved with your project?

The project will need careful planning, which we will do before starting the development work. We will work on one thing at a time, with one of us writing code and the other reviewing it. The following is a possible plan.

First of all, we will examine the Phabricator workboard, triage tasks, and use story points to distribute our work. We might create a dedicated workboard on Phabricator for better organization. This is when we'll outline the final progress plan.

Then, we'll start with the purely technical part. We will evaluate existing unit tests, to ensure that they make sense, and add new tests to cover risky areas of the code. This will make us more confident when touching "hot" code. From then on, we'll adopt a test-driven approach to ensure stability and keep the code well tested. We will review and rewrite the overall architecture following the best practices, both MediaWiki-specific[12] and generic, like SOLID principles. We'll namespace all the code (existing and new). The progress will be regularly monitored by using the metrics above, as well as external tools like phpda and codeclimate. At the end of this phase, we will end up with a healthy codebase.
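As a rough illustration of why moving away from static methods helps testing, the sketch below contrasts nothing from the real codebase: all class and method names are hypothetical, and Python stands in for PHP purely for brevity. The idea is that a class which receives its collaborators through its constructor can be unit-tested with a small in-memory substitute, without a database or global state.

```python
from __future__ import annotations

from typing import Protocol


class FilterStore(Protocol):
    """Hypothetical interface for whatever supplies the filter patterns."""

    def load_patterns(self) -> list[str]: ...


class InMemoryFilterStore:
    """Test double: keeps patterns in memory instead of a database."""

    def __init__(self, patterns: list[str]) -> None:
        self._patterns = list(patterns)

    def load_patterns(self) -> list[str]:
        return list(self._patterns)


class FilterRunner:
    """Hypothetical service that depends on an injected store, not on globals."""

    def __init__(self, store: FilterStore) -> None:
        self._store = store

    def matches(self, text: str) -> list[str]:
        # Return every stored pattern that occurs in the text, in store order.
        return [p for p in self._store.load_patterns() if p in text]


# A unit test can now build the runner with a fake store.
runner = FilterRunner(InMemoryFilterStore(["buy now", "spam"]))
print(runner.matches("buy now, cheap spam"))
```

Because `FilterRunner` only knows about the `FilterStore` interface, the dependency points in one direction, which is the kind of decoupling the architecture review aims for.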

At this point, we'll start addressing bugs. We will prioritize the bugs that cause the most trouble for users. Given our time distribution, we'll prefer small but critical bugs. We commit to fixing 10 of these. Finally, we will implement shared variables, a new feature requested by the community.[11]

Throughout the whole process, we'll provide guidance to other contributors and help them comply with the new coding standards.

In theory, implementing the new feature won't need any community feedback, since it was already provided in the past. If, however, we find ourselves uncertain about specific implementation details, we won't hesitate to ask for community feedback. This would mainly happen on technical discussion pages, including but not limited to the edit filter noticeboard on enwiki. We will anticipate the need for feedback in advance and forward our requests as soon as possible, to avoid dead time and increase our productivity.

Budget

How you will use the funds you are requesting? List bullet points for each expense. (You can create a table later if needed.) Don’t forget to include a total amount, and update this amount in the Probox at the top of your page too!

The time dedicated to this project will be partitioned as below. The partitioning may vary slightly during the process, according to our needs. We will work part-time for a total of roughly 3 months. Here is an estimate, using a standard rate of 40 USD/h[13]:

What                                     Hours
Meta-development
  Setting up metrics measurement             5
  Task triage and prioritization            10
Pure development
  Writing tests for existing code          100
  Code architecture review                 200
  Code namespace-ization                    30
  Fixing open bugs                          50
  Implementing a new feature                70
Total
  Base hours                               465
  Rounded hours (*)                        600
  Overall cost                       24000 USD

(*) Software is, by nature, unpredictable. Unexpected needs must be taken into account. Following a general rule, the base hours are increased by roughly 30% and rounded. We prefer to set this expectation now and return any unused funds, instead of relying on a reserve budget or risking incomplete goals.

Daimona Eaytoy will work for 350 hours (450 with rounding), and Matěj Suchánek for 115 hours (150 with rounding). Converting the total to local currencies with Oanda.com, we get:

  • 16 615 EUR for Daimona
  • 137 426 CZK for Matěj
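The arithmetic behind these figures can be checked with a short sketch. The hour estimates, the 30% buffer, the 40 USD/h rate and the 350/115 split come from the text above; the rule of rounding to the nearest 50 hours is an assumption that happens to reproduce the stated 600, 450 and 150 hours. The currency conversion is not reproduced, since the exchange rates used are not given here.

```python
HOURLY_RATE_USD = 40
BUFFER_PERCENT = 30
ROUNDING_STEP_HOURS = 50  # assumption: the rounding rule is not stated in the proposal

# Hour estimates as listed in the budget breakdown.
estimated_hours = {
    "Setting up metrics measurement": 5,
    "Task triage and prioritization": 10,
    "Writing tests for existing code": 100,
    "Code architecture review": 200,
    "Code namespace-ization": 30,
    "Fixing open bugs": 50,
    "Implementing a new feature": 70,
}


def buffered_and_rounded(base_hours: int) -> int:
    """Add the buffer, then round to the nearest step (assumed rounding rule)."""
    buffered = base_hours * (100 + BUFFER_PERCENT) / 100
    return round(buffered / ROUNDING_STEP_HOURS) * ROUNDING_STEP_HOURS


base_hours = sum(estimated_hours.values())
rounded_hours = buffered_and_rounded(base_hours)
overall_cost_usd = rounded_hours * HOURLY_RATE_USD

# Per-person split of the base hours, as stated in the text.
base_split = {"Daimona Eaytoy": 350, "Matěj Suchánek": 115}
rounded_split = {name: buffered_and_rounded(h) for name, h in base_split.items()}
cost_split_usd = {name: h * HOURLY_RATE_USD for name, h in rounded_split.items()}

print(base_hours, rounded_hours, overall_cost_usd)
print(rounded_split)
print(cost_split_usd)
```

Under these assumptions the two per-person shares add up to the overall cost before conversion to EUR and CZK.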

Community engagement

How will you let others in your community know about your project? Why are you targeting a specific audience? How will you engage the community you’re aiming to serve at various points during your project? Community input and participation helps make projects successful.

The main means of communication with end users will be on-wiki pages, like the aforementioned edit filter noticeboard. The technical village pump will be used for wikis without an AbuseFilter-specific page. We will use the Tech/News newsletter to request wider feedback. How this communication will happen is described above. Users and developers will also be informed via wikitech-l and MediaWiki-l.

Get involved

Participants

Please use this section to tell us more about who is working on this project. For each member of the team, please describe any project-related skills, experience, or other background you have that might help contribute to making this idea a success.

  • Daimona Eaytoy is a volunteer MediaWiki developer. He has contributed to various codebases and code checker tools, like phan and phan-taint-check. He maintains the AbuseFilter extension in his free time and will be working part-time for this grant.
  • Matěj Suchánek is a volunteer Wikimedia developer focusing on bots, Wikidata and technical needs of cswiki. He will try to help with reviews, coding and testing.
  • Huji (advisor) is a volunteer MediaWiki developer and has been a long-time contributor and reviewer for several extensions that relate to fighting vandalism or making wikis safer, including AbuseFilter, CheckUser, and LoginNotify. He is also a sysop and checkuser on different wikis and uses these extensions on a frequent basis.
  • MusikAnimal (advisor) is a software engineer for the Community Tech team at the WMF. He has worked on the AbuseFilter code base in the past and in a volunteer capacity has extensive experience as an AbuseFilter manager. He will provide product-level assistance as needed along with code review.

Community notification

You are responsible for notifying relevant communities of your proposal, so that they can help you! Depending on your project, notification may be most appropriate on a Village Pump, talk page, mailing list, etc. Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions. Need notification tips?

Endorsements

Do you think this project should be selected for a Project Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).

  •   Support An efficient anti-abuse filter is necessary to combat the increasingly numerous vandalisms affecting the encyclopedia entries.-- Burgundo (talk) 16:12, 18 February 2020 (UTC)
  •   Support The idea is good and I have faith in the proponent as a good user to contribute to such task. Sannita - not just another it.wiki sysop 16:18, 18 February 2020 (UTC)
  • it's mandatory to reinforce the tools for automatic vandalism control 93.64.78.194 16:27, 18 February 2020 (UTC)
  • Strong support. AbuseFilter is critical to keep vandalism and spam manageable at wikis. AbuseFilter is a very complex extension that has lacked clear and continued maintenance over the years. I am happy to support this so this very needed extension keeps on bettering. —MarcoAurelio (talk) 16:35, 18 February 2020 (UTC)
  •   Support AF has long been a critical mediawiki extension. --Vituzzu (talk) 16:37, 18 February 2020 (UTC)
  •   Support AbuseFilter is a critical anti-spam/anti-vandalism/anti-harassment tool which needs to be well-maintained to continue to be effective across all of the projects. SBassett (WMF) (talk) 16:42, 18 February 2020 (UTC)
  •   Support I think this could be extremely helpful. Furthermore, Daimona Eaytoy is an absolutely trustworthy volunteer. --Horcrux (talk) 16:46, 18 February 2020 (UTC)
  • Critical anti-vandalism extension RhinosF1 (talk) 16:52, 18 February 2020 (UTC)
  •   Support AbuseFilter is a critical tool, and I am confident about the technical skills of Daimona Eaytoy, who is a well trusted user. --Phyrexian ɸ 16:59, 18 February 2020 (UTC)
  •   Support Better filters are a relevant feature on fighting vandalism, and also on avoiding false positives. Marcok (talk) 17:02, 18 February 2020 (UTC)
  • Critical tool. I know Daimona as a sysop of itwiki and I trust him. Jaqen (talk) 17:09, 18 February 2020 (UTC)
  •   Support +1 CPettet (WMF) (talk) 17:49, 18 February 2020 (UTC)
  •   Support AbuseFilter needs maintenance since time: more efficiency, new tools, stay up-to-date with the evolution of the web and also of the LTA techniques. As admin and CU, I can only appreciate and support at most the goals of this project. Also I personally know the high level skills of Daimona Eaytoy and the precious "dirty work" he's doing in its daily operations on wiki. If you're looking for the right guy to do this job, he's definitely the one. L736Etell me 18:24, 18 February 2020 (UTC)
  •   Support (I discovered this grant from a mailing list) I'm glad to have shared for a while my small sandbox MediaWiki instance with Daimona to hack toghether with AbuseFilter, some time ago, and learn each other skills. Reading this, I'm excited to see that now I have the opportunity to endorse a boost on a critical extension, in the right moment, and with the right propellant: Daimona. It couldn't have been better planned. Go, hack, Rock, Daimona! --Valerio Bozzolan (talk) 18:54, 18 February 2020 (UTC)
  •   Support. The extensive use of the AbuseFilter extension exceeds its capabilities. Improvements would be very welcome in many projects. --abián 19:20, 18 February 2020 (UTC)
  •   SupportParma1983 (talk) 20:15, 18 February 2020 (UTC)
  •   Support Daimona has an excellent track record and AbuseFilter is much in need of love. Bawolff (talk) 21:01, 18 February 2020 (UTC)
  •   Support per Bawolff Sakretsu (炸裂) 21:54, 18 February 2020 (UTC)
  • Thanks for taking this on! DannyS712 (talk) 22:24, 18 February 2020 (UTC)
  •   Support Daimona has been doing great work so far, and AbuseFilter definitely needs the help. Kaldari (talk) 22:56, 18 February 2020 (UTC)
  •   Support Daimona's contributions to the AbuseFilter code base have been transformational and I think this grant proposal is a great idea. Huji (talk) 23:34, 18 February 2020 (UTC)
  •   Support Absolutely! No one better suited for this much-needed task. Suffusion of Yellow (talk) 01:07, 19 February 2020 (UTC)
  •   Support Yes, yes, and yes. I could not think of a team more suited for this. ~riley (talk) 01:39, 19 February 2020 (UTC)
  •   Support Abusefilter is a very important extension and definitely needs a overhaul. Can't think of anyone better to do that. Galobtter (talk) 04:15, 19 February 2020 (UTC)
  •   Support Certainly useful. Epìdosis 08:45, 19 February 2020 (UTC)
  •   Support The abuse filter needs to be fixed and improved ValeJappo (talk) 13:22, 19 February 2020 (UTC)
  •   Support The AbuseFilter is a key tool for a huge number of wikis, essential for efficient counter-vandalism efforts. We definitely need an improved and up-to-date AbuseFilter: any time spent in revision/improvement/extension of such a fundamental feature is an excellent investment. And such work couldn't be in better hands than Daimona's.---Equoreo (talk) 17:24, 19 February 2020 (UTC)
  •   Support AF really needs help and Daimona is the right guy for it --Civvì (talk) 22:15, 19 February 2020 (UTC)
  •   Support As a technical person I can guarantee AF is one of the most under-loved projects deployed in production. I also quite trust Daimona and Matej on their technical skills. Hopefully this will be the path to having a full-time dev time at WMF maintaining and improving this vital part of Wikimedians' work. Amir (talk) 22:27, 19 February 2020 (UTC)
  •   Support Masum Reza📞 03:21, 20 February 2020 (UTC)
  •   Support C. crispus (talk) 12:50, 20 February 2020 (UTC)
  •   Support That could really be a big help and Daimona is a most trustworhty and skilled sysop. M&A (talk) 14:08, 20 February 2020 (UTC)
  •   Support sound motivation, looks like a must have Hjfocs (talk) 15:02, 20 February 2020 (UTC)
  •   Support --Dave93b (talk) 13:17, 21 February 2020 (UTC)
  •   Support The extension is old but it is fundamental for the fight against vandalism. This is a great opportunity to improve it ValterVB (talk) 08:49, 22 February 2020 (UTC)
  •   Support Abusefilter really needs UI improvements, new features and less bugs to help volunteers who fight vandalism and harassment, as said on fr-wikipedia. And for that, it seems a better code is needed first. — Jules Talk 13:32, 22 February 2020 (UTC)
  •   Strong support Abusefilter is central for counter-vandalism. Not upgrading it, means that we want to open the doors to disruptive edits! Ruthven (msg) 23:36, 22 February 2020 (UTC)
  •   Support --TriggerOne (talk) 01:42, 25 February 2020 (UTC)
  • Daimona and Matěj have already made many improvements to AbuseFilter and it seems like a great idea to let them do it on a more regular basis. I've reviewed some of their volunteer patches in the past and I can endorse their competence in this. Matma Rex (talk) 23:06, 26 February 2020 (UTC)
  •   Support (but..). The abuse filter is a fundamental tool of fundamental importance to fight vandalisms and much more. Because of this, I think the WMF should invest a lot on it, having some of its techs working regularly on it, developing new functionalities, speeding it up to allow a more intensive use, etc. Thus, while I'm in favour of the goals of the project, I'm a bit perplexed about the idea of allocating it externally. Anyway, given that the internal option it's probably unlikely to actually happen, this project is a valid alternative. Finally, I'm also wondering whether having a non-professional programmer (although surely very skilled and talented) taking on the main part of this project is the right way to go, but I'm guessing this will be addressed and evaluated by people more competent than me on a later stage of this call.--Sandrobt (talk) 07:58, 5 March 2020 (UTC)
  •   Support, essential tool, it has to be fixed/improved/cared of. --Wikinade (talk) 13:27, 13 March 2020 (UTC)
  •   Support ·addshore· talk to me! 10:17, 6 April 2020 (UTC)
  •   Support. A no-brainer. MER-C (talk) 10:08, 2 May 2020 (UTC)

Notes

Relevant links

Community health initiative/AbuseFilter is an assessment by the Anti-Harassment Tools team. Additional links are available there.