Problem: It is extremely cumbersome to find out who wrote a specific part of an article, or to get an overview of how the current content maps to its various authors. History-search-based tools like WikiBlame make it merely very cumbersome. Who Wrote That? is a tool that provides a decent experience, but it is only available on a select few large Wikipedias.
Proposed solution: Extend Who Wrote That? to more wikis.
Who would benefit: Editors who need to track down problematic (or particularly excellent) content, wiki historians, researchers, readers suspicious about the reliability of a page etc.
Personally I mainly care about huwiki, but the more, the merrier; I assume it makes more sense to do this in bigger blocks; whatever the team feels is achievable. (Eventually, it would be nice to extend it to all Wikimedia wikis, except for Wikidata and Commons, which are quite large and not text-based, so that would be a waste of resources. Enwiki is about half of all wiki content and I imagine the resource cost for a tool like this scales superlinearly, so that doesn't seem like such a tall order.) --Tgr (talk) 08:18, 5 February 2023 (UTC)
Thanks for creating this proposal! I believe we're going to address this eventually anyway (at least for a few other popular languages), but with a proper proposal that hopefully does well in voting, it will make it much easier to prioritize, acquire funding if necessary, and so forth. If it means anything to voters, the system that powers Who Wrote That? is WikiWho. The algorithm works amazingly well, but it's very costly as it essentially processes and stores data on every single mainspace revision (i.e. the full history of pages). I think adding many of the popular languages won't be a problem. Doing every single wiki (except Commons/Wikidata) is probably not going to happen anytime soon. I think we'd need to first revise the architecture, do a proper production deployment, and go from there. The storage footprint is currently just too great (for context, the combined size of the currently supported languages is about 3.8TB). It would probably need a dedicated team working on it for a year or more. MusikAnimal (WMF) (talk) 03:29, 6 February 2023 (UTC)
It would be very nice to have a Docker image or script that users could simply git clone from a repository. It would download a backup dump of the selected wiki from dumps.wikimedia.org, process it, and then download and process new revisions via the API to stay in sync with the latest version. This would allow hackers from different language versions to test and develop it locally (and run their own annotation servers if there is wider interest). Zache (talk) 05:34, 19 February 2023 (UTC)
Voting
Support This tool has become invaluable to me, whether it is to track sources that used to be in the text (but were moved), to figure out when text was inserted so I can check whether there was simultaneous discussion, or to remove text by disruptive editors. Femke (talk) 19:01, 10 February 2023 (UTC)
Support As a user who often sees who is editing and adding content to certain articles, this would certainly be of great help. Unfortunately, this is not yet available on the Indonesian Wikipedia. ···🌸Rachmat04·☕10:08, 14 February 2023 (UTC)
Very strong support This can help track vandalism more easily on more sites, and it can find good talent. I love this proposal. NPRB (talk) 14:27, 14 February 2023 (UTC)