Community Wishlist Survey 2017/Bots and gadgets/Keep maintenance categories and links up to date

Keep maintenance categories and links up to date

  • Problem: Changes to MediaWiki code related to parsing can leave links tables out of date. Sometimes code changes will add new tracking categories, typically related to error tracking. But until the page is edited, null edited, or purged with links update, the page will not show up in the category. Some pages do not get edited or refreshed for years, which means that errors in pages will go undetected and unfixed.
  • Who would benefit: Gnomes; WMF developers who want to move to new technologies but need editors to clean up existing pages first; readers who encounter strange page behavior.
  • Proposed solution: Null-edit all unedited pages, or do some equivalent action, on a known periodic basis, perhaps once a month. Publicize the schedule so that it is known that any new template changes or MW code changes may take N weeks to be reflected in all relevant locations on the corresponding MW site.
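
As a rough illustration of the "equivalent action" mentioned in the proposed solution, the MediaWiki Action API's `action=purge` module accepts a `forcelinkupdate` parameter that refreshes the links tables for the listed pages. The sketch below only builds the request parameters in batches. The batch size of 50, the helper names `chunked`, `purge_params` and `build_requests`, and the page titles are assumptions made for illustration, not part of the proposal.

```python
from typing import Dict, Iterable, Iterator, List

# Assumed per-request title limit for ordinary accounts; accounts with
# higher API limits may be allowed larger batches.
BATCH_SIZE = 50


def chunked(titles: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` titles."""
    batch: List[str] = []
    for title in titles:
        batch.append(title)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def purge_params(batch: List[str]) -> Dict[str, str]:
    """Build POST parameters for one purge request that also updates links tables."""
    return {
        "action": "purge",
        "titles": "|".join(batch),   # the API separates multiple titles with "|"
        "forcelinkupdate": "1",      # refresh category and link tables as well
        "format": "json",
    }


def build_requests(titles: Iterable[str], size: int = BATCH_SIZE) -> List[Dict[str, str]]:
    """Return one parameter dict per batch of titles."""
    return [purge_params(batch) for batch in chunked(titles, size)]


if __name__ == "__main__":
    # Hypothetical page titles, used only to show the batching.
    requests_to_send = build_requests(f"Page {i}" for i in range(120))
    print(len(requests_to_send), "purge requests prepared")
```

Each dictionary would be sent as a POST request to the wiki's `api.php` endpoint; the network call is left out because it depends on the wiki, the account, and its rate limits.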

Discussion

  • In theory that sounds like a great idea, which I support. However, I worry that it could lengthen category update times even more, and right now they're SLOW. What can we do to mitigate this risk?--Strainu (talk) 12:31, 7 November 2017 (UTC)
    • The proposal is for the back-end software (MediaWiki itself, or a job queue, or something similar) to null-edit pages that are stale. The proposal, if implemented, would shorten category update times, not lengthen them. Jonesey95 (talk) 23:48, 1 December 2017 (UTC)
  • I like it, but who would be responsible for null-editing the pages and updating the categories you reference? Zppix (talk) 19:37, 7 November 2017 (UTC)
  • That's a big problem on Commons. In general, most (if not all) pages whose categories are assigned via templates suffer from it. Right now roughly 1,400,000 or more pages on Commons are not categorised correctly because of this problem. If a category is changed by altering the template, every template-categorised page has to be touched to fix its categorisation, which otherwise keeps pointing to a redirecting page. Another example: because c:Category:Non-empty disambiguation categories had become completely useless, holding about 8000 empty categories, I started a weekly touch run many months ago to keep it usable. --Achim (talk) 20:24, 8 November 2017 (UTC)
    • I think the issue with c:Category:Non-empty disambiguation categories is that {{PAGESINCAT:... is not treated as a dynamic parser function (such as {{CURRENTDAY}}), which would cause the pages to be reparsed regularly. We also don't keep track of links for PAGESINCAT, so we can't purge via the job queue when someone adds something to a category. Given that this is using {{PAGESINCAT:... for the current category, we probably wouldn't even need to keep track of links, only of whether the current page is checking how many cats are in itself, so we could probably fix that much more easily than by purging all pages. BWolff (WMF) (talk) 22:23, 28 November 2017 (UTC)
  • I would support this, but only if the null edit does not appear in edit histories and happens on a long schedule, something like repeating monthly or yearly runs, with all articles spread over the month/year to avoid server load. A Den Jentyl Ettien Avel Dysklyver (talk) 14:51, 9 November 2017 (UTC)
  • A null edit can be done using pywikibot's touch.py. But this is a solution only for smaller wikis: ~100k pages takes about 10 hours. JAn Dudík (talk) 11:54, 10 November 2017 (UTC)
  • I think it would be useful to add touch capability to AutoWikiBrowser, as proposed in phabricator:T167283, so there is more than one way to touch pages. --Jarekt (talk) 19:36, 15 November 2017 (UTC)
  • I'm doubtful how practical this is for all pages on large wikis. I have no idea how long such a reparse would take. It wouldn't surprise me if we're talking months to run through all pages. Maybe even years. BWolff (WMF) (talk) 22:23, 28 November 2017 (UTC)
    • The first step in implementing this proposal would be to investigate a few different methods for doing this refresh and assess their practicality. The status quo is that some pages never get updated, so even "years" would be an improvement, but I think we can get it down into the "months" or even "weeks" range with some clever work. Jonesey95 (talk) 23:48, 1 December 2017 (UTC)
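
To put rough numbers on the timing concerns in this discussion, the sketch below extrapolates from the touch.py figure quoted above (about 100,000 pages in about 10 hours) and shows one simple way to spread pages across a repeating cycle. It assumes a single sequential client running at a constant rate, which a server-side job would not be bound by. The function names and the modulo-based bucketing are illustrative assumptions, not a description of how MediaWiki schedules jobs.

```python
# Figures reported in the discussion for pywikibot's touch.py.
REPORTED_PAGES = 100_000
REPORTED_HOURS = 10.0


def hours_to_touch(page_count: int,
                   pages: int = REPORTED_PAGES,
                   hours: float = REPORTED_HOURS) -> float:
    """Estimate the hours needed to touch `page_count` pages at the reported rate."""
    rate_per_hour = pages / hours  # about 10,000 pages per hour
    return page_count / rate_per_hour


def scheduled_day(page_id: int, cycle_days: int = 30) -> int:
    """Assign a page to a day of the cycle (0-based) so the load is spread evenly."""
    return page_id % cycle_days


if __name__ == "__main__":
    # 1,400,000 is the Commons backlog size mentioned in the discussion.
    backlog_hours = hours_to_touch(1_400_000)
    print(f"{backlog_hours:.0f} hours, or about {backlog_hours / 24:.1f} days")
    # 12345 is a hypothetical page ID.
    print("page 12345 is refreshed on cycle day", scheduled_day(12345))
```

Bucketing by page ID keeps each page on a fixed, predictable day, which would fit the proposal's idea of publicizing the refresh schedule.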

Voting