User talk:InternetArchiveBot/Archives/2021

False negative

The bot marks defensenews.com links that have an archive-url as url-status=live when they are in fact dead (false negative). This may be because the pages look live to the bot while in fact they only show "load more" instead of the article: https://i.imgur.com/ifwYY3a.png. All defensenews links whose status it changed need to be checked and restored; the unnecessary access-date should also be removed if the link is dead and archived. The current status is that it kills all defensenews.com URLs. 919181512a (talk) 11:44, 10 January 2021 (UTC)

This seems to have only happened to search results which seem to be pages going to http://defensenews.com/blogs. These have been blacklisted, but it looks like most links that exist on Wikipedia are broken links. Also no HTTPS which is funny considering this is a site about defense news. —CYBERPOWER (Chat) 16:53, 26 July 2021 (UTC)

I'll run the domain through WaybackMedic to search out the soft404s and update links on enwiki and in the iabot database. -- GreenC (talk) 17:52, 1 August 2021 (UTC)

New complication: "The Domain has been Excluded from the Wayback Machine". WaybackMedic will attempt to find and replace alternative archives on enwiki and in IABot database. -- GreenC (talk) 18:13, 1 August 2021 (UTC)

Results

defensenews.com has been scrubbed for dead and soft-404 links, and inoperable archive URLs. The IABot database was updated with new archive URLs, and status set to blacklist on a per-URL basis. The global live state has been lifted, set to "none", as the remaining links are still live. I also processed all pages on Enwiki, adding new archive URLs (no Wayback since the domain has been excluded) or dead link tags. If you have any questions or see any problems, let me know. -- GreenC (talk) 20:11, 2 August 2021 (UTC)

spacenews.com belongs to the same company and has the same problems; it is also now done. -- GreenC (talk) 20:59, 4 August 2021 (UTC)

This section was archived on a request by: Harej (talk) 22:57, 4 August 2021 (UTC)

New talk page

This is now the primary discussion page for InternetArchiveBot. Please post messages concerning InternetArchiveBot's operations here. harej (talk) 21:54, 15 June 2021 (UTC)

This section was archived on a request by: This section will be archived in 14 days. harej (talk) 22:16, 15 June 2021 (UTC)

Blocked

I attempted to analyze en:Nadhmi Auchi and was unceremoniously dumped out with the error: Analysis error: blocked: You have been blocked from editing. uh, no, I haven't. Pls fix. Elizium23 (talk) 19:42, 13 June 2021 (UTC)

I also got this error - a couple times over the last few weeks. It's not consistent, but it probably occurs about 50% of the time I try to use the tool. Weird and a bit alarming. Ganesha811 (talk) 03:59, 14 June 2021 (UTC)
Elizium23 and Ganesha811, this is a known problem that is not related to either of your accounts. See this ticket where we are keeping track of the issue. harej (talk) 01:23, 16 June 2021 (UTC)
Thanks for the heads up - I found that logging out and logging back in on the bot site seemed to fix it, though that may be coincidence and probably won't last. Ganesha811 (talk) 01:25, 16 June 2021 (UTC)

Request to add this bot to Bengali Wikinews

Though the site [1] is in incubator, it already has 140+ news articles. The news sites are reliable but there is a tendency in Bengali news sites to disappear after a while. --Notbrev (talk) 16:53, 28 June 2021 (UTC)

Notbrev, the bot is currently set up to run across all of Incubator, and I will be happy to set it up on Bengali Wikinews once it has its own wiki. In the meantime I have manually run the bot across the Wn/bn pages. In general the Internet Archive is scanning external links posted on Wikimedia projects so if any of them break the bot will be ready to fix them. harej (talk) 17:41, 28 June 2021 (UTC)
Thanks Harej! --Notbrev (talk) 04:26, 29 June 2021 (UTC)
This section was archived on a request by: harej (talk) 03:04, 30 June 2021 (UTC)

Wrong edit on it.wiki

Hello! Sorry to bother you, but the bot has attempted several times to change a URL in w:it:Elfi della notte (most recently today, cfr).
The current link is https://web.archive.org/web/20070607175408if_/http://www.wow-europe.com/en/info/encyclopedia/349.xml
while the "corrected" URL is https://web.archive.org/web/20100330053101/http://www.wow-europe.com/en/info/encyclopedia/349.xml
The problem is that the old URL works perfectly fine and the new one doesn't. Can this be stopped somehow? --Syrio (talk) 17:10, 28 June 2021 (UTC)

User:Syrio, I just updated the IABot database so it uses the correct archive via the tool at iabot.org. [2] Anyone can do it, no special permissions or requests are required. -- GreenC (talk) 18:59, 28 June 2021 (UTC)
Thank you! :) --Syrio (talk) 19:38, 28 June 2021 (UTC)
This section was archived on a request by: harej (talk) 03:04, 30 June 2021 (UTC)

de:WP

Any timeline when the German bot will be back in business? It's months now, the maintenance lists are empty, the articles are rotting like old leaves... Thanks and kind regards, Grueslayer (talk) 15:52, 16 June 2021 (UTC)

Grueslayer, I left Cirdan a message, and will work with him to rectify any outlying issues with dewiki to get it back up and running ASAP. —CYBERPOWER (Chat) 17:05, 26 July 2021 (UTC)
This section was archived on a request by: Harej (talk) 22:57, 4 August 2021 (UTC)

fr:WP

Hi,

I just saw on the French bot page that, as of November 2019, someone needs to "Fix inconsistencies in sources such as improper template usage, or invalid archives.". Does anyone have more information about that? Is there a lot of work to do, or is there already a French bot doing the job? Simon Villeneuve 17:51, 19 June 2021 (UTC)

@Simon Villeneuve: The bot is currently turned off on frwiki due to a unique situation regarding Wikiwix, which we are currently uncertain how to address.—CYBERPOWER (Chat) 17:11, 26 July 2021 (UTC)
This section was archived on a request by: Harej (talk) 22:57, 4 August 2021 (UTC)

Looki.de

I tried to automatically convert https://de.wikipedia.org/wiki/Spezial:Weblinksuche/www.looki.de into archive links using https://iabot.toolforge.org/index.php?page=manageurlsingle&url=www.looki.de but didn't have any success. The problem is that the links aren't 404, but instead the domain was sold. What am I doing wrong? Matthias (talk) 07:33, 27 June 2021 (UTC)

Matthias M., I have marked the entire domain as permanently dead. You should be able to run the bot on pages containing looki.de now. —CYBERPOWER (Chat) 17:19, 26 July 2021 (UTC)
This section was archived on a request by: Harej (talk) 22:57, 4 August 2021 (UTC)


InternetArchiveBot mishandles the "nowiki" tag

Apparently, IAB doesn't check for the "nowiki" tag in the markup, which leads to text like this appearing (note the curly braces and the "bot" parameter; example seen at this page about the "magnet:" URI, permalink: permalink to that article):

«dchub://example.org{{Недоступная ссылка|date=Февраль 2019 |bot=InternetArchiveBot }}{{Недоступная ссылка|date=Август 2018 |bot=InternetArchiveBot }}{{Недоступная ссылка|date=Май 2018 |bot=InternetArchiveBot }}»

My guess is that the bot checked the URI and added the template, but an earlier-added "nowiki" tag enclosed the markup and so prevented the template from rendering.

Le Cybeaurge (talk) 23:06, 20 July 2021 (UTC).

Le Cybeaurge, This appears to be a remnant of an old bug that existed when IABot 2.0 was still in a beta state back in 2018. I believe the bug has already been rectified and it should no longer make edits like that. If you happen to see newer edits made by the bot that still present the same issues you mention, please re-report. —CYBERPOWER (Chat) 17:15, 2 August 2021 (UTC)
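To illustrate the kind of check discussed in this thread, here is a minimal sketch of how a link scanner could ignore anything inside "nowiki" tags. It is not IABot's actual code; the function names and the simplified URL pattern are assumptions made for the example, and self-closing nowiki tags are not handled.

```python
import re

# Hypothetical helper names; this is not how IABot itself is implemented.
NOWIKI_RE = re.compile(r"<nowiki>.*?</nowiki>", re.IGNORECASE | re.DOTALL)
# Deliberately simplified URL pattern, for illustration only.
URL_RE = re.compile(r"[a-z][a-z0-9+.-]*://[^\s<>\[\]{}|]+", re.IGNORECASE)


def mask_nowiki(wikitext: str) -> str:
    """Replace every <nowiki>...</nowiki> span with spaces of equal length,
    so character offsets in the rest of the page stay unchanged."""
    return NOWIKI_RE.sub(lambda m: " " * len(m.group(0)), wikitext)


def find_editable_urls(wikitext: str) -> list:
    """Return URLs that appear outside nowiki spans."""
    return URL_RE.findall(mask_nowiki(wikitext))
```

With this approach, a URI such as dchub://example.org that sits inside a nowiki span is never offered to the tagging step, so no dead-link template can be appended to it.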
This section was archived on a request by: Harej (talk) 22:57, 4 August 2021 (UTC)

Blocked

When I try to analyze a page, it says I am blocked from editing. This is false, I can just edit any page I want myself. What's wrong here? PhotographyEdits (talk) 14:51, 1 August 2021 (UTC)

PhotographyEdits, do not worry, there is a weird bug occurring in Wikipedia that causes a mistaken error to be passed back. Unfortunately, it's not something I can directly fix as this requires the involvement of the developers of the MediaWiki API. Just re-attempt at a later time and it should work. —CYBERPOWER (Chat) 17:19, 2 August 2021 (UTC)
This section was archived on a request by: Harej (talk) 22:57, 4 August 2021 (UTC)

Bot removes content while inserting Webarchive template

Tracked in Phabricator:
Task T287898

Good evening, and sorry for my English, which is very bad. I want to report this edit performed on 2 July 2021 on it.wiki: it:Special:Diff/121633282. InternetArchiveBot removed a huge amount of content while inserting the Webarchive template. --Scalvo98 (talk) 19:50, 1 August 2021 (UTC)

Scalvo98, that is indeed a serious bug, but we identified that this is a very rare occurrence. We will get it fixed ASAP. —CYBERPOWER (Chat) 17:40, 2 August 2021 (UTC)
This section was archived on a request by: Harej (talk) 22:57, 4 August 2021 (UTC)

Bot incorrectly marks url-status as live

Diff. Bot marks url-status as live for pages that exist with content "Not Found. Sorry, but the page you were trying to view does not exist", but whose real content disappeared years ago. In the first citation it edited, it (logically, I suppose, if it thought the page was live) added today's date as access-date. The site itself is live, but I'm guessing it isn't returning a 404 code for these long-dead pages? cheers, Struway2 (talk) 14:59, 4 August 2021 (UTC)

Struway2, the URL in question was marked as "permalive," i.e. "always treat the URL as though it is alive." This mark has been undone. harej (talk) 19:37, 4 August 2021 (UTC)
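For readers unfamiliar with the problem Struway2 describes, the following is a rough sketch of a "soft 404" check: the server answers with status 200, but the body is only an error notice. The function name, the phrase list, and the length threshold are all assumptions for illustration; the bot's real detection logic is not described in this thread.

```python
# Phrases taken from the report above; a real list would need to be much longer.
SOFT_404_PHRASES = (
    "not found",
    "the page you were trying to view does not exist",
)


def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """Guess whether a 200 response is really an error page.

    The 500-character limit is an arbitrary assumption: genuine articles
    are usually much longer than an error notice.
    """
    if status_code != 200:
        return False  # a real error status is not a *soft* 404
    text = " ".join(body_text.lower().split())
    return len(text) < 500 and any(phrase in text for phrase in SOFT_404_PHRASES)
```

A URL marked "permalive" would bypass any check of this kind, which matches the behaviour reported here.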
This section was archived on a request by: Harej (talk) 22:57, 4 August 2021 (UTC)

Use localized digits

Hi. Edits like fa:Special:Diff/30560910 result in a red error message shown by the cite template because the date you entered (in this case 19 اکتبر 2018) is using a mixture of Persian month names (اکتبر which means October) and Latin digits (19 and 2018). On fawiki, you should tell your robot to use Persian digits instead, such as ۱۹ instead of 19.

You can use the persiantools module from pip to that end. It has a submodule called digits which can handle this. See this example from my robot. If you have questions, please don't hesitate to ask, but please {{ping}} me so I notice them. Huji (talk) 22:29, 27 June 2021 (UTC)

Huji, we have created a ticket for this.—CYBERPOWER (Chat) 17:30, 26 July 2021 (UTC)
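The digit localization requested above can be sketched with the Python standard library alone. This is only an illustration of the idea, not the bot's code and not the persiantools API; the function name is hypothetical.

```python
# Map each Latin digit 0-9 to the Persian digit at U+06F0..U+06F9.
LATIN_TO_PERSIAN = {ord(str(d)): chr(0x06F0 + d) for d in range(10)}


def localize_digits(text: str) -> str:
    """Replace Latin digits with Persian digits; all other characters are kept."""
    return text.translate(LATIN_TO_PERSIAN)
```

Applied to a date string, only the day and year change, while the Persian month name passes through untouched.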
This section was archived on a request by: Harej (talk) 02:16, 9 August 2021 (UTC)

Broken edit (plwiki)

Hello. The bot did not recognize the template argument that followed the URL and merged its content into the new value of the URL. Bad edit. Paweł Ziemian (talk) 11:37, 11 July 2021 (UTC)

Paweł Ziemian, This is now tracked. —CYBERPOWER (Chat) 17:10, 2 August 2021 (UTC)
This section was archived on a request by: Harej (talk) 02:16, 9 August 2021 (UTC)

Needs to fail better

I've been playing with the bot but it keeps failing, particularly on Russia-related topics, as en Wikipedia objects to some of the links and the edit doesn't go through. I've found them in my edit filter log afterwards. I think when it fails it needs to fail more intuitively and explain what's going on. The red error message isn't that helpful. I can provide examples if needed. Thanks (and great bot btw). Secretlondon (talk) 01:37, 13 July 2021 (UTC)

Secretlondon, the red error messages you see are most likely errors coming directly from Wikipedia, which are being passed back through the bot's user interface. They are admittedly not very descriptive. My opinion is that the best way to fix this would be to have the MW API devs make the error messages more descriptive. —CYBERPOWER (Chat) 17:10, 2 August 2021 (UTC)
This section was archived on a request by: Harej (talk) 02:16, 9 August 2021 (UTC)

Replacing archive links

Hello. Can you get this bot to stop replacing archive.is links with web.archive.org ones? I'm talking about plwiki. First of all, if an archive link was inserted manually by an editor, then we should assume it's suitable and there's no need for it to be replaced (it's best left as it is cause it's probably the best match). Second, occasionally archive.is can handle content that web.archive.org messes up in display. Third, your bot gets the archive dates completely wrong, it's understandable that there could be a few months difference but it even leaves archived pages that are a few years newer or something, so you can never be sure if the content has stayed the same or not (which could be prevented by NOT replacing the links in the first place!). Can something be done to address these issues please? It would be far better if you made your bot replace the short variant archive.is links with the longer ones that betray the archived url as well – in a web.archive-ish fashion. 83.23.180.72

I agree. It also happens on the Catalan Wikipedia. I added a lot of archive.is links that no longer show up, even though they were already archived.--Tomeu87 (talk) 13:42, 27 July 2021 (UTC)

When reporting problems, please provide example diffs. That would help. It's unclear if this is a bot policy, a bug in the bot, a problem with the data, etc., and we can't know without investigating specific examples of what the bot is doing. -- GreenC (talk) 02:18, 28 July 2021 (UTC)

For example, this change or this other one, among others.--Tomeu87 (talk) 12:19, 29 July 2021 (UTC)

Tomeu87, thank you for the diff. There is now a Phab ticket at task T287837 - If you have not heard anything in a few days please ping. -- GreenC (talk) 17:34, 1 August 2021 (UTC)

Tomeu87, Fixed. [3] -- GreenC (talk) 15:33, 7 August 2021 (UTC)

This section was archived on a request by: Harej (talk) 02:16, 9 August 2021 (UTC)

Encode

Why did the bot edit this URL? 194.50.14.241 17:46, 8 August 2021 (UTC)

In order for the bot to scan a URL, it needs to encode the link – this is a technical limitation of the bot that is not easily overcome. The bot also normalizes archive URLs as part of the edits it makes. Because of these two things, the bot ended up making an edit that I agree was not necessary. —CYBERPOWER (Chat) 16:56, 9 August 2021 (UTC)
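As a rough illustration of the encoding step described above, the sketch below percent-encodes a URL with Python's standard library. The function name and the chosen set of "safe" characters are assumptions; the bot's own normalization rules are not spelled out in this thread.

```python
from urllib.parse import quote

# Characters left untouched: URL delimiters plus "%", so that sequences
# which are already percent-encoded are not encoded a second time.
_SAFE = ":/?#[]@!$&'()*+,;=%"


def encode_for_scan(url: str) -> str:
    """Percent-encode non-ASCII and other unsafe characters in a URL."""
    return quote(url, safe=_SAFE)
```

Because "%" is treated as safe, running the function twice gives the same result as running it once. A link that was stored in decoded form will still come back changed the first time, which is how an edit can occur that looks unnecessary to a reader.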
This section was archived on a request by: Harej (talk) 22:21, 18 August 2021 (UTC)

blocked: You have been blocked from editing

Sometimes I get a message in red: "blocked: You have been blocked from editing". This occurs on more than one browser and more than one computer. Then after a few hours, the block goes away. Right now I see a fragment of a message "blocked: You have been blocked from editing.'); ERROR: Failed to post edit error to DB. --> ". Abductive (talk) 08:46, 20 August 2021 (UTC)

Abductive, this is a known bug that is not related to your account. It is being tracked on Phabricator. Harej (talk) 20:55, 20 August 2021 (UTC)
Thanks! Abductive (talk) 20:57, 20 August 2021 (UTC)
This section was archived on a request by: Harej (talk) 23:02, 20 August 2021 (UTC)

Queue a job that repairs all links

I would like to queue a batch job, but I would like it to repair all links, not just the dead ones. (In specific, on enwiki we had some domain names that have been usurped by squatters, and so they appear to be "live" but they need to be rescued and archived.) Is there any way to do this in the batch job interface? There is a checkbox in the individual-page form, but my batch consists of at least 80 articles. Thanks. Elizium23 (talk) 10:56, 6 June 2021 (UTC)

Elizium23, the short answer is no. You cannot queue a batch job to proactively rescue live links. However, you can set the domains the bot thinks are alive to a permadead state and initiate a batch job on the pages those domains exist on. If you provide a list of domains, I can get a job started for you. —CYBERPOWER (Chat) 16:55, 26 July 2021 (UTC)
This section was archived on a request by: Harej (talk) 15:59, 22 August 2021 (UTC)

InternetArchiveBot on sc.wiki

Hi,

I'm one of the administrators on the Sardinian Wikipedia, and I have three issues to report about the activity of the bot there.

The first one is the fact that I can't access the bot queue for it on toolforge (I get this error message: "Sorry but access to the bot queue for this wiki is disabled. It may be because the bot isn't approved for use on this wiki. Please use the Single Page Analysis tool instead."). Why is that? Do I need to do something to access it?

The second one is the fact that the bot is not recognizing dead URLs that are already marked as dead. As you can see here the bot reverted one of my edits, since I deleted the superfluous "Ligàmene interrùpidu" template it added there. The "Tzita web" (Cite web) template already had "urlmortu=eja" ("deadurl=yes") in it, but for some reason the bot didn't recognize it.

The third one, probably connected to the second one, is the fact that when transforming a simple link into a "Tzita web" template and adding the archived link (see here, for an example), all of the other data is entered in Sardinian ("archive-url"->"urlarchìviu" etc.) but "urlmortu=eja" gets added in English ("deadurl=yes"). It still works, but I would like to have that translated too, if possible. Could you please tell me what I need to do to fix these issues? I don't know where exactly the bot takes the information for those strings from.--L2212 (talk) 15:28, 11 August 2021 (UTC)

L2212, the bot queue is not yet enabled for some wikis, but it will be available in the near future. The second issue should be resolved and the bot should now acknowledge the parameter. As for the third issue, this can be resolved by altering the configuration page for the CS1 module. By going to Line 89, you can rearrange the list of parameters to acknowledge. The bot always uses the first in the list. As for the yes/no keywords, the module appears to be missing the keywords section of the config. The bot relies on this to localize the values. See this section of enwiki for an example of how to set up the keywords configuration. —CYBERPOWER (Chat) 21:11, 20 August 2021 (UTC)
Cyberpower678 Ok, I think I fixed the parameters and keyword issues now, I will check when the bot will be up again. Thank you very much for your help!--L2212 (talk) 13:52, 22 August 2021 (UTC)
This section was archived on a request by: Harej (talk) 15:58, 22 August 2021 (UTC)

Operational status

The website has been down and inaccessible for around 2–3 days, when will it recover? Paper9oll (talk) 01:25, 23 August 2021 (UTC)

This section was archived on a request by: Harej (talk) 06:43, 24 August 2021 (UTC)

cbignore ignored?

Can you explain why this nl:Speciaal:Diff/59710400 edit was made? {{cbignore}} was placed inside a <ref> directly after the link's closing square bracket. Based on the documentation the link should not have been marked as dead. Note, the website does report a 404 error, so the fact that InternetArchiveBot thinks it is dead, is not an issue. --Wimmel (talk) 09:51, 14 August 2021 (UTC)

Wimmel, Sorry for the late response, there was an issue with the configuration of the bot for nlwiki where the bot wasn't configured to acknowledge the template. This has now been fixed and will take effect soon on nlwiki. —CYBERPOWER (Chat) 18:53, 25 August 2021 (UTC)
Perfect, thank you, good to know what was wrong here. --Wimmel (talk) 08:43, 28 August 2021 (UTC)
This section was archived on a request by: Wimmel (talk) 08:43, 28 August 2021 (UTC)

False reports saying links are dead, can't use report function

A couple of times now, the bot has reported links in articles I watch as dead when they aren't. You can click through, and they are still live on the original website; they just may not have had an archive version saved as well. See this edit and this one. In both cases the original link and website are still working. Also, I tried to report the false positive myself, but when I got through it told me I don't have the permissions to report it. NZFC (talk) 22:28, 17 August 2021 (UTC)

NZFC, We have taken a look at both URLs. According to the URL scan log, the bot thinks the URLs are dead because it is under the mistaken impression that they redirect to the home page. For many news websites, when a URL redirects back to its home page, it is because the article in question no longer exists. However, in this case it still exists and the URLs do not redirect, which is unusual. It could be the site was temporarily broken or there is some weird glitch in the scanner causing it to believe it is being redirected. As for reporting false positives, you may have accidentally tried to report it from another wiki on that web application. We recommend you use your home wiki to do maintenance and bug reports on IABot. You can verify which wiki you are using by checking the dropdown on the top right of the page. Hope this helps. We will investigate why the scanner is having issues on that domain. —CYBERPOWER (Chat) 18:51, 18 August 2021 (UTC)
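The home-page heuristic mentioned in this reply can be sketched as follows. This is a simplified illustration, assuming the scanner knows both the requested URL and the final URL after redirects; the function names are hypothetical and the bot's real rules may differ.

```python
from urllib.parse import urlsplit


def _host(url: str) -> str:
    """Lower-cased host name without a leading "www."."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def redirected_to_homepage(requested_url: str, final_url: str) -> bool:
    """True when a deep link ended up on the root page of the same site."""
    req, fin = urlsplit(requested_url), urlsplit(final_url)
    requested_is_deep = req.path.strip("/") != "" or req.query != ""
    final_is_root = fin.path.strip("/") == "" and fin.query == ""
    return _host(requested_url) == _host(final_url) and requested_is_deep and final_is_root
```

A check like this is only as reliable as the reported final URL. If the site or the scanner briefly reports a redirect that does not really happen, a live article is classified as dead, which is the false positive described above.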
This section was archived on a request by: Harej (talk) 17:35, 28 August 2021 (UTC)

Not working?

https://iabot.toolforge.org/index.php?page=runbotsingle&action=analyzepage

reports

This page isn’t working iabot.toolforge.org is currently unable to handle this request. HTTP ERROR 500

It was working OK yesterday. Kerry Raymond (talk) 01:56, 23 August 2021 (UTC)

This section was archived on a request by: Harej (talk) 17:42, 28 August 2021 (UTC)

Huge notices on talk-pages, being nothing more than ballast

Hi. I truly and fully appreciate your work, and don't know if your bot is still leaving those endless notices on the talk-pages of Wikipedia articles it corrects, but if that is still the case, could you please, please rewrite the bot's code and stop it from doing so? I guarantee you, those mammoth entries full of technical details and very nicely worded messages to fellow editors, are not read by anyone, ever, and are just taking up a lot of space (and using up lots of memory & megawatts at the Wiki storage sites). Maybe there is a way they can automatically be removed retroactively, if possible w/o leading to even more archive space being taken up by the step? I mean something like this "External links modified" entry. I am in awe of people with your technical coding skills, but Wiki is primarily there for users, and made user-friendly for editors, who as a rule care little and know little about what is "under the bonnet" of this wonderful and weird machine. Alternatively, a very short link to your user page would do, where anybody interested could find the explanation & further links regarding the issue. Thank you and keep up the good work, 2.53.47.64 16:25, 31 August 2021 (UTC)

Those talk page messages were disabled on English Wikipedia years ago and are no longer posted by the bot. Harej (talk) 18:05, 1 September 2021 (UTC)
I came here to complain about the same. I use the English Wikipedia. I did not realize the messages stopped in 2018, because the External links modified sections fill up any talk page I have been in the last few months. I now do see a note that editors can delete them. Jay (talk) 05:47, 3 September 2021 (UTC)
This section was archived on a request by: Harej (talk) 18:07, 1 September 2021 (UTC)

archive-date on it.wiki

Hello. Is it possible to disable the addition of the archive-date parameter on it.wiki? Even if not added, it's automatically identified from the URL, therefore it's quite superfluous. Thank you. --Chiyako92 (talk) 17:08, 28 August 2021 (UTC)

Chiyako92, unfortunately not. The bot relies on the CS1 Module. The module defines the archive date which the bot automatically picks up on, and is considered automatically as a mandatory parameter. This is currently a technical limitation. —CYBERPOWER (Chat) 17:31, 28 August 2021 (UTC)
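For context on the claim that the date can be identified from the URL: Wayback Machine archive URLs carry a timestamp in the path, so a date can be derived as in the sketch below. The function name is hypothetical, and the pattern assumes the full 14-digit timestamp form seen in the links quoted elsewhere on this page.

```python
import re
from datetime import date, datetime
from typing import Optional

# Matches e.g. https://web.archive.org/web/20070607175408if_/http://...
# The optional suffix after the digits (such as "if_") is a display modifier.
WAYBACK_RE = re.compile(
    r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/", re.IGNORECASE
)


def archive_date_from_url(archive_url: str) -> Optional[date]:
    """Return the snapshot date encoded in a Wayback URL, or None."""
    match = WAYBACK_RE.match(archive_url)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d%H%M%S").date()
    except ValueError:
        return None  # digits present but not a valid timestamp
```

Other archive services use different URL shapes, so a function like this returns None for them, and an explicit archive-date is then the only source of the date.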
This section was archived on a request by: —CYBERPOWER (Chat) 18:17, 8 September 2021 (UTC)

Bot adds invisible character

As can be seen in this diff, the bot sometimes converts nbsp into an invisible character, which then gets flagged. Abductive (talk) 21:49, 13 August 2021 (UTC)

@Abductive: the bot is actually decoding the nbsp, not converting it. Invisible characters are characters that take up bytes in the string but do not actually render in the output. The space rendered is still an nbsp. I am curious to know where this is getting flagged, and how widespread this is.—CYBERPOWER (Chat) 18:14, 1 September 2021 (UTC)
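The difference between the entity and the decoded character can be shown with Python's standard library. This only illustrates the decoding step in general; it is not the bot's code, and the function name is hypothetical.

```python
import html
import unicodedata


def decode_entities(wikitext: str) -> str:
    """Turn HTML entities such as &nbsp; into the characters they stand for."""
    return html.unescape(wikitext)


decoded = decode_entities("10&nbsp;km")
# The entity becomes the single character U+00A0, which renders like a space
# but is a different code point from an ordinary space (U+0020).
nbsp_name = unicodedata.name(decoded[2])
```

The rendered page is the same in both cases; only the wikitext source changes from six visible characters to one character that editors cannot tell apart from a normal space by eye.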
This section was archived on a request by: —CYBERPOWER (Chat) 18:18, 8 September 2021 (UTC)

Causing error English Wikipedia

Origins Award Winners (2001)". Academy of Adventure Gaming Arts & Design. Archived from the original on 2008-02-02. Retrieved 2007-10-16. More than one of |archiveurl= and |archive-url= specified (help); More than one of |archivedate= and |archive-date= specified (help); More than one of |accessdate= and |access-date= specified (help).--Moxy (talk) 16:35, 6 September 2021 (UTC)

Moxy, Can you provide a diff of a faulty edit? —CYBERPOWER (Chat) 18:21, 8 September 2021 (UTC)
This section was archived on a request by: —CYBERPOWER (Chat) 16:57, 13 September 2021 (UTC)

"MISSING i18n ELEMENT (consolename)"

All elements on the IABot portal have been replaced with the words "MISSING i18n ELEMENT (consolename)".

Some googling suggests this is a language issue. Not sure how to fix it though. Kylesenior (talk) 01:45, 10 September 2021 (UTC)

Kylesenior, this bug was briefly deployed but it should have since been fixed. Please let me know if it still appears for you. Harej (talk) 17:57, 10 September 2021 (UTC)
This section was archived on a request by: Harej (talk) 15:40, 13 September 2021 (UTC)

Wikinews

Hello. I’ve been looking at n:Wikinews:Bots/Requests/InternetArchiveBot, where you are both listed as an operator for the bot. Please could you update us on the current state of the bot request, particularly why it is stalled? What needs to be done to progress the request? Cheers. [24Cr][talk] 10:47, 5 September 2021 (UTC)

Cromium, the request was stalled for inactivity on the bot proposal page. Do you think there is consensus for a test run? If so I would like to proceed with that. Also, do you think you can look at Wikinews:Global bot and update it since the recent change in global bot policy? (Or, clarify that English Wikinews will intentionally not adopt it.) Harej (talk) 18:28, 8 September 2021 (UTC)
@Harej: There is indeed consensus, albeit we only have two users interested in bot approvals at the moment and we are both enthusiastic about this one. A test run is very much awaited. I will have a look at the global bot policy. [24Cr][talk] 19:07, 8 September 2021 (UTC)
Cromium, I've started a trial run for English Wikinews, but the bot is currently down for other reasons. I will continue correspondence on the English Wikinews request for bot approval. Harej (talk) 18:07, 15 September 2021 (UTC)
This section was archived on a request by: Harej (talk) 18:07, 15 September 2021 (UTC)

"Interface Disabled"

Recently I tried to archive some links on one of my Wikipedia articles,[[4]]. However, I then received the following message:

"Interface disabled
The maintainers have disabled this interface. This is 
either for maintenance reasons or a security issue has 
been discovered."

I am now unable to use this tool. What should I do?Dunutubble (talk) 16:29, 16 September 2021 (UTC)

Dunutubble, do you still see that error message? The interface was disabled earlier but should be enabled again. Harej (talk) 03:26, 18 September 2021 (UTC)
Harej, I am happy to say it works.Dunutubble (talk) 13:01, 18 September 2021 (UTC)
Dunutubble thank you for confirming. Harej (talk) 13:29, 18 September 2021 (UTC)
This section was archived on a request by: Harej (talk) 13:29, 18 September 2021 (UTC)

http://www.oie.int & https://www.oie.int

Hello,

Can you check the problem here please?

Thank you, Doc Taxon (talk) 21:50, 8 September 2021 (UTC)

I submitted a bot job. It will get to it in the near future.—CYBERPOWER (Chat) 17:06, 13 September 2021 (UTC)
This section was archived on a request by: —CYBERPOWER (Chat) 16:03, 20 September 2021 (UTC)

New job to fix non-permalink archive.org links?

On the English Wikipedia, it seems that around 1,639 articles link to machine-relative (iaXXXXXXX.us.archive.org) Internet Archive URLs. According to the Internet Archive API documentation, this type of link is not permanent and should instead be replaced with the file's /download/{item}/{file} format. I was planning to write up a script to fix this, but it seems like this would be better suited as a job by the InternetArchiveBot (given that it already performs IA-related tasks). Would it be possible for something like this to see implementation on enwiki (and possibly other wikis in the future)? Chlod (say hi!) 02:01, 11 September 2021 (UTC)

WaybackMedic has fixed 10,209 of these over the past 5 years. It's more difficult than it appears due to edge cases; and some should be /download/ and others /details/. I'll add those 1,600 pages to my todo list. -- GreenC (talk) 22:00, 13 September 2021 (UTC)
There remain about 372 cases that are difficult. They have not been programmed into WaybackMedic, and most of them I have no idea how to parse off hand. If someone was interested in doing the research of how they convert to perma-links I could probably update Medic. Example conversion for view_archive. -- GreenC (talk) 21:00, 14 September 2021 (UTC)
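The simplest form of the conversion discussed in this thread can be sketched as below. It assumes the machine-relative link has the shape host/number/items/{item}/{file}, which is an assumption about the common case only; it makes no attempt at the difficult cases mentioned above (such as view_archive) or at deciding between /download/ and /details/. The function name is hypothetical.

```python
import re
from typing import Optional

# Assumed common shape: http://ia600300.us.archive.org/12/items/<item>/<file>
MACHINE_URL_RE = re.compile(
    r"^https?://ia\d+\.us\.archive\.org/\d+/items/([^/]+)/(.+)$", re.IGNORECASE
)


def to_download_link(url: str) -> Optional[str]:
    """Rewrite a machine-relative archive.org URL to the /download/ form.

    Returns None for anything that does not match the assumed shape, so
    unusual links are left for manual review instead of being guessed at.
    """
    match = MACHINE_URL_RE.match(url)
    if match is None:
        return None
    item, filename = match.groups()
    return f"https://archive.org/download/{item}/{filename}"
```

Returning None for unmatched links keeps the edge cases out of an automated run, in line with the caution expressed in the replies above.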
This section was archived on a request by: 20:26, 20 September 2021 (UTC)

Interface still disabled

Interface is still disabled. I have confirmed with others offwiki that this is not isolated to just me. Sennecaster (talk) 12:57, 20 September 2021 (UTC)

The interface is back up. It was turned off due to an outage on archive.org.—CYBERPOWER (Chat) 16:06, 20 September 2021 (UTC)
Sennecaster, Actually, never mind, the power went out at the datacenter just now, and we were forced to turn it off again. —CYBERPOWER (Chat) 16:10, 20 September 2021 (UTC)
Sennecaster, the management interface is now back up. Harej (talk) 20:24, 20 September 2021 (UTC)
This section was archived on a request by: 20:26, 20 September 2021 (UTC)

dead-url on ku.wiki

Hello, I'm planning to update our CS1 modules to the current version of en.wiki. The only problem I have at the moment is the fact that InternetArchiveBot uses |dead-url=yes/no instead of |url-status=dead/live. Can this be configured using the configuration page? (I'm not an admin) If not, what can we do about this? Also if I translate all parameters to Kurdish, can the bot be configured to use the local parameters? Balyozxane (talk) 21:16, 15 September 2021 (UTC)

Balyozxane, If you are using the CS1 module, you can safely update it now, and the bot should automatically adjust to the change, without you needing to worry about it. I do however recommend that you maintain older parameters for backwards compatibility, otherwise the bot may end up duplicating values in the template. —CYBERPOWER (Chat) 16:05, 20 September 2021 (UTC)
This section was archived on a request by: 16:09, 27 September 2021 (UTC)

Transcluded urls

Does the bot work on URLs transcluded by a template, or does it only work on raw urls in the wikitext? Notsniwiast (talk) 16:38, 20 September 2021 (UTC)

Notsniwiast, the bot works through raw URLs that are in the page source. Unless the URL is brought in automatically from Wikidata, in which case the archive URL is updated on Wikidata. Harej (talk) 20:05, 20 September 2021 (UTC)
This section was archived on a request by: 16:09, 27 September 2021 (UTC)

blocked from editing

I can't run the bot. It says Analysis error: blocked: You have been blocked from editing. How to fix this? --Clog Wolf (talk) 05:40, 22 September 2021 (UTC)

Sorry about that. This is unfortunately a known problem with no solution yet. It happens at random, but eventually goes away on its own. You should be fine now.—CYBERPOWER (Chat) 18:17, 22 September 2021 (UTC)
This section was archived on a request by: 16:09, 27 September 2021 (UTC)

dead domain silverkeypress.org

It's dead Dave. silverkeypress.org Thanks.  — billinghurst sDrewth 11:45, 22 September 2021 (UTC)

Billinghurst, It's marked dead accordingly. The bot will eventually get around to it. —CYBERPOWER (Chat) 18:20, 22 September 2021 (UTC)
This section was archived on a request by: 16:09, 27 September 2021 (UTC)

Interface not loading

For the last ~3 hours, the management interface has not been loading for me.

When I try to go to https://iabot.toolforge.org/index.php?page=runbotsingle, the page neither loads nor times out; it just hangs for hours.

What's up? --BrownHairedGirl (talk) 12:20, 25 September 2021 (UTC)

It's working again now. --BrownHairedGirl (talk) 13:23, 25 September 2021 (UTC)
This section was archived on a request by: 16:09, 27 September 2021 (UTC)

Bot duplicating archive links

While cleaning out en:Category:CS1 errors: redundant parameter, I noticed that the bot is adding duplicate archive-url/archive-date/access-date parameters - see this example. GoingBatty (talk) 15:57, 24 September 2021 (UTC)

I reported this bug at phab:T291704. --BrownHairedGirl (talk) 17:07, 24 September 2021 (UTC)
This section was archived on a request by: —CYBERPOWER (Chat) 18:17, 29 September 2021 (UTC)

Bot Changes image parameters

Hello. On ckb Wikipedia the bot changes image parameters, and even their original sizes [1][2][3][4]. Can we fix this problem? Thank you 🌸 Sakura emad 💖 (talk) 18:00, 27 September 2021 (UTC)

I opened a bug report for this problem. How often is this occurring?—CYBERPOWER (Chat) 18:37, 29 September 2021 (UTC)
Sakura emad, if you have any other diffs where this happens please share them. Harej (talk) 18:02, 7 October 2021 (UTC)
I am not really sure how often, but I see it from time to time. 🌸 Sakura emad 💖 (talk) 18:22, 7 October 2021 (UTC)
This section was archived on a request by: Harej (talk) 20:00, 7 October 2021 (UTC)

IABot inappropriately adding "Archived copy" as citation title

In this edit, it appears that IABot added "Archived copy" as the |title= value in multiple citation templates. If so, it should not do that. It should either retrieve an appropriate title from the page's <title>...</title> block or leave it blank. Jonesey95 (talk) 21:47, 27 September 2021 (UTC)

Unfortunately, the CS1 Module requires the presence of a title parameter or it will leave big, ugly, red error messages. Due to IABot's line of work, querying for the <title> block is not a reliable solution. In addition, to conserve resources IABot does not make full HTML requests to servers and instead pulls only the headers. As such, the bot doesn't have access to the title, and I am forced to use a placeholder such as "Archived copy", as no other solution exists at this time.—CYBERPOWER (Chat) 18:43, 29 September 2021 (UTC)
This section was archived on a request by: Harej (talk) 18:02, 7 October 2021 (UTC)

Bot needed on Ks Wiki

I was wondering if you could run this on Kashmiri Wikipedia. It would be really useful. Thankyou Iflaq (talk) 15:27, 1 October 2021 (UTC)

Iflaq, I have enabled the bot on Kashmiri Wikipedia. As part of the process I imported the Webarchive template. I recommend localizing the English strings in the data subpage. Harej (talk) 16:21, 4 October 2021 (UTC)
This section was archived on a request by: Harej (talk) 16:21, 4 October 2021 (UTC)

Bot creating duplicate parameters

On en.wiki, the bot is adding duplicate parameters for access-date (when accessdate already exists), and seems to be using made up dates for the access date. e.g. [5], [6], [7]. This didn't used to be the case, any idea why this is happening? Joseph2302 (talk) 10:38, 28 September 2021 (UTC)

This is, unfortunately, a known bug, and apparently also a widespread one. I have shut down the bot on enwiki for now, pending a bug fix.—CYBERPOWER (Chat) 18:48, 29 September 2021 (UTC)
Still unresolved and bot still running, see [8], [9], [10] etc --John B123 (talk) 20:43, 11 October 2021 (UTC)
John B123, batch jobs submitted by users have now been disabled. Harej (talk) 01:37, 12 October 2021 (UTC)
@Harej: Thanks. --John B123 (talk) 06:27, 12 October 2021 (UTC)
This section was archived on a request by: Harej (talk) 15:24, 12 October 2021 (UTC)

Wrong parameters for cite journal template on ru-wiki

There is no «ссылка» parameter in cite journal template on ru-wiki: [11]. url parameter should be used for this template. --Agra (talk) 12:48, 10 October 2021 (UTC)

This needs to be addressed by GreenC.—CYBERPOWER (Chat) 16:08, 11 October 2021 (UTC)
The bot has been improved: it can now detect the language of the citation and use the key name appropriate for that language, for any key or language. This will solve the problem with ruwiki. I have also fixed all the errors on ruwiki [12]. -- GreenC (talk) 15:15, 12 October 2021 (UTC)
This section was archived on a request by: GreenC (talk) 15:16, 12 October 2021 (UTC)

parameter

Hello, here the bot removed a Bengali parameter and added an English parameter. It would be great if the bot did the opposite for bnwiki (it usually does, but I'm not sure what happened here). Also, how can "Adding 1 book for যাচাইযোগ্যতা" be localized? Thanks আফতাবুজ্জামান (talk) 01:49, 14 October 2021 (UTC)

I believe this is related to GreenC's work. Harej (talk) 18:54, 14 October 2021 (UTC)
That is related to the bug immediately above this thread, where it added a local (Russian) language version of |url= when the rest of the citation is using English language, which caused a problem. So it went back and corrected so the key names are all English (when everything else is English). If you want to convert all key names to Bengali that is fine, but I don't think it is a good idea to mix Bengali and English keynames as it can be error prone like the Russian example above. These edits were one time, small in number, and will not be made again since the bot can now detect to use English or Bengali when adding a new keyname. -- GreenC (talk) 19:32, 14 October 2021 (UTC)
All right. --আফতাবুজ্জামান (talk) 21:11, 14 October 2021 (UTC)
If you can tell me the Bengali for "Adding 1 book for verification" or "Adding 2 books for verification" I will update it, currently no interface to localize this string. -- GreenC (talk) 19:34, 14 October 2021 (UTC)
GreenC:
  • "Adding 1 book for verification" → [[উইকিপিডিয়া:যাচাইযোগ্যতা|যাচাইযোগ্যতার]] জন্য ১টি বই যোগ করা হল
  • "Adding num books for verification" → [[উইকিপিডিয়া:যাচাইযোগ্যতা|যাচাইযোগ্যতার]] জন্য numটি বই যোগ করা হল (no space between num & টি)
-- আফতাবুজ্জামান (talk) 21:11, 14 October 2021 (UTC)
Got it, thanks! -- GreenC (talk) 21:23, 15 October 2021 (UTC)
This section was archived on a request by: GreenC (talk) 21:23, 15 October 2021 (UTC)

Dead url that is actually dead

Hi, this edit is an error. That link is in fact a dead url ("urlmorto"). Bye, --Martin Mystère (talk) 18:57, 18 October 2021 (UTC)

Martin Mystère, thank you for the report. The next time the bot scans that link it should be recognized as dead. Please let me know if it doesn't or other issues arise. Harej (talk) 18:25, 20 October 2021 (UTC)
Harej: Thank you. --Martin Mystère (talk) 18:28, 20 October 2021 (UTC)
This section was archived on a request by: 21:00, 20 October 2021 (UTC)

English Wikinews

Hello. Just to let you know I’ve approved this bot for operating on English Wikinews. Please update as appropriate and let the bot run on that wiki. Cheers. [24Cr][talk] 07:24, 30 October 2021 (UTC)

Thank you for the update Cromium. Please let me know if you run into any issues with the bot on Wikinews. Harej (talk) 16:11, 30 October 2021 (UTC)
This section was archived on a request by: Harej (talk) 16:12, 30 October 2021 (UTC)

Dutch Wikipedia

Hi, I just saw this link mentioned on a talk page. It was added in January 2020, so I might be a bit late, but apparently the information on the source page was not correctly archived: at least, I cannot unfold the boxes and read the information. This is a pharmaceutical website (apotheek.nl), often referred to by Dutch GPs. Ciell (talk) 15:51, 5 November 2021 (UTC)

Thank you for this information. Unfortunately, there is nothing IABot can do about this particular issue, but what we have done is gone ahead and reported this URL to the Wayback Machine engineers. They may be able to come up with a solution to fix this problem with archiving sites like these going forward. —CYBERPOWER (Chat) 19:20, 10 November 2021 (UTC)
This section was archived on a request by: Harej (talk) 19:12, 17 November 2021 (UTC)

Wrong edit on it.wiki

Hi, sorry to bother; the bot replaced a working internet archive url with a non-working url, here. The problem had been already reported (and allegedly solved) in June. --Syrio (talk) 14:46, 8 November 2021 (UTC)

@Syrio: thank you for this information. The problem is that there is an "if_" directly after the snapshot timestamp in the archive URL, which the bot does not recognize, so it treats the archive URL as invalid. The "if_" is there so the Wayback Machine header doesn't load. There are other aliases for this function that achieve the same result. If you replace "if_" with "id_", you should get the same result, and the bot will recognize it and leave it alone. I have put in a fix for this, which will be released in the next update of IABot. —CYBERPOWER (Chat) 19:27, 10 November 2021 (UTC)
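The workaround suggested above (swapping "if_" for "id_" right after the snapshot timestamp) can be sketched as follows. This is only an illustration under stated assumptions: the function name `replace_if_modifier` and the example.com URL are hypothetical, and the pattern assumes the usual web.archive.org/web/<timestamp><modifier>/<original URL> layout.

```python
import re

# Matches a Wayback Machine snapshot prefix whose timestamp is directly
# followed by the "if_" modifier discussed in this thread.
_IF_MODIFIER = re.compile(r"^(https?://web\.archive\.org/web/\d{1,14})if_/")


def replace_if_modifier(archive_url: str) -> str:
    """Swap the "if_" modifier for "id_"; leave every other URL unchanged."""
    return _IF_MODIFIER.sub(r"\g<1>id_/", archive_url, count=1)
```

For example, `replace_if_modifier("https://web.archive.org/web/20200101000000if_/http://example.com/a")` returns the same URL with `id_` in place of `if_`, while a URL without the modifier is returned as-is.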
This section was archived on a request by: Harej (talk) 19:12, 17 November 2021 (UTC)

Bot should escape two (and more) apostrophes in archived url

Because otherwise it breaks the url:

Before the bot: Claus Sluter’s ‘Well of Moses’ for the Chartreuse de Champmol reconsidered: part II

After the bot: %20part%20II'.pdf Claus Sluter’s ‘Well of Moses’ for the Chartreuse de Champmol reconsidered: part II

diff MarMi wiki (talk) 18:15, 10 November 2021 (UTC)

Thank you for reporting this. I have gone ahead and pushed out an update to forcibly encode ' going forward. —CYBERPOWER (Chat) 19:52, 10 November 2021 (UTC)
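As background for the fix described above: in wikitext, two or more consecutive apostrophes start italic or bold markup, so a raw '' inside a URL cuts the link short. A minimal sketch of the encoding step follows; the function name `encode_apostrophes` and the sample URL are hypothetical, and this is not the bot's actual code.

```python
def encode_apostrophes(url: str) -> str:
    """Percent-encode every ASCII apostrophe so wikitext cannot read '' as markup."""
    return url.replace("'", "%27")
```

With this, `encode_apostrophes("http://example.com/a''b.pdf")` gives `http://example.com/a%27%27b.pdf`, which the wiki parser treats as a single unbroken URL.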
This section was archived on a request by: Harej (talk) 19:12, 17 November 2021 (UTC)

Bot down

The bot appears to have been down for the last 12 hours. No edits on en.wp, and interface disabled.

What's up? --BrownHairedGirl (talk) 15:46, 11 November 2021 (UTC)

We are having technical issues with the Wayback Machine. —CYBERPOWER (Chat) 15:54, 11 November 2021 (UTC)

Trouble again. The bot is still accepting and processing single page requests (e.g. [13], a few minutes ago) ... but the job queue has stalled on batch jobs, after processing the first 102 items of Job 8965, a zh.wp job.

The most recent zh.wp bot edit was at 23:32, 22 November 2021.

The most recent en.wp bot edit was at 21:07, 22 November 2021.

What's up? --BrownHairedGirl (talk) 10:45, 24 November 2021 (UTC)

Hello BrownHairedGirl, we took the bot down because it was overwhelming Wikipedia's servers. It will return once we have fixed the issue causing this. Harej (talk) 19:16, 24 November 2021 (UTC)
Bot's still down. Any ETA on when it'll be back up? Whoop whoop pull up (talk) 17:04, 4 December 2021 (UTC)
Hopefully by the end of the day today. —CYBERPOWER (Chat) 19:16, 8 December 2021 (UTC)
@Whoop whoop pull up: ^ —CYBERPOWER (Chat) 19:17, 8 December 2021 (UTC)
This section was archived on a request by: Harej (talk) 19:19, 8 December 2021 (UTC)

Bad books.google to archive.org replacement

This 2020-03-19T00:56:00 en.wikipedia edit replaced

The only overlap I can see is "Pittsburgh Press" Jnestorius (talk) 19:19, 30 November 2021 (UTC)

That was an error 1.5 years ago. The bot doesn't touch Google Books links anymore. I think the fundamental problem is that the citation is poorly conceived, relying almost entirely on the URL for verifiability: take away the Google URL and you are left with {{Cite book|title=The Pittsburgh Press|publisher=The Pittsburgh Press|language=en}}, which is an unverifiable citation that would need to be deleted. Google Books has been unreliable, so we can expect that in the future the URL will become a 404 and the citation at that point would have to be deleted. The bot's edit was an attempt to make the citation better, but it failed due to minimal information. It probably should have pulled the date from the Google Books main page to make the match more accurate; that could be fixed, but it doesn't do Google Books stuff anymore. The correct action is to fill in more details. This was done by Jnestorius. -- GreenC (talk) 20:12, 1 December 2021 (UTC)
This section was archived on a request by: Harej (talk) 19:22, 8 December 2021 (UTC)

phab:T291704

Is there any chance of a fix to phab:T291704? It was reported over 6 weeks ago.

It's quite a significant issue, because it creates cite errors in a significant number of articles. Cleaning up afterwards is tedious and time-consuming, and pre-emptive fixes (by adding the dashes) trigger complaints of cosmetic edits. --BrownHairedGirl (talk) 21:33, 11 November 2021 (UTC)

@Harej: does IABot take pull requests? That way, I or anyone else could potentially submit a fix for the issue ourselves.
That ping didn't work since you didn't sign your posts, but IABot's source code is at https://github.com/internetarchive/internetarchivebot and (AFAICS) is accepting pull requests. * Pppery * it has begun 17:25, 14 November 2021 (UTC)
Yes, this is a priority, but, unfortunately, this is taking longer to get fixed. —CYBERPOWER (Chat) 19:18, 24 November 2021 (UTC)
Thanks, Cyberpower678. Good to know that it is a priority. It will be great to have it fixed. --BrownHairedGirl (talk) 20:38, 27 November 2021 (UTC)

@Harej: this will not be resolved until the bug is fixed. --BrownHairedGirl (talk) 04:58, 9 December 2021 (UTC)

@Harej and @en:User:Cyberpower678: it appears that the phab:T291704 bug may have been fixed, at least in respect of |accessdate=
See these 2 bot edits:
  1. https://en.wikipedia.org/w/index.php?diff=prev&oldid=1059937012 : bot adds archive links without duplicating existing unhyphenated parameter "accessdate"
  2. https://en.wikipedia.org/w/index.php?diff=prev&oldid=1059934705 - bot adds archive link to https://legalcounselbd.com/by-executing-cht-peace-accord-bangladesh/ without duplicating existing unhyphenated parameter "accessdate"
Please can you respond to say what has been changed? And please also respond at phab:T291704? --BrownHairedGirl (talk) 15:27, 12 December 2021 (UTC)
Responded on your talk on enwiki. The update fixes the problem mentioned in the talk page. —CYBERPOWER (Chat) 19:14, 15 December 2021 (UTC)
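To make the bug class in this thread concrete, here is a minimal sketch of alias-aware parameter insertion: before adding |access-date=, check whether the unhyphenated |accessdate= is already present. This is not the fix that shipped in IABot; the `ALIASES` table and the function name `set_param` are assumptions for illustration, and the table lists only the parameters mentioned in these reports.

```python
# Hypothetical alias table: unhyphenated legacy names mapped to the
# hyphenated names mentioned in the bug reports on this page.
ALIASES = {
    "accessdate": "access-date",
    "archiveurl": "archive-url",
    "archivedate": "archive-date",
}


def canonical(name: str) -> str:
    """Map a parameter name to its hyphenated form, if it has one."""
    return ALIASES.get(name, name)


def set_param(params: dict, name: str, value: str) -> dict:
    """Add a parameter unless it, or an alias of it, is already present."""
    target = canonical(name)
    for existing in params:
        if canonical(existing) == target:
            return dict(params)  # already set under some spelling: no duplicate
    updated = dict(params)
    updated[name] = value
    return updated
```

Given a citation that already has `accessdate`, calling `set_param(citation, "access-date", ...)` returns it unchanged instead of producing the redundant-parameter error described above.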
This section was archived on a request by: Harej (talk) 19:14, 15 December 2021 (UTC)

Useless page link

Is it possible to teach a bot not to link a page if the book is presented only in a download format like here? --Yellow Horror (talk) 09:24, 12 November 2021 (UTC)

This is something for GreenC to address. I will point him here, and he should respond soon. —CYBERPOWER (Chat) 19:19, 24 November 2021 (UTC)
I don't know what "a download format" means. The URL [14] opens to page 436 of the book. Similar to Google Books preview, some pages are available to view in full. In this case, pages 437 and 437 can be read. -- GreenC (talk) 20:37, 24 November 2021 (UTC)
@Yellow Horror: Does this answer your question? —CYBERPOWER (Chat) 19:19, 8 December 2021 (UTC)
This section was archived on a request by: Harej (talk) 19:15, 15 December 2021 (UTC)

Massive load to Wayback Machine

Hi,

Is it possible to bulk-load hyperlinks into the Wayback Machine?

Kind regards ArturM (talk) 11:54, 4 December 2021 (UTC)

ArturM, the Wayback Machine allows bulk upload through an API as well as with Google Sheets. Harej (talk) 20:06, 8 December 2021 (UTC)
This section was archived on a request by: Harej (talk) 19:30, 15 December 2021 (UTC)

Ref errors

I don't know much about archiving or bots, but I noticed a series of edits on December 4, some of which don't seem quite right. If you go through these edits, some have left ref errors that have since been corrected by others. The worst example is this one, where all the bare-URL references were left less readable than they were before. Other edits have caused other kinds of errors. I have no idea what the root of this is. MB (talk) 22:27, 5 December 2021 (UTC)

@MB: I'm not really seeing the problem. I'm seeing proper conversions to citation templates, but these URLs have no titles, so a placeholder is being inserted due to a requirement of the citation template. The bot is being invoked by an individual. It's best to contact them about what they are choosing to do with the bot on these articles. If you have any further questions, please feel free to reach out. —CYBERPOWER (Chat) 19:48, 8 December 2021 (UTC)
This section was archived on a request by: Harej (talk) 19:30, 15 December 2021 (UTC)

Failed dead link fix in February 2021

Good day, I have noticed that this bot failed to properly fix a link on the English Wikipedia article "1955 doubled die cent" in February 2021. The reason for this failure appears to be that the website cited renamed its URL, left a redirect, and later deleted the redirect to the new URL, resulting in the dead link. The bot appears to have tried to access an archive made after the cited website deleted the redirect, and therefore ran into a 404 error. The cited webpage still exists under its new URL, and I have manually fixed it (see here), but I would suggest, if possible, that the bot's code be edited so that it can try an older archive (if one exists) if it runs into a dead archive link while trying to fix dead links.
58.107.92.93 05:12, 3 December 2021 (UTC)

Thank you for fixing the URL. Unfortunately, what you are asking is not currently possible with the bot. The bot simply checks whether the destination of the URL is a working page. If it doesn't work, the bot can't know that the page exists elsewhere, and it will attempt to attach an archive of the page. Failing that, it will simply mark the URL dead. We will consider automatic URL updates to external links as a future feature of IABot. —CYBERPOWER (Chat) 19:32, 8 December 2021 (UTC)
I believe that you may have misunderstood what I was suggesting the bot do. In the above case, the page was archived several times since 2000, when the original URL worked, and the reason why the fix failed was because the bot appears to have tried to access a more recent archive created after the URL was deleted, then gave up and marked the link as permanently dead. I was suggesting that, in the event that the bot fails to find a working archived URL at a more recent date, it instead should continue trying increasingly old archives until it either finds one that works, or it can no longer find any more, and only then should it mark the link as permanently dead. I hope that this will be possible to implement.
58.107.92.93 00:26, 12 December 2021 (UTC)
We discovered that the Wayback Machine was the underlying cause of the problem. We have reset the bot and the next time it should use the correct archives. Harej (talk) 20:06, 15 December 2021 (UTC)
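The fallback suggested by the IP editor above (keep trying older snapshots before marking a link permanently dead) can be sketched as a simple loop. This illustrates the suggestion only and does not describe how IABot selects archives; the function name `pick_working_snapshot`, the newest-first ordering of the input, and the `is_working` callback are all assumptions.

```python
from typing import Callable, Iterable, Optional


def pick_working_snapshot(
    snapshots: Iterable[str],
    is_working: Callable[[str], bool],
) -> Optional[str]:
    """Return the first snapshot that passes the check, or None.

    `snapshots` is assumed to be ordered newest first, so the loop walks
    back in time and stops at the most recent snapshot that still works.
    """
    for archive_url in snapshots:
        if is_working(archive_url):
            return archive_url
    return None  # no usable archive: only now would the link be marked dead
```

Passing the check in as a function keeps the sketch testable without network access; a real checker would have to fetch each snapshot and detect 404s and soft-404s.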
This section was archived on a request by: Harej (talk) 19:05, 22 December 2021 (UTC)

Fatal error

For the last ~5 hours, the interface page https://iabot.toolforge.org/index.php?page=runbotsingle has consistently given the following output:

Fatal Error: You are upgrading from v2.0.8.3 to v2.0.8.4. This update requires a clean install, or an update script. Please run update.php in the source root directory.

I just wanted to be sure that the bot owners are aware of this. --BrownHairedGirl (talk) 00:52, 11 December 2021 (UTC)

BrownHairedGirl, this has been fixed. Harej (talk) 14:57, 11 December 2021 (UTC)
Thanks, @Harej! Is there any doc on what changes have been made in the new version? --BrownHairedGirl (talk) 15:14, 11 December 2021 (UTC)
BrownHairedGirl, 2.0.8.4 has one change, to address T291704. Harej (talk) 19:34, 15 December 2021 (UTC)
This section was archived on a request by: Harej (talk) 19:05, 22 December 2021 (UTC)

bewiki

Hi! This bot does not work on the Belarusian Wikipedia in the mode of archiving/fixing dead links; it only bluelinks books. Do you know why? Our administrator said that in Configure (https://iabot.toolforge.org/index.php?page=wikiconfig) we have "Scan for dead links: true", so it should work. We also have the page "User:InternetArchiveBot/Dead-links.js". ZlyiLev (talk) 09:07, 12 December 2021 (UTC)

ZlyiLev, thank you for bringing this to our attention. This has now been fixed and the bot is running on Belarusian Wikipedia. Harej (talk) 20:02, 15 December 2021 (UTC)
This section was archived on a request by: Harej (talk) 19:05, 22 December 2021 (UTC)

InternetArchiveBot is again percent-encoding Chinese characters

Hi Cyberpower678 (talk · contribs) and Harej (talk · contribs). I have posted several messages about InternetArchiveBot's percent-encoding of Chinese characters at en:User talk:Cyberpower678:

  1. 6 June 2021
  2. 26 June 2021
  3. 5 July 2021

This week, InternetArchiveBot has made a series of edits (1, 2, 3, and 4) to percent encode Chinese characters again, including in articles where I had previously reverted the bot.

As an example, the bot changed:

https://web.archive.org/web/20210308022025/https://paper.hket.com/article/2309597/61歲YouTuber張媽媽%20擁30萬粉絲吸百萬瀏覽

to:

https://web.archive.org/web/20210308022025/https://paper.hket.com/article/2309597/61%E6%AD%B2YouTuber%E5%BC%B5%E5%AA%BD%E5%AA%BD%20%E6%93%8130%E8%90%AC%E7%B2%89%E7%B5%B2%E5%90%B8%E7%99%BE%E8%90%AC%E7%80%8F%E8%A6%BD

Percent encoding these non-English language characters makes them unreadable. This makes it harder to catch whether the wrong URL was used or to detect from the URL a summary of what a source says. Would you modify the bot not to percent-encode the non-English characters? I do not want to have to keep reverting the bot when it makes these edits.

On a separate issue, in this edit, the bot set https://hk.appledaily.com/special/20181209/CSERSVHJ6LIXE6I3QKR36PUG54/ with |url-status=live but the URL has been dead since 24 June 2021 because of en:Apple Daily#2021 arrests and closure. Would you fix this error too? Thanks, Cunard (talk) 11:24, 30 December 2021 (UTC)

The bot has a switch in the configuration settings for normalizing the URLs. I've disabled it for EnWiki. It should stop making these changes. —CYBERPOWER (Chat) 19:37, 5 January 2022 (UTC)
I have also marked the domain hk.appledaily.com as permadead. The bot is now going around to update pages with these links. —CYBERPOWER (Chat) 19:45, 5 January 2022 (UTC)
Thank you so much for disabling it for the English Wikipedia and marking the hk.appledaily.com domain as disabled, Cyberpower678 (talk · contribs). This helps a lot! I really appreciate your making this change as I like the bot a lot and had no concerns with its changes other than the percent-encoding the bot did. What was the reasoning for normalizing the URLs? Would it make sense to disable normalization globally as it seems that most language Wikipedias would prefer seeing the actual characters in their language instead of the percent-encoded text? Unless there is a technical reason I'm unaware of, I think it would be better for the bot to "un-normalize" rather than normalize the URLs (in other words, convert the URLs from having percent-encoded characters to having the actual characters in the language).
As an example, in Nk (talk · contribs)'s comment here in June 2021 regarding the Bulgarian Wikipedia, Nk wrote, "that type of encoding which is also unwanted". This was in reference to InternetArchiveBot changing:
https://web.archive.org/web/20111004140141/http://xfactor.novatv.bg/people/view/12/Михаела-Филева/
to:
https://web.archive.org/web/20111004140141/http://xfactor.novatv.bg/people/view/12/%D0%9C%D0%B8%D1%85%D0%B0%D0%B5%D0%BB%D0%B0-%D0%A4%D0%B8%D0%BB%D0%B5%D0%B2%D0%B0/
Cunard (talk) 09:22, 6 January 2022 (UTC)
It used to be a limitation of the bot, but in an update I was able to somewhat overcome this limitation. The bot requires the encoded URLs to properly query the servers and to properly format archive URLs. Decoding them and printing them in their decoded form has been shown to cause issues with some URLs in the past. While it would be trivial to decode them before printing them, it may leave the URL broken in some cases (defeating the purpose of the bot). Archive.today is a good example of this. Those URLs are stored exactly as they were encountered, so decoding one may break the URL. Conversely, encoding it may also break the URL. So with archive.today, the bot retains the precise formatting in those instances. I am designing an entirely new version of IABot, which will be known as InternetArchiveBot 3. It will have a completely new code structure which will be a lot more flexible in handling references, URLs, and other stuff, as well as offer better expandability of feature sets. Of course, normalizing URLs will no longer be an option, and printing decoded URLs will be the default. —CYBERPOWER (Chat) 14:09, 6 January 2022 (UTC)
Thank you so much for providing this background information and context! I am excited about your future work on InternetArchiveBot 3, which will print decoded URLs by default and have more flexibility. I really appreciate your sharing this information and making these improvements, Cyberpower678 (talk · contribs). Cunard (talk) 11:17, 7 January 2022 (UTC)
@Cunard: Gladly, but IABot 3 has not even yet started development. It will be a while before even a testable version of the bot is available. In addition to that, I'm going to need more manpower to work on this. IABot 3 will be completely written from scratch. —CYBERPOWER (Chat) 18:19, 7 January 2022 (UTC)
@Cyberpower678: A complete rewrite is a really large undertaking and you are very busy with a lot of things including the many questions you get on this talk page, so I anticipate that IABot 3 will take quite some time to be rolled out. Thank you again for the future development of IABot 3 as IABot 2 already is doing so much good for the wiki in combating link rot and for the transparency in sharing further details about IABot 3! Cunard (talk) 11:18, 8 January 2022 (UTC)
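The two URL forms debated in this thread can be reproduced with Python's standard library, using the Bulgarian example quoted above. This is a sketch of percent-encoding in general, not of IABot's normalization code; the function names `normalize` and `display_form` are hypothetical.

```python
from urllib.parse import quote, unquote

# Both strings are taken from the Bulgarian Wikipedia example in this thread.
DECODED = (
    "https://web.archive.org/web/20111004140141/"
    "http://xfactor.novatv.bg/people/view/12/Михаела-Филева/"
)
ENCODED = (
    "https://web.archive.org/web/20111004140141/"
    "http://xfactor.novatv.bg/people/view/12/"
    "%D0%9C%D0%B8%D1%85%D0%B0%D0%B5%D0%BB%D0%B0-"
    "%D0%A4%D0%B8%D0%BB%D0%B5%D0%B2%D0%B0/"
)


def normalize(url: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8, keeping ':' and '/'.

    Caveat: a URL that already contains escapes such as %20 would be
    double-encoded by this naive version.
    """
    return quote(url, safe=":/")


def display_form(url: str) -> str:
    """Decode percent-escapes back into readable characters."""
    return unquote(url)
```

Here `normalize(DECODED)` equals `ENCODED` and `display_form(ENCODED)` equals `DECODED`. The caveat in `normalize` hints at why a blanket round trip is risky: `unquote` also turns an existing %20 into a literal space, so a decoded URL is not always a usable link.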
This section was archived on a request by: Harej (talk) 19:03, 12 January 2022 (UTC)

wheelchair-information.com expired domain

Hi. The site wheelchair-information.com is used as a reference through the WPs, though now is a dead url as an expired domain registration. Thanks.  — billinghurst sDrewth 00:40, 31 December 2021 (UTC)

Domain has been marked as permadead. Thank you. —CYBERPOWER (Chat) 19:47, 5 January 2022 (UTC)
This section was archived on a request by: Harej (talk) 19:04, 12 January 2022 (UTC)

Ukrainian month names

Urgent!!!

This bot makes incorrect changes, namely replaces the genitive case of the month in dates with the nominative one: example.


The same in Ukrainian: Цей бот здійснює некоректні зміни, а саме замінює родовий відмінок місяця в датах називним: приклад.

Best regards, Рассилон (talk) 10:56, 16 December 2021 (UTC)

@Harej: Some of more recently: uk:Special:Diff/34108978/34127380. --Рассилон (talk) 09:53, 19 December 2021 (UTC)
Рассилон, the bot had been disabled on Ukrainian Wikipedia, but it was re-enabled by User:A1. Because the underlying problem that led to the bot to be disabled (the one you brought up) was not fixed I have re-disabled it. I recommend not re-enabling the bot on Ukrainian Wikipedia until this bug is fixed. For reference the bug is T288495. Harej (talk) 19:11, 22 December 2021 (UTC)
That is not a good decision, because archiving references is really useful work that could not be done by anyone else. At the same time, getting the grammatical case wrong is not very harmful and is really very easy to fix with other bots (Andriy.vBot and Aibot have succeeded at it). A1 (talk) 20:15, 22 December 2021 (UTC)
A1, the main issue is whether there is consensus for the bot to continue operating in spite of the bug. Default policy is to not operate a bot on a wiki where there is a bug that defaces articles. If we can get those bot developers to actively clean up after the buggy edits then that could be a path to us re-enabling the bot. Could you reach out to them and ask them to comment here? Harej (talk) 19:15, 5 January 2022 (UTC)
This section was archived on a request by: 17:17, 26 January 2022 (UTC)

Removing archive url selection

[15] - what do you think you're doing? The selection thing was there for a reason. Please revert this edit and fix any mess you've done. 2A00:F41:484C:C715:6202:869A:4523:B7E1 10:41, 24 December 2021 (UTC)

Because the URL fragment was missing in the original URL, it was not carried through. The bot matches the archive URL to the original. You need to include the fragment in the original URL as well. —CYBERPOWER (Chat) 17:33, 29 December 2021 (UTC)
I don't get it. Isn't the "selection" fragment specific to archive.today? It's not a part of the original url but rather a handy feature offered by the archiving service. 2A00:F41:4861:72EE:FB09:8D75:ACEF:F077 22:41, 29 December 2021 (UTC)
I’m not aware of such a feature. —CYBERPOWER (Chat) 22:44, 29 December 2021 (UTC)
Then stop messing with our urls, please. 2A00:F41:48E6:5C79:2BC:84BA:F67C:8AFF 10:33, 31 December 2021 (UTC)

I am using that feature, too. It automatically highlights specific parts of the article - in this case, the section about the year 1967. You can use it by marking text on the archived page, which will change the url accordingly. Such additions should indeed be left alone by the bot. Renerpho (talk) 04:24, 2 January 2022 (UTC) -- And of course any changes made by the bot related to this in the past should be fixed. Renerpho (talk) 04:30, 2 January 2022 (UTC)

Just out of curiosity, is this a new feature of archive.today? In other words, when was it introduced? —CYBERPOWER (Chat) 20:04, 5 January 2022 (UTC)
Renerpho, pinging in case you didn't see this. Harej (talk) 19:59, 12 January 2022 (UTC)
This is a new-ish feature ca. 2020. Created a Phab task T299438-- GreenC (talk) 18:54, 18 January 2022 (UTC)
This section was archived on a request by: Harej (talk) 17:18, 26 January 2022 (UTC)