User talk:InternetArchiveBot/Archives/2024

Why does the bot encode decoded links?

It's wrong for links containing non-ASCII characters, because it makes links less readable: https://ru.wikipedia.org/w/index.php?title=Рыбий_клей&diff=prev&oldid=133979619 MBH (talk) 19:32, 9 January 2024 (UTC)

One more case. @Cyberpower678 @GreenC MBH (talk) 13:41, 16 January 2024 (UTC)
MBH, the bot is required to encode links to look them up. A future version will not do this. Harej (talk) 20:52, 27 March 2024 (UTC)
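For readers unfamiliar with the encoding discussed in this thread, the following is a minimal Python sketch using only the standard library's `urllib.parse`. It is not the bot's own code, and the variable names are hypothetical. It only illustrates that the readable title from the diff linked above and its percent-encoded form refer to the same page, so a link can be encoded for a lookup and decoded again for display.

```python
from urllib.parse import quote, unquote

# Readable page title taken from the diff linked at the top of this thread.
readable = "Рыбий_клей"

# Percent-encode the UTF-8 bytes of every non-ASCII character.
encoded = quote(readable)

# Decoding restores the readable form, so nothing is lost in either direction.
restored = unquote(encoded)

print(encoded)
print(restored == readable)
```

Because the round trip is lossless, encoding a URL for an internal lookup does not in principle require writing the encoded form back to the page.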
This section was archived on a request by: Harej (talk) 18:28, 3 April 2024 (UTC)

IABot not functioning normally and times out

I have experienced problems with the IABot not functioning as normal, and timing out. I hope this can be investigated and resolved. Marshelec (talk) 06:32, 12 January 2024 (UTC)

Same issue here. For the past 2–3 days I have been getting a 504 Gateway Timeout error on about 95% of my runs, even though IABot only needed to archive 1–3 sources per run. On the surface, it looks as if it is iterating through every single source regardless of whether it has already been archived. Paper9oll (talk) 07:25, 14 January 2024 (UTC)
A ticket has already been raised for this problem, by another user. See: https://phabricator.wikimedia.org/T355010 I note that the author/creator of IABot is a subscriber to the ticket, so has almost certainly been notified or seen the problem report. Marshelec (talk) 19:37, 14 January 2024 (UTC)
This section was archived on a request by: Harej (talk) 18:28, 3 April 2024 (UTC)

The bot marked the same link as dead twice.

See edits #1 and #2. The link works when I check it. Hubba (talk) 01:54, 23 January 2024 (UTC)

Hubba, this appears to be due to geo-restriction interfering with our U.S.-based link checker. I have added that domain (and the root domain and www. variant) to our permalive list. Harej (talk) 21:01, 27 March 2024 (UTC)
This section was archived on a request by: Harej (talk) 18:28, 3 April 2024 (UTC)

Adding strange, non-functional archive links to en:2024 Haneda Airport runway collision

Please see this diff. I'm not sure what's going on, but InternetArchiveBot keeps adding incorrect archive links pointing to a googleads.g.doubleclick.net page that doesn't seem to exist rather than to the kyodonews.net link that's actually present in the reference. (It's also edit-warring with Citation Bot, which correctly removes the bad archive links.) This appears to be a bot problem rather than an Internet Archive problem, as the proper link does exist in the Internet Archive. Jay8g (talk) 00:20, 7 January 2024 (UTC)

Jay8g, this should now be resolved. Please let us know if it happens again. Harej (talk) 20:11, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

Useless bot edits

Tracked in Phabricator:
Task T361746

Hi! What is the point of these two changes?

Ideawipik (talk) 02:02, 25 January 2024 (UTC)

Hi, I was wondering the same thing, namely why the bot keeps replacing .is links with .today ones, even when the only links that work are the .is ones.
I've corrected the same page twice now, so I was wondering how to make it stop. Astubudustu (talk) 10:39, 16 March 2024 (UTC)
Ideawipik, Astubudustu, while "archive.today" is the standard domain and we tend to standardize on it, you are right that if this is the only content of the edit, the edit should not be made. I have prepared a bug report. Harej (talk) 20:21, 3 April 2024 (UTC)
Thank you so much! Astubudustu (talk) 20:54, 3 April 2024 (UTC)
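The rule described in the reply above (do not make an edit whose only content is a domain swap) can be sketched in Python as follows. This is a hypothetical illustration, not the bot's implementation: the function name, the alias set (limited to the three domains named in these threads), and the example snapshot URLs are all assumptions.

```python
from urllib.parse import urlsplit

# Host names mentioned in these threads as aliases of the same service.
# This set is an assumption for illustration, not the bot's configuration.
ARCHIVE_TODAY_ALIASES = {"archive.today", "archive.is", "archive.ph"}


def only_alias_changed(old_url: str, new_url: str) -> bool:
    """Return True if the two URLs differ only in which alias host they use."""
    old, new = urlsplit(old_url), urlsplit(new_url)
    old_host = (old.hostname or "").lower()
    new_host = (new.hostname or "").lower()
    if old_host == new_host:
        return False  # host unchanged, so this is not an alias swap
    if old_host not in ARCHIVE_TODAY_ALIASES or new_host not in ARCHIVE_TODAY_ALIASES:
        return False  # at least one host is not a known alias
    # Everything except the host must be identical.
    return (old.scheme, old.path, old.query, old.fragment) == (
        new.scheme, new.path, new.query, new.fragment)


# Made-up snapshot paths, used only to exercise the function.
print(only_alias_changed("https://archive.is/abc12", "https://archive.today/abc12"))
print(only_alias_changed("https://archive.is/abc12", "https://archive.today/xyz99"))
```

An edit whose every URL change satisfies this check would be purely cosmetic under the rule above and could be skipped.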
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

cbignore

Why didn't cbignore work? Proeksad (talk) 20:20, 28 January 2024 (UTC)

Proeksad, for whatever reason the "Cbignore" template was not configured as a setting for Russian Wikipedia. This setting has now been changed. Harej (talk) 20:26, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

Archive.ph → Archive.today

https://nl.wikipedia.org/w/index.php?title=Patreon&diff=next&oldid=66920330

and

https://nl.wikipedia.org/w/index.php?title=Prog_(tijdschrift)&diff=next&oldid=66920158

But archive.ph is the same service and the link with ph works fine. This is again a clear violation of the Dutch version of the “if it ain't broke, don't fix it” guideline, just like the most recent time we spoke. Mondo (talk) 20:11, 1 February 2024 (UTC)

Mondo, bug report has been filed. Harej (talk) 20:29, 3 April 2024 (UTC)
Thank you. 🙂 Mondo (talk) 20:38, 3 April 2024 (UTC)
I replied in the Phab task with the technical reason: this is done for functional reasons, not cosmetic ones. archive.today is a special domain that is functionally more reliable than the other ones, and it is also the domain that the owners of archive.today requested we use on Wikipedia as a safeguard against potential future outages. -- GreenC (talk) 14:42, 4 April 2024 (UTC)
They can request whatever they want, but at least on the Dutch Wikipedia, changes at the request of owners are seen as an unwanted change and even without their request it's seen as an unwanted change, so something still needs to be done about it. Mondo (talk) 14:57, 4 April 2024 (UTC)
Besides, it looks like the bot doesn't even care for archive.today that much anyway, as it just changed an archive.today URL to archive.is: https://nl.wikipedia.org/w/index.php?diff=prev&oldid=67337586 (the second highlighted reference). I used IABot for this. Mondo (talk) 19:56, 7 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

The bot keeps adding an archive link where it isn't required.

The bot always tries to add this link, but it isn't needed. It has happened about 3 times, and I had to undo the change every time.

https://web.archive.org/web/20211012034604/https://incubator.wikimedia.org/w/index.php?hidebots=1&translations=filter&hidecategorization=1&hideWikibase=1&limit=50&days=3&title=Special%3ARecentChanges&testwiki=wp%2Fryu&urlversion=2 Patronus95 (talk) 12:51, 2 February 2024 (UTC)

Patronus95, where is this link being added? Harej (talk) 20:53, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

stalled out job?

https://iabot.wmcloud.org/index.php?page=viewjob&id=17011 I didn't notice that this job had stalled out 2 days ago. Akaibu (talk) 18:14, 7 February 2024 (UTC)

Akaibu, looks like it is now done. Sometimes it can take a while. Harej (talk) 20:54, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

Finlex.fi URLs aren't dead

Bot's edits: [1], [2], [3]. Some URLs it tagged as dead but are actually working: [4], [5], [6]. 85.76.13.79 18:33, 10 February 2024 (UTC)

The site has an "Are you human?" check box, and that is probably the cause. I set the domain to Subscription for now. This will stop the bot from changing links to dead. It also means that the bot won't be fixing dead links for this domain. -- GreenC (talk) 15:01, 17 March 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

urldatachangestate

Hi!

I'm translating InternetArchiveBot user interface into Hebrew, and I have a question.

The message urldatachangestate says "from <b>{{logfrom}}</b> to <b>{{logto}}</b>". I guess that "{{logfrom}}" and "{{logto}}" are something like "live", "dead", etc., but can you please explain more specifically what the possible values are?

And are they always in English, or can they be translated?

I'll update the documentation for translators after you reply.

Thanks! Amir E. Aharoni (talk) 03:00, 29 February 2024 (UTC)

Dead, Dying, Alive, Unknown, Subscription, Permadead, and Permalive are the statuses and yes, they are translatable. —CYBERPOWER (Chat) 21:09, 3 April 2024 (UTC)
Thanks! I updated the documentation accordingly. Amir E. Aharoni (talk) 15:07, 4 April 2024 (UTC)
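As an aid for translators, the substitution discussed in this thread can be sketched in Python. This is an assumption-laden illustration and not the interface code itself: the `render` function and the `labels` mapping are hypothetical, and the bracketed labels are placeholders rather than real translations. Only the message text and the seven status names come from the thread above.

```python
# Status names as listed in the reply above.
STATUSES = ("Dead", "Dying", "Alive", "Unknown",
            "Subscription", "Permadead", "Permalive")

# English text of the urldatachangestate message, as quoted in the question.
TEMPLATE = "from <b>{{logfrom}}</b> to <b>{{logto}}</b>"


def render(template: str, logfrom: str, logto: str, labels: dict[str, str]) -> str:
    """Fill both placeholders, using a translated label when one is provided."""
    for status in (logfrom, logto):
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
    return (template
            .replace("{{logfrom}}", labels.get(logfrom, logfrom))
            .replace("{{logto}}", labels.get(logto, logto)))


# With no translations supplied, the English status names are used as-is.
print(render(TEMPLATE, "Alive", "Dead", {}))
# Placeholder labels stand in for a translator's actual wording.
print(render(TEMPLATE, "Alive", "Dead", {"Alive": "[Alive]", "Dead": "[Dead]"}))
```

The point for a translator is that both the surrounding sentence and the inserted status names may need to agree grammatically in the target language.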
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

Issue with Billboard short links

I've run into this a bit when going through the url=value CS1 pages. So, this bot was just run on https://en.wikipedia.org/w/index.php?title=Tony_Martin_(British_singer). If you look at the comparison between 9 January 2024 and 2 March 2024 (04:13), you'll see that one of the changes made was to the shortened Billboard link used by previous editors. I'm fixing it with the long links, but it seems IABot wants to change the symbols used to shorten URLs on Wikipedia into the code used in URLs? I've been fixing these for a while, but they aren't the only issues I come across in the CS1 pages, so it's the first time I've noticed which bot is doing this particular function.

I thought someone should know. OIM20 (talk) 09:51, 2 March 2024 (UTC)

OIM20, thank you for letting us know. I have filed a bug report. Harej (talk) 21:13, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

Bot citing dead link on talk pages

On talk pages where the bot leaves a description of its edits (example), it links to a dead page where we are supposed to report errors. Ubh (talk) 15:36, 7 March 2024 (UTC)

The URL changed to https://iabot.wmcloud.org. Please don't report errors from 6+ year old edits. They are far too old to be meaningful in improving the bot.—CYBERPOWER (Chat) 21:14, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

IABot for Gagauz language

Can you please authorize me to use IABot on gagwiki (Gagauz Wikipedia)? I can currently use it for English (enwiki) and Russian (ruwiki), but not the Gagauz one.

When I try to use the bot on a gagwiki (Gagauz Wikipedia) page, I get "Permission error" and "The action you are trying to perform requires the analyzepage permission." and "This permission is obtainable with the following groups: basicuser, user, admin, root, bot".

My Wikipedia userpage is https://en.wikipedia.org/wiki/User:Maxim_Masiutin Maxim Masiutin (talk) 03:34, 10 March 2024 (UTC)

Maxim Masiutin, you need a minimum of ten edits on that wiki in order to use InternetArchiveBot. You currently have four. Harej (talk) 21:22, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

Bot (innocently) allowing itself to look rude and arrogant/condescending/entitled

In case this has already been fixed, I apologize for being behind.
I have no way of knowing whether it has, and/or have not found a place where I would have had.
I guess something might be in the docs, but it has not been obvious or easy to find for me, sorry.

tldr: This could IMHO be fixed without any fuss and for good with a flick of the wrist by just adding a few words at the start of the first paragraph of the bot's message, making it begin with "Internet Archive Bot [Link] here." /tldr

I came across a place (here: https://en.wikipedia.org/wiki/Talk:Aerospace_engineering, and in fact many more) where there is a section, created by this bot, titled "External links modified", followed by an IMHO appropriate greeting, "Hello fellow Wikipedians", followed by a number of very appropriate factual statements, BUT THEN followed by,

"When you have finished reviewing my changes, you may follow the instructions..."

It seems to me that for a reader who, to this point of reading the section (and onwards), is not aware (as content may well be read from top to bottom rather commonly) that they are reading a message generated by a bot, being told rather bluntly that

  • "When you have finished reviewing my changes",
    may appear to that reader to have been written by an author with a rather entitled personality and/or behavior, such as to assume that the reader "will" or "has to" review that author's changes, as though the author were (feeling) entitled to the reader doing so.
    It seems to me that this means running a risk of causing a casual reader to
      • be upset by what they may well perceive as "this kind of language and behaviour towards" [themselves and the "fellow Wikipedians"],
      • respond badly, such as
          • feeling treated condescendingly and/or
          • now feeling specifically disinclined to "review ... the changes"
    thus producing a disservice to
      • the objective of having the changes reviewed by a person
      • peace, quiet and style on WP
  • "... you may follow the instructions ..."
    It seems to me that this looks and feels like more of the same, and even more strongly so.
(I know the wording may sound innocent by itself, but it seems to me that it's the context that makes the difference.)
Remark: That part of the wording was not found on the page given above (seems something had been improved in the meantime) (but on a page I don't wish to link to.)

JFTR, that is for sure how I just felt when I had read that passage to that point without realizing I was reading something written by a Bot.

About followup (after a fix has been done) on older pages that still reflect the previous presence of the problem: Would it be historical misrepresentation or maybe just a nice idea to have the Bot occasionally fix (update) the wording it left there when it did, maybe in its free time :) ?

$02c FWIW, HTH -- 93.232.230.13 13:17, 13 March 2024 (UTC)

Thank you for the feedback. I'd like to note that the bot has largely stopped posting these messages, especially on English Wikipedia. Harej (talk) 21:31, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

Page size limit

Greetings. I read you plan to increase the limit on the single page tool. I need this to run on a page with about 800 links. When do you plan to increase the limit? SusanLesch (talk) 17:59, 20 March 2024 (UTC)

Well I figured out a workaround for now. I copied the article to a sandbox in parts, and ran the bot on the parts. SusanLesch (talk) 20:44, 20 March 2024 (UTC)
There shouldn't be a page size limit on the bot anymore. Are you getting an error? —CYBERPOWER (Chat) 21:33, 3 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

Encode subject lines of emails from InternetArchiveBot

When there are non-ASCII characters in an email subject line, the entire subject should be encoded as UTF-8 so that it will display properly for the recipient. I received email from InternetArchiveBot about a submission for the Turkish Wikipedia with "Subject: Bot iÅŸiniz 18485 tamamlandı!" and about one for the Italian Wikipedia with "Subject: La tua attività di bot 16464 è stata completata!" The corresponding text in the body of the message displayed properly, with all the diacritical marks where they should be: La tua attività di bot 16464 è stata completata! I use Gmail, so it's possible that Gmail is doing something wrong.

This page explains what to do: https://www.telemessage.com/developer/faq/how-do-i-encode-non-ascii-characters-in-an-email-subject-line/ and the service at https://www.sendblaster.com/utf8-email-subject-encoder/ will encode a subject line, one line at a time, so that "Subject: La tua attività di bot 16464 è stata completata!" would become Subject: =?UTF-8?B?TGEgdHVhIGF0dGl2aXTDoCBkaSBib3QgMTY0NjQgw6ggc3RhdGEgY29tcGxldGF0YSEg?= Eastmain (talk) 20:56, 30 March 2024 (UTC)

Thank you for your report Eastmain. I have filed a bug report. Harej (talk) 21:43, 3 April 2024 (UTC)
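The encoding requested in this report can be sketched with Python's standard `email.header` module. This is a minimal illustration under the assumption that the subject is available as a Unicode string; it says nothing about how InternetArchiveBot actually sends its mail. The subject text is the Italian one quoted above.

```python
from email.header import Header, decode_header, make_header

subject = "La tua attività di bot 16464 è stata completata!"

# Header.encode() emits RFC 2047 encoded-words, which contain only ASCII.
encoded = Header(subject, charset="utf-8").encode()

# Decoding the encoded-words recovers the original text.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded == subject)
```

The exact encoded form may differ from the one produced by the online encoder linked above (for example in the choice between Base64 and quoted-printable), but a conforming mail client should display either as the original subject.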
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)

False positives and reporting

The bot appears to mark https://ochem.eu/* pages as dead links. These are not dead: when I visit http://ochem.eu/article/99826, the page redirects and asks me to log in, but I can log in as a guest and get redirected back to the page I'm looking for. This elaborate double-redirection process may be blocking crawlers from the site and causing the false positives.

I would report this problem through the "report false positive" link, but that appears broken: it says I don't have the "reportfp" privilege, even though that should be available to all users.

Thanks, Bernanke's Crossbow (talk) 05:58, 2 April 2024 (UTC)

Bernanke's Crossbow, usually when this happens it's because of geo-restrictions affecting our link checker. However, I visited that website with a VPN and the site would not load then either. So the website appears to at least be inaccessible to much of the Internet. Harej (talk) 21:51, 3 April 2024 (UTC)
Ah. In fact, I just discovered it's even weirder than that: until today, I've only ever visited the site in Firefox's InPrivate mode. I just tried it without InPrivate, and it fails to load then too (but works fine in InPrivate still). They must be doing something very strange with cookies.
Thanks and sorry to have bothered you, Bernanke's Crossbow (talk) 22:24, 3 April 2024 (UTC)
I set the domain to Subscription so the bot will skip it. -- GreenC (talk) 14:23, 4 April 2024 (UTC)
This section was archived on a request by: Harej (talk) 20:21, 10 April 2024 (UTC)