FYI: Racking up files

It seems that there is some DB-type issue: linkwatcher is racking up link files and not processing many links. I have tried both a restart of linkwatcher and a reboot of the liwa3 instance, but the backlog keeps slowly increasing, with only occasional file loading. I also did a hard restart of COIBot (no reboot), just in case.

<sDrewth> !backlog
<COIBot> No reports waiting. On Wiki: 0 open XWiki reports and 30 open Local reports.
<COIBot> Syslogs: 20: - coibot: -1 secs. commander: 0 secs. diffreader: 14658 secs. linksaver: 8 secs. parser: 4093 secs. readbl: - script: -1 secs.
<LiWa3_2> Syslogs: 20: - diffreader: 14 secs. linkanalyser: 697 secs. linkparser: 2688 secs. linkreporter: 12365 secs. linkwatcher: 125 secs. output: - script: -1 secs.
<LiWa3_3> LW: 03 hours 26:08 minutes active; RC: last 1 sec. ago; Reading ~ 864 wikis; Queues: P1=222; P2=1687; P3=90 (101 / 399); A1=1111; A2=0 (1194 / 868); M=0 - In LinkWatcher edit-backlog: 4094 files (0 lines) and in analyser backlog: 8 files (0 lines).

 — billinghurst sDrewth 02:46, 7 March 2021 (UTC)
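The backlog figures in the LiWa3_3 status line above can be pulled out mechanically. A minimal Python sketch (hypothetical helper, not part of the actual Perl bots) that extracts the file and line counts from such a status line:

```python
import re

def backlog_counts(status_line):
    """Extract (files, lines) pairs from a LiWa3 status line.

    Matches fragments like 'backlog: 4094 files (0 lines)'.
    """
    return [(int(files), int(lines))
            for files, lines in re.findall(r'backlog: (\d+) files \((\d+) lines\)',
                                           status_line)]

line = ("In LinkWatcher edit-backlog: 4094 files (0 lines) "
        "and in analyser backlog: 8 files (0 lines).")
print(backlog_counts(line))  # → [(4094, 0), (8, 0)]
```

A watcher script could alert when the first number keeps growing between polls, which is exactly the symptom described above.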

@Billinghurst: syslog.linkparser throws strange errors. It seems that somewhere the system changed and it now misfires on regexes, which then results in mis-assigned edits. It looks like all edits are on 'Mw::' (as if it does not read the diffurl properly), and then it tries to read diffs from en.wikipedia that were actually made elsewhere. I don't understand yet what the issue is (whether DiffReader.pl misinterprets the feed, or linkwatcher.pl is doing something with the data). Note that things are also wrong in the backlog files, so it goes wrong between DiffReader.pl reading the diff from the feed and linkwatcher.pl storing it (i.e. before it hits LinkParser.pl). --Dirk Beetstra T C (en: U, T) 06:01, 7 March 2021 (UTC)
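For illustration only (DiffReader.pl is Perl and its internals are not shown here): the kind of sanity check that would catch a mis-read diffurl is parsing out the host before fetching, so diffs claiming to be from en.wikipedia but carrying another wiki's host can be rejected. A hedged Python sketch:

```python
from urllib.parse import urlparse, parse_qs

def parse_diff_url(url):
    """Split a MediaWiki diff URL into (host, diff id, oldid).

    Returns None for ids that are absent, so a caller can reject URLs
    whose host does not match the wiki the edit supposedly came from.
    """
    parts = urlparse(url)
    query = parse_qs(parts.query)
    return (parts.hostname,
            query.get('diff', [None])[0],
            query.get('oldid', [None])[0])

url = 'https://en.wikipedia.org/w/index.php?diff=1010&oldid=1009'
print(parse_diff_url(url))  # → ('en.wikipedia.org', '1010', '1009')
```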

@Billinghurst: is there an extremely slow or throttled server somewhere, so that the bots lag because they have to wait for info/a response? —Dirk Beetstra T C (en: U, T) 15:06, 7 March 2021 (UTC)

User:Sic19 thwocking WD with urls

I can see that User:Sic19 is racking up edits at WD adding official websites, up to about 10M edits (unsure how many are recent). I have undertaken wl add Sic19 * and I hope that is the right solution. If there is something better that I can do, then please let me know.  — billinghurst sDrewth 23:33, 13 March 2021 (UTC)

@Billinghurst: yes, I guess that is it. Flooders on WD are an issue. Not parsing them and not getting the data is not really an option either (it sets a record for official websites; if I have time I could program something to remove 'official site to subject' from the stats and become more precise). Crap ...
We should be whitelisting / do-not-count-ing more links on WD though. It would be great if such flooders could inform linkwatcher and COIBot beforehand ... —Dirk Beetstra T C (en: U, T) 05:50, 14 March 2021 (UTC)

Can COIBot/LinkReports be generated manually?

It sometimes happens that I'm not sure whether there is enough evidence to report a domain at en:Wikipedia talk:WikiProject Spam. It would be nice if I could generate a preview of the domain's LinkReport without it being saved. This would tell me if the domain is worth reporting. —Bruce1eetalk 09:12, 5 April 2021 (UTC)

@Bruce1ee: I have granted you access to request reports at user:COIBot/Poke. Or if you use IRC, then connect to Freenode and #wikimedia-external-links and you can request reports and run some analytics (Small Wiki Monitoring Team/IRC).  — billinghurst sDrewth 13:53, 5 April 2021 (UTC)
Thank you billinghurst, that will help a lot. I'll be using COIBot/Poke as I don't use IRC. —Bruce1eetalk 14:00, 5 April 2021 (UTC)
@Bruce1ee and Billinghurst: I was earlier today planning to check whether you had that capability.   Done. --Dirk Beetstra T C (en: U, T) 14:16, 5 April 2021 (UTC)
Thanks. —Bruce1eetalk 14:31, 5 April 2021 (UTC)

IRC/Migrating to Libera Chat

We are going to need to migrate COIBot and LiWa3.

Tell me what you would like me to do to assist. Happy for whatever drudgery tasks you need done. First question is do you want a phab ticket generated, or is coordination going to be here?  — billinghurst sDrewth 13:54, 21 May 2021 (UTC)

# freenode settings
freenodeserver=irc.freenode.net
freenodeserverport=8001

...

# freenode settings
freenodeserver=irc.libera.chat
freenodeserverport=8001
... IRC client at irc.libera.chat on ports 6665-6667 and 8000-8002 for plain-text connections, or ports 6697, 7000 and 7070 for TLS-encrypted connections.

Painful.

I’ll be around tomorrow to make a start; after that, very limited. I first want to run a backup of the current setup, then migrate linkwatcher (easiest), then COIBot and XLinkBot. The problem is going to be channel modes and users in the channels (but that needs a cleanup anyway). —Dirk Beetstra T C (en: U, T) 07:19, 22 May 2021 (UTC)
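Going by the port list quoted above, the minimal change to the settings file is swapping the server value, and, if the bots' IRC library supports TLS (an assumption; the current config uses a plain-text port), moving to a TLS port as well. A sketch, keeping the historical 'freenode' key names:

```
# libera settings (keys keep the old 'freenode' names)
freenodeserver=irc.libera.chat
freenodeserverport=6697   # TLS; use 6665-6667 or 8000-8002 for plain text
```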

The IRC stuff I can do, especially in association with the CVN group. I have already started on a couple. Create a list of your hierarchy and I can work with the group coordinator and the CVN network to get things in place. Happy to be the legs.  — billinghurst sDrewth 11:12, 22 May 2021 (UTC)

@Billinghurst: See User:Beetstra/LiberaMove. --Dirk Beetstra T C (en: U, T) 07:02, 23 May 2021 (UTC)

Encoding error

User:COIBot/LinkReports currently shows "12:22:51, Tue Jun 01, 2021 - highlevelsound.blogspot.de - XWiki link additions by КиноФан2021". The linked page shows "КиноФан2021" encoded correctly. Any way you could fix that? User:1234qwer1234qwer4 (talk) 17:48, 3 June 2021 (UTC)

@1234qwer1234qwer4: yes, I am aware of that problem. Encoding and decoding is sometimes tricky in Perl, and it appears that there is a double loop (or even worse) somewhere. Sometimes you also see 2 users in the list of users where one is the encoded/decoded version of the other. It is on my list (User:COIBot/Wishlist, item 9 basically); I will also put this one on the list to re-parse and see if I can resolve it. I have some time at the end of this month to do so.
The thing that is never mangled is the diff-link, so in case of doubt that is the one to use. --Dirk Beetstra T C (en: U, T) 05:27, 6 June 2021 (UTC)
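The "double loop" described above is consistent with the classic UTF-8-read-as-Latin-1 double-encoding bug. A small Python sketch (illustrative only; the bots themselves are Perl, where Encode's encode/decode have the same pitfall) showing how the mangling arises and why one round of it is reversible:

```python
def mangle(text):
    """Mis-decode UTF-8 bytes as Latin-1: one round of double encoding."""
    return text.encode('utf-8').decode('latin-1')

def unmangle(text):
    """Reverse one round: re-encode as Latin-1, then decode as UTF-8."""
    return text.encode('latin-1').decode('utf-8')

name = 'КиноФан2021'
broken = mangle(name)   # mojibake (starts with 'Ð')
assert unmangle(broken) == name
print(broken != name)   # → True
```

Because Latin-1 maps every byte value to a character, the round trip is lossless, which is why a re-parse of the stored reports can in principle recover the original usernames.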

Category:Local COIBot Reports

This seems redundant with Category:COIBot Local Reports. Should they be merged? User:1234qwer1234qwer4 (talk) 10:02, 10 June 2021 (UTC)

@1234qwer1234qwer4: Looks like it. Category:Local_COIBot_Reports only seems to contain reports from 2013, 2014 and 2015: just a set of forgotten reports. Yes, they can be merged, and then we will have to monitor for a bit whether COIBot starts adding anything back. --Dirk Beetstra T C (en: U, T) 10:16, 10 June 2021 (UTC)
I have redirected the category and moved its contents with the flood flag. User:1234qwer1234qwer4 (talk) 13:51, 10 June 2021 (UTC)
@1234qwer1234qwer4: thanks, let's keep an eye on whether COIBot refills it. —Dirk Beetstra T C (en: U, T) 11:35, 11 June 2021 (UTC)
Huh? User:1234qwer1234qwer4 (talk) 22:15, 12 June 2021 (UTC)
That is what I meant. See User:COIBot/Wishlist (I will go through the code sometime next week). Dirk Beetstra T C (en: U, T) 05:45, 13 June 2021 (UTC)
@1234qwer1234qwer4: Actually, the diff that you cited above suggests that the bot mistakenly takes the en.wikipedia settings from en:user:COIBot/Settings and applies them here, instead of using user:COIBot/Settings. Maybe it is a failed settings load that is not caught by the bot. I recall seeing a similar error on en.wikipedia, where COIBot did not use the path 'user:COIBot/LinkReports/' (to save User:COIBot/LinkReports/example.com) but instead used the en.wikipedia path 'wikipedia:WikiProject Spam/LinkReports/' to save Wikipedia:WikiProject Spam/LinkReports/example.com on meta (which the MediaWiki software, funnily enough, recognizes as 'save WikiProject Spam/LinkReports/example.com on en.wikipedia').
I will try to check for failed settings loading, probably by setting a fake parameter on each. Dirk Beetstra T C (en: U, T) 08:34, 15 June 2021 (UTC)
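The save-path confusion described above comes from MediaWiki treating a leading 'wikipedia:' as an interwiki prefix when the page is saved on meta. A hedged Python sketch (the prefix table here is a tiny hypothetical subset, not Meta's real interwiki map) of the kind of pre-save check a bot could run:

```python
# Tiny illustrative subset of an interwiki map; the real table is far larger.
INTERWIKI_PREFIXES = {'wikipedia', 'w', 'meta', 'm'}

def interwiki_target(title):
    """Return the interwiki prefix if this title would save on another wiki,
    or None if it is a plain local title (e.g. a User: subpage)."""
    prefix, sep, _rest = title.partition(':')
    if sep and prefix.strip().lower() in INTERWIKI_PREFIXES:
        return prefix.strip().lower()
    return None

print(interwiki_target('wikipedia:WikiProject Spam/LinkReports/example.com'))  # → wikipedia
print(interwiki_target('User:COIBot/LinkReports/example.com'))                 # → None
```

Refusing to save (or at least logging) when `interwiki_target` returns a prefix would surface the failed settings load immediately instead of silently writing reports to the wrong wiki.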