Talk:Spam blacklist/Archives/2019-12


Proposed additions

  This section is for completed requests that a website be blacklisted.

URL shorteners

Hi. I found this bunch of shorteners in the ruwiki blacklist; I think they should be added to the global one. Чоч.рф was added as \b\xD1\x87\xD0\xBE\xD1\x87.\xD1\x80\xD1\x84 and \bxn--n1arb\.xn--p1ai. Track13 0_o 09:15, 1 December 2019 (UTC)

@Track13:   Added to Spam blacklist. Thanks. -- — billinghurst sDrewth 11:45, 1 December 2019 (UTC)
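
The two entries quoted in the request appear to be two encodings of the same hostname: the first spells чоч.рф as escaped UTF-8 bytes, the second is its Punycode (IDNA) form. A minimal Python sketch to check that correspondence, using only the standard library. The variable names are illustrative, and the reason both forms are listed (links may be written either way) is an assumption, not something stated in the thread.

```python
# Hostname quoted in the request.
host = "чоч.рф"

# UTF-8 bytes of the hostname; these correspond to the \xD1\x87... escapes
# in the first blacklist entry.
utf8_bytes = host.encode("utf-8")

# ASCII-compatible (Punycode) form, produced by Python's built-in "idna" codec.
ascii_host = host.encode("idna").decode("ascii")

print(utf8_bytes)
print(ascii_host)
```

Comparing the printed values with the two entries shows that each one covers a different spelling of the same domain.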

dewabet888.asia



Spam-source. -- Tegel (Talk) 16:39, 2 December 2019 (UTC)

@Tegel:   Added to Spam blacklist. --Tegel (Talk) 16:40, 2 December 2019 (UTC)

gameonline303.com



Spam-source. -- Tegel (Talk) 16:44, 2 December 2019 (UTC)

@Tegel:   Added to Spam blacklist. --Tegel (Talk) 16:44, 2 December 2019 (UTC)

thepavementlightcompany.com



(initially posted to en:User talk:XLinkBot/RevertList#thepavementlightcompany.com, then en:MediaWiki talk:Spam-blacklist#thepavementlightcompany.com; transferred here at helpful suggestion of Dirk Beetstra)
Links within this domain are being spammed to en:Pavement light (and have been spammed to the simple-en version, and some other en-wiki pages) by a narrowish range of IPs, I assume just one person. The person says that they are a manufacturer of pavement lights, and that they are linking to their own website. They claim that they wrote most of the article text[1] (they didn't; I and another registered editor did, although they did copy a small section of text from their website, which in turn looks as if it may have been copied from another website). They call other editors "competitors"[2] and they have removed links to useful information on the sites of competing firms.[3] While they have been particularly bad in the past six months (possibly because an inexperienced editor initially tried to revert them manually, accidentally overlooking and leaving some of their spam in the article), this problem started two years ago,[4] at which time I warned them.[5] They then apparently created an account and repeated the edits, adding the link to several other articles as well;[6] I warned the account,[7] and they abandoned the account without responding.[8] Since then, they seem to have returned to IP editing, and two other people have warned them for COI[9] and promotional editing.[10] No response, no effect.[11] COI bot suggests that they also created another account, and abandoned it after its userpage, which contained the link, was deleted as spam. This may be partly my own fault, as I did initially add one link to this URL, although I've now removed it in favour of the site it looks like it copied from; I should have checked more carefully earlier... HLHJ (talk) 20:16, 7 December 2019 (UTC)
@Beetstra:   Added to Spam blacklist. They have been warned enough, and there is no reason to believe that they will stay within this set of wikis. --Dirk Beetstra T C (en: U, T) 03:53, 8 December 2019 (UTC)
Thank you. HLHJ (talk) 20:12, 8 December 2019 (UTC)

Pavement light spammer again





The spammer of thepavementlightcompany.com has reacted to their blacklisting by registering some more domains. Please block the lot. They are still on a dynamic IP; not sure what else can be done. HLHJ (talk) 22:14, 21 December 2019 (UTC)

@HLHJ:   Added to Spam blacklist as \bthepavementlight -- — billinghurst sDrewth 23:14, 21 December 2019 (UTC)
Thank you, billinghurst. I didn't know one could blacklist regexes; thank you for the link, I now understand the system better. It is to be hoped that pavement lights will never darken this page again... HLHJ (talk) 01:07, 23 December 2019 (UTC)
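
As a rough illustration of why a stem such as \bthepavementlight covers the newly registered domains without listing each one, here is a minimal Python sketch. It assumes the fragment is simply searched for inside each URL, case-insensitively; the real extension's matching may differ in detail. Only thepavementlightcompany.com comes from this thread; the other URLs and the function name `is_blocked` are hypothetical.

```python
import re

# Blacklist fragment quoted in the thread.
PATTERN = re.compile(r"\bthepavementlight", re.IGNORECASE)

def is_blocked(url: str) -> bool:
    """Return True if the fragment matches anywhere in the URL."""
    return PATTERN.search(url) is not None

# Only the first domain appears in the thread; the rest are made up.
urls = [
    "http://thepavementlightcompany.com/",
    "http://www.thepavementlight-example.test/page",
    "http://example.org/pavement-light",
]
for url in urls:
    print(url, is_blocked(url))
```

Because of the leading \b, the stem only matches where "thepavementlight" starts a word, so any domain beginning with that stem is caught regardless of its suffix.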

Proposed removals

  This section is for archiving proposals that a website be unlisted.

toevolution.com



I want to add a reference to https://en.wikipedia.org/wiki/The_Water_Dancer using the site toevolution.com, but it appears to be a banned site. Please cancel the ban so that the reference can be added, as this will benefit the users. — The preceding unsigned comment was added by 41.37.79.129 (talk)

  not globally blacklisted; this is locally blacklisted at English Wikipedia, you will need to talk to them about the potential for its removal.  — billinghurst sDrewth 19:49, 27 December 2019 (UTC)

calculate-linux.org



Necessary for adding official links to https://en.wikipedia.org/wiki/Calculate_Linux. There is an old discussion on https://en.wikipedia.org/wiki/Talk:Calculate_Linux about requesting the removal. Paul Zeiger (UU) (talk) 17:11, 27 December 2019 (UTC)

@Paul Zeiger (UU):   Removed from Spam blacklist. It has been blacklisted since 2008; you present a valid use case, so let's try and see. It will be reblacklisted if spammed again. Martin Urbanec (talk) 17:59, 27 December 2019 (UTC)

Troubleshooting and problems

  This section is for archiving Troubleshooting and problems.

Discussion

  This section is for archiving Discussions.

handles4u and related

links

users

This will likely expand when the reports get in. --Dirk Beetstra T C (en: U, T) 13:06, 16 December 2019 (UTC)

Help needed!

I use hosted wiki software that uses this spam list. How can I put some links in a whitelist, with admin privileges? — The preceding unsigned comment was added by 87.91.51.235 (talk)

Use your page mediawiki:spam-whitelist  — billinghurst sDrewth 03:59, 17 December 2019 (UTC)

shorturl.at



URL shortener used to add spam. — JJMC89(T·C) 20:59, 30 December 2019 (UTC)

@JJMC89: Where are you seeing it used? If you have a look at the XWiki report, you will see that I blacklisted the domain in 2018. I will get COIBot to kick a new report to see if it can also assist.  — billinghurst sDrewth 22:08, 30 December 2019 (UTC)
  Already done. My bad, I misread the diffs. It was being added without the protocol. — JJMC89(T·C) 22:33, 30 December 2019 (UTC)

bet365.com



Website of a notable company with 15 sitelinks in Wikidata.[12] It was added to the blacklist in 2007; the log links to a diff to a page that has no spam blacklist log entries since the log started six years ago. Peter James (talk) 09:46, 31 December 2019 (UTC)

A pure betting website with zero encyclopedic information. The site has used referral link schemes in the past, see this spam diff (and probably still does). It has also made news for false advertising and other questionable business ethics (see en-Wiki article). ==> No possible value + high risk of misuse = I am strongly opposed to removing such a site from the blacklist. GermanJoe (talk) 14:56, 31 December 2019 (UTC)
So you think official websites shouldn't be linked in articles or in Wikidata? The "official website" template was once nominated for deletion ([13]), and there was strong consensus to keep it. And if you think companies you dislike shouldn't have their sites linked, that's probably incompatible with NPOV. As for "high risk of misuse", all you can find is another edit by the same person at the same time as the diff I linked, twelve years ago (and on another article that hasn't had much spam - the log for that article only shows two attempts to add a Twitter link in 2017). Peter James (talk) 15:13, 31 December 2019 (UTC)
There are several options such as local whitelisting of an about page or simply adding the site information as raw unlinked text to show the official website in main articles on project level. GermanJoe (talk) 21:25, 31 December 2019 (UTC)
@Peter James and GermanJoe: Yes, WD SHOULD have the data. However, that will disturb all wikis that use the data from WD onto their pages (up to 800+ mediawiki wikis). As argued elsewhere on this page whitelisting/excluding the top domain (here) is, IMHO, a very bad idea. Whitelisting on WD is a bit less of a bad idea (as it enables linking on WD and hence would block any other wiki that uses WD, and enables the same technical possibility to evade the whole spam-blacklist completely). I do not see at this time any reasonable solution except for a technical change in the MediaWiki software through a high-priority phab ticket. Sorry. --Dirk Beetstra T C (en: U, T) 11:20, 14 January 2020 (UTC)
The solution is in the guidelines at the top of the spam blacklist and its talk page: "Only blacklist for widespread, unmanageable spam" (with the exceptions mentioned) and "we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects". Peter James (talk) 13:53, 14 January 2020 (UTC)
Yes, it was blacklisted because it was widespread (several wikis), unmanageable spam (by several IPs and editors). This was used in affiliate spamming and no, spamming does not stop because it was years ago. This is only in support of one page on every wiki, which each wiki should decide on their own by local whitelisting. Allowing this link in through de-listing allows back the affiliate spamming, which is not in support of our projects.
As for having it on WikiData (where it indeed belongs): that needs a separate solution as explained in the threads below. --Dirk Beetstra T C (en: U, T) 14:04, 14 January 2020 (UTC)
Then a regex for affiliate URLs (if they still exist) can be blacklisted; it looks like there is something similar for Amazon. Peter James (talk) 14:33, 14 January 2020 (UTC)
Sigh. I am giving up. There is a plethora of problems with excluding the main domain from blacklisting, and as the logs show there are cases where the domain itself was questionably used as well, and when allowed it allows back malicious use (not explained per en:WP:BEANS, though the techniques were already abused and they can also now already be abused with another layer of difficulty - and those things will be abused as it puts money directly in the pocket of the abuser - and yes, the affiliate program still exists). That stands in contrast with the fact that this link is only useful for 1 page on maybe 200-300 wikis (by your admission, only 15 ..), not thousands of pages per wiki (like Amazon). That is exactly what local whitelisting is for. I am sorry, this is not convincing me of possible widespread use worthy of delisting; being a betting site with an affiliate program, there is reason to expect further abuse. If many local wikis think it is of sufficient general use (as Billinghurst indicates below) we may change our minds (also because that will help to show that the risk is actually relatively small). --Dirk Beetstra T C (en: U, T) 05:30, 15 January 2020 (UTC)

  Comment: generally, the process that the wikis look to for high risk sites is a local whitelisting of the respective /about pages, so that those urls can be added as required while limiting the possibility of abuse. The subject matter is covered in the below discussion about vid.me.  — billinghurst sDrewth 22:27, 31 December 2019 (UTC)

There's no evidence that it is a high risk site, only that one person, probably not involved with the company, added spam links twelve years ago. For one or two sites local whitelisting may be reasonable, but there are 15, and also Wikidata where it isn't possible to add unlinked text as an official website. Peter James (talk) 23:18, 31 December 2019 (UTC)
I am not certain that you can say it is or it isn't a high risk site without someone scanning the whole of the spam blacklist logs, rather than just those for a specific article. Our recommended process for removing sites from the blacklist is to suggest whitelisting at a wiki first and see how it progresses. Ask at w:mediawiki talk:spam-whitelist and see how you go. [Noting that this is a consensus-based discussion forum, so you can point to this discussion at any whitelist conversation to see if that community has an opinion of the blacklisting anyway.]  — billinghurst sDrewth 11:33, 1 January 2020 (UTC)

d:Q78682705



Because of the \bvid\.me\b entry, this site can't be the P856 (official website) value of this Wikidata item. How to resolve this? --Liuxinyu970226 (talk) 01:59, 17 December 2019 (UTC)

Also, if in the future there are many more items about websites listed on this page, then normal users will not be able to add P856 values normally. Can we request such additions by posting on this page, or should Wikidata be exempted from application of the global spam blacklist? --Liuxinyu970226 (talk) 02:06, 17 December 2019 (UTC)
Bit dot ly, TinyURL are likely. --Liuxinyu970226 (talk) 02:10, 17 December 2019 (UTC)
WD is always welcome to utilise their local whitelist to exempt any domain that it chooses, though my understanding is that it isn't that simple, as the further usage at the WPs is not possible due to the blacklisting. There has been that discussion here—which will be in the archives—where Beetstra has propounded on this and I will let Beetstra better express his points rather than do my poor man's reproduction.

If you believe that the policy of blacklisting url shorteners is incorrect, then it is worthwhile raising that matter through a well-structured RFC, as the policy pre-exists WD, identifying all the aspects of the matter from its collection to its use, and how you expect to deal with spam or abusable urls.  — billinghurst sDrewth 03:39, 17 December 2019 (UTC)

@Billinghurst and Liuxinyu970226: Be VERY careful with this. IF you whitelist, say, \btinyurl\.com\b then that allows for not only the domain link as official homepage on the WikiData item for Tinyurl, but also for a lot of tinyurls everywhere throughout wikidata (there is no reason why the globally blacklisted 'myspammycompany.com' for the wikidata item for MySpammyCompany then cannot have a tinyurl redirect to myspammycompany.com). That gives a plethora of problems down the line: 'tinyurl.com' will be transcluded through external links templates, so a page on en.wikipedia suddenly has a link to tinyurl.com. Since it is not whitelisted on the local wiki, it will hence result in a spam filter block on that local wiki on the next edit. It also results in a spam block for anyone who wants to add that transcluded domain through one of the templates upon adding (in other words, they cannot add that through WD transclusion). If tinyurl.com is blanket whitelisted and starts appearing also on other items on wikidata, it may also result in spam blocks on edits on other pages. Please, do NOT do this. --Dirk Beetstra T C (en: U, T) 06:08, 17 December 2019 (UTC)
@Billinghurst: .. this is not only for url shorteners .. it goes for anything that is blacklisted. Redtube.com will have the same problem, whitelist that and you can just wait for tech-savvy high school vandals to add that as their school's official homepage on wikidata and have it transcluded on hundreds of Wikis at once. --Dirk Beetstra T C (en: U, T) 06:10, 17 December 2019 (UTC)
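
The concern about blanket whitelisting can be sketched in a few lines of Python. This is a simplified model, assuming a URL is rejected when it matches the blacklist and no whitelist entry also matches it; it is not the extension's actual code. The short link, the /about landing page, and the names `is_blocked`, `broad` and `narrow` are all hypothetical.

```python
import re

# Blacklist entry discussed in the thread.
BLACKLIST = re.compile(r"\btinyurl\.com\b", re.IGNORECASE)

def is_blocked(url, whitelist=None):
    """Blocked if blacklisted and not exempted by the given whitelist entry."""
    if whitelist is not None and whitelist.search(url):
        return False
    return BLACKLIST.search(url) is not None

# Blanket whitelist of the whole domain, as warned against in the thread.
broad = re.compile(r"\btinyurl\.com\b", re.IGNORECASE)
# Narrow whitelist of a single (hypothetical) neutral landing page.
narrow = re.compile(r"\btinyurl\.com/about/?$", re.IGNORECASE)

home = "https://tinyurl.com/"
redirect = "https://tinyurl.com/abc123"  # hypothetical short link
about = "https://tinyurl.com/about"      # hypothetical landing page

for label, wl in [("none", None), ("broad", broad), ("narrow", narrow)]:
    print(label, [is_blocked(u, wl) for u in (home, redirect, about)])
```

Under this model the broad entry exempts every redirect on the domain, while the narrow entry exempts only the one landing page and leaves the redirects blocked.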
Sure, I know that. It isn't my job to explain folly to them. I was presuming that they were going to whitelist, add the url, then remove, as they would normally only be asking for the official webpage.

The bigger story is about the impact and consequences of having added links at WD, and then trying to utilise them at other WMF wikis when they are still blacklisted globally, or locally, and the consequences in editing.  — billinghurst sDrewth 10:54, 17 December 2019 (UTC)

Well, I had to explain it to them quite some time ago once, when they whitelisted something on WD and someone on en.w came complaining they couldn't edit. What en.w locally does is to whitelist the /about page - that is generally a neutral landing page and not the top page (which is often the reason something got blacklisted - pornhub.com is blacklisted because students tend to replace their school website with it), and it is more difficult to 'abuse' (whitelisting tinyurl.com's homepage also allows tinyurl's redirects). But in any case, whether pornhub.com/about or pornhub.com is locally whitelisted on WD, it will impact editing on all wikis that try to transclude the globally blacklisted page, as pages cannot be edited.
What COULD be considered is that our blacklist rule is exempting a neutral landing page (a /about) on each site that we blacklist. --Dirk Beetstra T C (en: U, T) 10:57, 18 December 2019 (UTC)
In Wikipedia they could still be added without linking to the URL. Probably better to use edit filters to block edits such as that, which are vandalism rather than spam. Peter James (talk) 09:59, 31 December 2019 (UTC)
@Peter James: What could still be added in Wikipedia? If WD has whitelisted 'http://example.com', then how can Wikipedia use that without linking to it? They would have to <nowiki> it on each of the wikis? And no, these are not vandalism; you want to link to the official website of a subject, exactly the data that WikiData has. --Dirk Beetstra T C (en: U, T) 12:00, 14 January 2020 (UTC)
Just to give an example: try to add {{official website}} to en:Cloud mining. You cannot; that will use the data from WD, and that link is globally blacklisted and hence you cannot save the en.wikipedia page. Alternatively, remove the link on WD, edit the page on en.wikipedia (now you can), rollback your edit on WD, and try to do a totally unrelated edit on en.wikipedia on that page. Again, you cannot, as your subsequent edit on en.wikipedia will be recorded as 'adding' the link it uses from WD (which you did not add in that edit) and that is disallowed. --Dirk Beetstra T C (en: U, T) 12:08, 14 January 2020 (UTC)
There are already blacklists for specific types of URLs without blocking the entire Google and Amazon websites - could something similar be done here for URLs such as tinyurl.com, possibly using a regex for one or more characters after the domain name? Peter James (talk) 09:59, 31 December 2019 (UTC)
Also the spam blacklist doesn't prevent addition of blacklisted links, it only restricts editing of pages that contain them, so for example I couldn't undo this edit. Peter James (talk) 10:36, 31 December 2019 (UTC)
I beg to differ, I think that you have that back to front. An undo from no links to links is the addition of links. It is looking for "added_links".  — billinghurst sDrewth 11:43, 1 January 2020 (UTC)
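
The "added_links" point can be made concrete with a small Python sketch. It assumes the check compares the external links in the wikitext before and after an edit and tests only the newly added ones; it does not model links pulled in through transclusion from Wikidata, which the thread describes as behaving differently. The function names and the link-extraction regex are illustrative, not taken from the extension.

```python
import re

# Blacklist entry quoted in this thread.
BLACKLIST = re.compile(r"\bvid\.me\b", re.IGNORECASE)
# Very rough URL finder, good enough for the illustration.
URL_RE = re.compile(r"https?://[^\s\]|}<>\"]+")

def extract_links(text):
    """Return the set of URLs that appear in the text."""
    return set(URL_RE.findall(text))

def edit_is_rejected(old_text, new_text):
    """Reject the edit only if it adds a blacklisted link."""
    added_links = extract_links(new_text) - extract_links(old_text)
    return any(BLACKLIST.search(url) for url in added_links)

with_link = "See https://vid.me/ for details."
without_link = "See the site for details."

# Undoing a removal restores the link, so it counts as an addition.
print(edit_is_rejected(without_link, with_link))
# Removing the link, or editing around an existing link, adds nothing.
print(edit_is_rejected(with_link, without_link))
print(edit_is_rejected(with_link, with_link + " More text."))
```

In this model an undo that restores a blacklisted link is rejected for the same reason as any other edit that adds it.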
@Billinghurst: I am just asking how the P856 values can be added for items whose topics are websites listed in this blacklist. If the answer is "no" or "not easy", then I will ask the Wikidata community to consider technically excluding application of the global spam blacklist, and to use only a local blacklist against abuse. --Liuxinyu970226 (talk) 07:14, 19 December 2019 (UTC)
@Liuxinyu970226: That will have disastrous effects on all wikis that use that data. --Dirk Beetstra T C (en: U, T) 07:40, 19 December 2019 (UTC)
Liuxinyu970226, there are currently existing external links on wikidata that are now blacklisted here on-wiki (these links were spammed to WD before they were blacklisted). There is now, on each of the hundreds of wikis, a page where you cannot add the official website by transclusion from WD (like e.g. en:template:Official website does when called without parameters). --Dirk Beetstra T C (en: U, T) 07:51, 19 December 2019 (UTC)
@Liuxinyu970226: It was a pretty naive question, and you were given a fulsome answer to try and cover the range of reasons that you may have been asking. I would think that this is a bigger question than just WD where the urls are used outside of WD. I would think that it may be something that all of the WMF community may have an interest in rather than just the technocrats/puritans at WD. As Beetstra said these were blacklisted as they were abused, not because they had the potential to be abused. If you take it to WD, I look forward to your holistic discussion, not something narrowly focused upon that the spam blacklist stops them being added.  — billinghurst sDrewth 09:15, 19 December 2019 (UTC)
@Billinghurst: see d:Wikidata:Administrators'_noticeboard#Local_spam_filter. From a WD perspective this all makes sense (though they would also get the real crap), but WD is however used by the majority (if not all) other wikis. --Dirk Beetstra T C (en: U, T) 09:39, 19 December 2019 (UTC)

To me, for these items the best option is still to exclude here on meta a neutral landing page. That solves a lot of problems throughout: it enables the WD item to have a representative link in their item that does not result in any problems on other Wikimedia projects (or when local projects want to use that link). That does still protect WD against edits like this and this (can someone tell me why a municipality in Germany needs a link to pornhub.com?). All other options are of a technological level that needs significant changes in the structure of the software that Wikipedia is running on. --Dirk Beetstra T C (en: U, T) 09:59, 19 December 2019 (UTC)

I think we should only permit that for URLs specifically on request, lest the spammers refashion their /about page for promotion. Vermont (talk) 11:11, 19 December 2019 (UTC)
  •   Comment: global whitelist. It seems to me that there is now the need for a global whitelist page. We know that there are dangerous domain names, even for famous sites. Asking every wikipedia to locally whitelist is now unreasonable, especially in light of WD and its methodologies.  — billinghurst sDrewth 22:30, 31 December 2019 (UTC)
    • What about negative lookahead? See the encyclopediadramatica\.(?:com(?!/Main_Page)|net|org|se) entry: encyclopediadramatica.com/Main_Page should work correctly, unlike the rest of the variants. \bgoo\.gl\b(?!/maps\b).* is a similar variation. --Martin Urbanec (talk) 22:35, 31 December 2019 (UTC)
      • Yes, though I think that it is a little harder to manage, and simple pastes of an acceptable url are just easier and more overt. We should only ever need the one, and on request.  — billinghurst sDrewth 00:29, 1 January 2020 (UTC)
        • @Martin Urbanec: ED is a particularly bad example, where the /about was whitelisted on en.wikipedia as the official website, only to see the abuse that landed this wiki on the blacklist extend to the only whitelisted page. Front pages are often not a good idea anyway: for ED that could be showing the content that we want to exclude, for others it is just the website that is being abused. --Dirk Beetstra T C (en: U, T) 12:55, 12 January 2020 (UTC)
          • It isn't being used as an example of when exceptions to the blacklist should be allowed, it's an example of what is possible with the blacklist - something similar could be used for sites such as tinyurl.com. Peter James (talk) 13:48, 14 January 2020 (UTC)
            @Peter James: It is my understanding that having a bare domain name is exploitable as people can build on it, or try to manipulate it. Over time we have seen all sorts of games from LTAs, determined vandals, spammers and spambot developers. If there is the means to whitelist to a safe landing url that is not exploitable, then I am all for it. If you are asking for simple stem urls even with negative lookahead then I think that is going to be problematic. We need something with resilience to abuse.  — billinghurst sDrewth 12:44, 16 January 2020 (UTC)
            It is not possible for any URL that matches the blacklist to be added directly unless it is whitelisted, so it's only exploitable in the same way that blacklisting an entire site is - the difference is in the inconvenience when adding a link that clearly should be added, or editing a page in which that link exists. Peter James (talk) 13:13, 16 January 2020 (UTC)
            @Peter James: People can build on the bare url as billinghurst is stating here (I will not go further per en:WP:BEANS). But the bigger problem is that the top domain is often the problem in the first place. Many organisations spam their websites in statements like 'Example.org is the best company in the world', 'find this product on example.org', 'vote for our petition '<name of petition>' on change.org' (the latter is even done without the domain 'go to change.org, find <this> petition and vote for us'.. people have a cause, then consider if people earn money with it); school students replace their school website with the top domain of big porn websites on a regular basis. Whitelisting example.org or the top domain of the porn sites on a wiki then just allows that abuse to continue. --Dirk Beetstra T C (en: U, T) 06:47, 23 January 2020 (UTC)
            If the regex is correct, "People can build on the bare url" is only true in the same way that "People can bypass the blacklist entirely" is - probably the only exception (to both) is in Wikidata properties with URL data type. Peter James (talk) 12:20, 23 January 2020 (UTC)
            "It is technically made impossible to link to blacklisted links" [14] Peter James (talk) 12:30, 23 January 2020 (UTC)
            @Peter James: Do you really want me to spell it out: you can build on the bare url. I'll email it to you. --Dirk Beetstra T C (en: U, T) 12:47, 23 January 2020 (UTC)
            Any link can probably be added the same way (although there may be places it does make a difference what is blacklisted). Peter James (talk) 14:55, 23 January 2020 (UTC)
            Without access to the base domain it is much harder. Nonetheless, the base domain itself is often the problem already. —Dirk Beetstra T C (en: U, T) 17:18, 23 January 2020 (UTC)
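
The two negative-lookahead entries quoted earlier in this thread can be tried out directly. The following Python sketch assumes each entry is searched for inside the URL, case-insensitively; Python's `re` module supports the same \b and (?!...) syntax used in these entries, but the extension's exact matching may differ. The sample paths other than /Main_Page and /maps, and the function name `is_blocked`, are hypothetical.

```python
import re

# Entries quoted in the thread.
ED = re.compile(r"encyclopediadramatica\.(?:com(?!/Main_Page)|net|org|se)", re.IGNORECASE)
GOOGL = re.compile(r"\bgoo\.gl\b(?!/maps\b).*", re.IGNORECASE)

def is_blocked(pattern, url):
    """Return True if the blacklist entry matches anywhere in the URL."""
    return pattern.search(url) is not None

samples = [
    (ED, "http://encyclopediadramatica.com/Main_Page"),   # exempted by the lookahead
    (ED, "http://encyclopediadramatica.com/Other_Page"),  # hypothetical path
    (ED, "http://encyclopediadramatica.net/Main_Page"),   # lookahead applies to .com only
    (GOOGL, "https://goo.gl/maps/abc"),                   # hypothetical maps link
    (GOOGL, "https://goo.gl/abc"),                        # hypothetical short link
]
for pattern, url in samples:
    print(url, is_blocked(pattern, url))
```

Note that the bare domain (with no path) still matches both entries, so under this model only the specific exempted path is let through.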