Talk:Spam blacklist/Archives/2020-01

Proposed additions

  This section is for completed requests that a website be blacklisted

Request for 3 domains









As per my request for the account locks here this is being used to spam some garbage site and is potentially a source for hoaxes. Praxidicae (talk) 15:56, 2 January 2020 (UTC)
@Praxidicae:   Added to Spam blacklist. --Martin Urbanec (talk) 16:02, 2 January 2020 (UTC)
@Martin Urbanec: I added another from this spam set that's being used for the same purpose. Running a report now. Please add this as well. Also pinging @Ohnoitsjamie: as this might be of interest to you...Praxidicae (talk) 16:27, 2 January 2020 (UTC)
@Praxidicae:   Added to Spam blacklist. --Martin Urbanec (talk) 16:29, 2 January 2020 (UTC)
Good catch. I blacklisted schooltips on en for repeat spamming obits. Ohnoitsjamie (talk) 16:36, 2 January 2020 (UTC)

Indian financial scheme spam





Request moved from English wikipedia SBL per request due to cross-wiki spam. See COIBot reports for pradhanmantri.info and pmagreement.in for both English and Hindi wikipedias. Ravensfire (talk) 03:33, 2 January 2020 (UTC)

@Ravensfire:   Added to Spam blacklist. -- — billinghurst sDrewth 10:30, 3 January 2020 (UTC)

donotlink.it



Redirection service. JzG (talk) 18:11, 4 January 2020 (UTC)

@JzG:   Added to Spam blacklist. --Martin Urbanec (talk) 18:15, 4 January 2020 (UTC)

lnkclik.com



Link shortener reported at enWP blacklist. Can't verify much about it, alas. My filters block it as malvertising. JzG (talk) 10:53, 5 January 2020 (UTC)

@JzG:   Added to Spam blacklist. It definitely seems to be a malicious site. It's also not a normal shortener, though. --Martin Urbanec (talk) 12:11, 5 January 2020 (UTC)

www.wikitechy.com






Cross-wiki spam. Tgeorgescu (talk) 15:15, 31 December 2019 (UTC)

It was a short burst of spamming from one user, so I am not certain that blacklisting is needed after a single short spell. I would prefer that we monitor rather than blacklist, and have set COIBot to do its monitoring and reporting.  — billinghurst sDrewth 10:24, 3 January 2020 (UTC)
@Tgeorgescu:   Declined at this time  — billinghurst sDrewth 03:02, 15 January 2020 (UTC)

www.thrillophilia.com



Please blacklist this link. It is being spammed on multiple articles on cities, mainly on the English and Simple English Wikipedia. Instances include this on the Dubai article on simplewiki, this on Dubai (emirate), again on simplewiki, and an attempt on the talk page on the Singapore article at enwiki. The edits were made by separate accounts, whose only edit was adding in the link and adding some random facts. Nigos (talk) 12:17, 23 January 2020 (UTC)

@Nigos:   Added to Spam blacklist. I just blacklisted this on en.wikipedia, realizing that it was wider than that. I did not, however, investigate the other wikis too much. Thanks for the explanations! --Dirk Beetstra T C (en: U, T) 13:13, 23 January 2020 (UTC)

429.kim



This (until now) involves 2 subdomains, one abused on WikiData to replace piratebay links (with extreme insistence ..), and one cross-wiki. At least one of the involved edits is suppressed on en.wikipedia, and it involves again link-replacements (hijacking of references) and other spamming. First requested on en.wikipedia here by user:Paul_012. --Dirk Beetstra T C (en: U, T) 10:50, 28 January 2020 (UTC)

@Paul 012:   Added to Spam blacklist. --Dirk Beetstra T C (en: U, T) 10:51, 28 January 2020 (UTC)

Cloud mining spam







Pushed to cloud mining. —Dirk Beetstra T C (en: U, T) 08:50, 31 January 2020 (UTC)

@Beetstra:   Added to Spam blacklist. --Dirk Beetstra T C (en: U, T) 08:50, 31 January 2020 (UTC)

Proposed removals

  This section is for archiving proposals that a website be unlisted.

thenexthint.com



This news site is approved by Google News. I think we want this type of site for reference links, but I saw that it is blacklisted. Please remove it from the blacklist, because anybody may want this type of site for references. —The preceding unsigned comment was added by Wikialinaparker (talk)

@Wikialinaparker:   Not globally blacklisted, so there is nothing for us to do here. It looks to be blacklisted at English Wikipedia.   Defer to w:en:Mediawiki talk:spam-blacklist  — billinghurst sDrewth 11:29, 16 January 2020 (UTC)

appen.com



Hello! I don't know the reason "Appen.com" is on the blacklist. I was translating the English article into Spanish, and I can't submit the translation because it contains a website name that is on the blacklist. How could we fix this? Regards! --Ryo567 (talk) 00:51, 15 January 2020 (UTC)

@Jon Kolbert: One of yours and no corresponding discussion.  — billinghurst sDrewth 03:07, 15 January 2020 (UTC)
Was used in multiple spam userpages, will remove from blacklist as there seems to be legitimate use for link. Jon Kolbert (talk) 03:44, 21 January 2020 (UTC)
@Ryo567:   Removed from Spam blacklist. Jon Kolbert (talk) 03:44, 21 January 2020 (UTC)
Great! Thanks a lot Jon Kolbert! :) --Ryo567 (talk) 09:10, 21 January 2020 (UTC)

Troubleshooting and problems

  This section is for archiving Troubleshooting and problems.

d:Q21980377 (Sci-Hub)



Need to add the new official URL, https://sci-hub.si, with the Recommended rank (other URLs need not be changed as well as their ranks). Sci-Hub is globally blacklisted for a good reason, but this is the element devoted to the website itself, so it needs this link, especially since other sci-hub.* URLs are already there. Perhaps the simplest way to do it is to de-blacklist it temporarily. --colt_browning (talk) 17:32, 10 January 2020 (UTC)

  Declined. You can whitelist it locally at Wikidata with mediawiki:Spam-whitelist. --Martin Urbanec (talk) 17:10, 11 January 2020 (UTC)
@Martin Urbanec: No, it should stay blacklisted. Otherwise, people will add links to Sci-Hub to the elements of scientific articles (in good faith) which is bad because Sci-Hub violates copyright (see blacklisting discussion). If there is a way to whitelist an URL for just a single element, please explain how to do it. Or, do you mean temporary whitelisting? --colt_browning (talk) 19:46, 11 January 2020 (UTC)
@Colt browning: temporary whitelisting is an option, but that will affect all wikis that use the wikidata item to display the official link (as it is then on that page, editing on each wiki will be problematic). This likely needs a phab ticket to really get to a proper solution, as any official website that is blacklisted runs into this problem. I don't think we can do anything really here, and would rather strongly suggest against doing something on WikiData due to the effects that will have. --Dirk Beetstra T C (en: U, T) 08:24, 12 January 2020 (UTC)
@Beetstra: Since other URLs in that Wikidata item are also affected by the blacklist, all wikis already cannot use Wikidata to display the official link, so adding the new URL won't harm them. Also, this Wikidata element is used by an external website whereisscihub.now.sh which is not affected by blacklisting, so there is perfect sense in updating the Wikidata element. With this in mind, maybe still consider temporary whitelisting/de-blacklisting? Anyway, I agree that it needs a proper solution and will prepare a Phabricator ticket. --colt_browning (talk) 09:29, 12 January 2020 (UTC)
@Colt browning: the point is, that the other URLs in that Wikidata item are causing problems on other wikis - they cannot be 'used' on other wikis. Complete de-listing causes a horde of problems (quite some material on that site should never be linked to), local whitelisting will solve the WD problem, but we still have/get the horde of other problems. This really needs another solution - either we need a global whitelist for /about pages so we can avoid this, or a completely different solution (e.g. a flag on WD that the data is there on WD and locally whitelisted, but cannot be 'pulled' onto other wikis). None of that is now there. --Dirk Beetstra T C (en: U, T) 10:55, 12 January 2020 (UTC)
@Colt browning and Beetstra: Maybe use negative lookahead in the blacklist to make sure it's not /about? Not sure what WD wants to link to, though. --Martin Urbanec (talk) 11:17, 12 January 2020 (UTC)
@Martin Urbanec: A good idea. In fact, if we globally whitelist links to front pages only (something like \/\/sci-hub\.\w*\/?$), it solves the issue completely. Sci-Hub is blacklisted because of the copyrighted content, so the link to the front page is harmless (and no one is going to spam it, I guess). --colt_browning (talk) 11:27, 12 January 2020 (UTC)
@Colt browning: No, it doesn't. The front page is often the source of the abuse/spam, so allowing only the front page as you propose enables the same rubbish that the blacklist is supposed to stop. My base example here is pornhub.com: the front page is abused on en.wikipedia a handful of times a day (example: https://en.wikipedia.org/wiki/Special:Log/spamblacklist/194.132.131.100 .. why does a Russian submarine need a link to the front page of pornhub? Or an Ohio school: https://en.wikipedia.org/wiki/Special:Log/spamblacklist/98.100.24.226). And the same goes for all the spam companies that are being blacklisted.
@Martin Urbanec: Yes, but only on request could we exclude with a negative lookahead (a recurring example where you don't want to do this as standard is Encyclopedia Dramatica, who will just abuse what you whitelist). But I agree, that is likely not what WD wants. For the Wikipedias you want a representative landing page, for WikiData you want the data ... --Dirk Beetstra T C (en: U, T) 12:40, 12 January 2020 (UTC)
@Beetstra: Well, Sci-Hub is not Pornhub. Its frontpage doesn't violate anything. It is blacklisted because it gives access to copyrighted content, and users add direct links to the copyrighted content on Sci-Hub. People add links to Pornhub because they think it's funny, it's just common vandalism. Also, there are lots of websites that are interested to have links from Wikipedia (e.g., some news websites), so they add links to particular pages or the frontpages, and this is common spam. But Sci-Hub is a completely different case. It's not spam, it's not vandalism, it's just people linking copyrighted content in good faith. --colt_browning (talk) 12:48, 12 January 2020 (UTC)
@Colt browning: people can also add links to websites to spam, pornhub is indeed a particular example but that does not mean that it does not happen with companies as well - sci-hub would not be the first company that replaces any (wikilink to) Sci-Hub with a link to their frontpage to spam. And if not, then Sci-Hub would be rather an exception than a rule. --Dirk Beetstra T C (en: U, T) 12:52, 12 January 2020 (UTC)
@Beetstra: Yes, it is an exception, and that's exactly what I'm talking about. Also, it's not a company. See w:Sci-Hub. I bet if you check the spam filter hits (I don't know whether it's possible) you will find only direct links to articles on Sci-Hub, not the kind of spamming you're talking about. --colt_browning (talk) 12:58, 12 January 2020 (UTC)
@Colt browning: you don't know how to check, but you bet that it will be only that. I know Sci-Hub is not a company per se, but even state-funded museums need to show that their website is efficient, and have been found spamming. Not being a company is not a reason not to spam.
Now I do agree that for Sci-Hub at the very least most of the 'abuse' is on direct linking to (likely) copyvio material. But I'd prefer a proper solution which is more general, and this is not it. (and I am still pondering whether it is possible to abuse this through some template magic). --Dirk Beetstra T C (en: U, T) 13:10, 12 January 2020 (UTC)
you don't know how to check, but you bet — that's the whole point: if you check my prediction and find it correct, that would convince you better. Scientific method at its best. Not being a company is not a reason not to spam — I've never said it is (although in this case it's not simply not a company or a state organization, it is a single person who is already banned at least on ruwiki anyway). I do agree — well, thanks. --colt_browning (talk) 13:20, 12 January 2020 (UTC)

┌─────────────────────────────────┘
@Colt browning: but that is the point, it is not my job to show that you are correct. It is on you to show that a site is not problematic, and I have given 2 reasons why we do not exclude top domains, and I hinted at a not disclosed reason that excluding top domains does allow for the technical possibility to abuse (though you need to know how to, but it is rather easy - people try to evade blacklists all the time and this opens one of the ways of doing this, and I recall having seen this trick before).

It does not matter, I used the word 'organisations' for a specific reason, meaning to cover a one-person organisation as well. The single-person owner of sci-hub did set it up to make a point, and other people who agree with that point (and even those who disagree with that point) can try to force the point. The text 'find <this petition> on change.org' or even 'type https://www.change.org in your browser and type <this petition> in the search box and vote' has been used to circumvent that spam blacklist block. People find their ways. There is no need to make that as easy as possible.

Do note that this global spam-blacklist is to protect 800+ mediawiki wikis and thousands of external wikis that chose to use this blacklist. Your mileage may vary but I would need very strong arguments why we have to change this spam-blacklist practice to enable just one of them. You are not convincing me that excluding the top domain is the best solution, but maybe other admins here can be convinced to do exactly that. It is just my advice that excluding top domains is, in my opinion, a very bad idea. This is in desperate need of a proper Phab ticket that solves the problem properly. We likely need a flag on WD that states that certain external links (or other data) on WD should not (i.e. NEVER) be used on any client wiki, as that would block editing the client page on all wikis that are clients. That would allow WD to have the data (through local whitelisting of some kind) while not disturbing editing on hundreds or thousands of wikis. --Dirk Beetstra T C (en: U, T) 11:12, 14 January 2020 (UTC)
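
Editorial illustration: the two regex ideas raised in this thread (a blacklist rule with a negative lookahead that spares /about, and the front-page-only pattern quoted by colt_browning) can be sketched in Python as below. This is only a sketch under assumptions: the lookahead rule and the sample URLs are hypothetical, and the real blacklist is evaluated by MediaWiki's own regex handling, which may differ in detail from Python's re module.

```python
import re

# Front-page-only pattern exactly as quoted in the thread.
front_page = re.compile(r"\/\/sci-hub\.\w*\/?$")

# Hypothetical blacklist rule that spares an /about page.
# The \b after \w+ prevents the engine from backtracking into the
# top-level domain just to slip past the lookahead.
block_except_about = re.compile(r"\bsci-hub\.\w+\b(?!/about/?$)")

# Sample URLs chosen for illustration only.
urls = [
    "https://sci-hub.si",
    "https://sci-hub.si/",
    "https://sci-hub.si/about",
    "https://sci-hub.si/10.1000/example",
]

# True where the front-page pattern would whitelist the URL.
front_page_hits = [front_page.search(u) is not None for u in urls]

# True where the lookahead rule would still block the URL.
still_blocked = [block_except_about.search(u) is not None for u in urls]

print(front_page_hits)
print(still_blocked)
```

Neither pattern addresses the objection made above that front pages and landing pages are themselves spam targets; the sketch only shows what each pattern would and would not match.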

Discussion

  This section is for archiving Discussions.

omeida.com



This is a very old entry added by Special:Diff/82686 (in 2004!) and we do not have discussion logs on why this was added. A search on Google suggests that we probably do not want to remove the entry, but can we at least add a \b so it does not match stuff like ayakomeida.com (the official site of a Japanese composer)? We have a request on Japanese Wikipedia to add this latter site to the local spam-whitelist, but given the age of the entry I feel it is more prudent to request adding \b instead of blindly adding a whitelist entry.--ネイ (talk) 15:19, 13 January 2020 (UTC)

@ネイ:   Done good catch, thanks for the notification. I have done some of the others in that time and space, not that I wandered too far around the list.  — billinghurst sDrewth 22:48, 13 January 2020 (UTC)
Thanks - I have replied to the request in jawiki accordingly.--ネイ (talk) 02:00, 14 January 2020 (UTC)
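
Editorial illustration: the effect of the requested \b can be reproduced with a small Python sketch. The two patterns are assumed shapes of the entry before and after the change (the actual blacklist lines are not quoted in this thread), and the URLs are examples only.

```python
import re

# Assumed shape of the 2004 entry, without word boundaries.
loose = re.compile(r"omeida\.com")
# Assumed shape after adding \b on both sides.
tight = re.compile(r"\bomeida\.com\b")

urls = ["http://omeida.com/page", "http://www.ayakomeida.com/"]

def hits(pattern, url):
    """Return True when the pattern is found anywhere in the URL."""
    return pattern.search(url) is not None

loose_hits = [hits(loose, u) for u in urls]
tight_hits = [hits(tight, u) for u in urls]

print(loose_hits)  # the unrelated composer site is caught as well
print(tight_hits)  # only the intended domain is caught
```

The leading \b fails inside "ayakomeida" because both "k" and "o" are word characters, so there is no boundary between them.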

up.to



This entry was added in 2007 (Special:Diff/598596) and the corresponding discussion is at Talk:Spam blacklist/Archives/2007-05#way.to, up.to. Since jawiki has a request to whitelist bob-up.tokyo (probably because this blacklist entry does not have a \b at the end), I wonder whether it is feasible to add one?

Side note: bob-up.tokyo is the official site of a Japanese idol girl group so it appears legitimate - the only question is whether it should be unblocked through whitelisting or through a change on the blacklist entry.--ネイ (talk) 13:33, 18 January 2020 (UTC)

@ネイ: It definitely needs to be a tighter rule.   Done Thanks for the note. I even wonder whether it should be tighter still, though it will suffice for the moment.  — billinghurst sDrewth 23:26, 18 January 2020 (UTC)
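
Editorial illustration: a short Python sketch of why a trailing \b releases bob-up.tokyo, and why a still tighter rule might be considered. All three patterns are assumptions made for illustration (the thread does not quote the old or the new entry), and the host names other than bob-up.tokyo are hypothetical.

```python
import re

old_rule = re.compile(r"up\.to")          # assumed shape of the 2007 entry
end_bounded = re.compile(r"up\.to\b")     # with \b at the end, as requested
both_bounded = re.compile(r"\bup\.to\b")  # hypothetical variant with \b on both sides

hosts = ["up.to", "bob-up.tokyo", "startup.to", "bob-up.to"]

def matched(pattern):
    """List the hosts that the pattern would catch."""
    return [h for h in hosts if pattern.search(h)]

print(matched(old_rule))
print(matched(end_bounded))
print(matched(both_bounded))
```

A hyphen is not a word character, so \b still matches between "-" and "up"; a rule bounded on both sides would therefore continue to catch a hyphenated host such as the hypothetical bob-up.to.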

omnislots.com



Hi, can I please have some more information regarding the reason that this domain has been blacklisted? https://meta.wikimedia.org/wiki/User_talk:COIBot/XWiki/omnislots.com --Jeditom (talk) 13:36, 2 January 2020 (UTC)

I hope billinghurst I have done it at the correct section.

May we ask why you are interested, where you have tried to add it, and the circumstances of the addition?  — billinghurst sDrewth 10:43, 3 January 2020 (UTC)

I know that honesty is appreciated, so I can mention that I was requested by this brand (which is 100% legitimate and with official game licence) to investigate the matter of the blacklist. As far as I know, the company tried to create the company's page with some basic business profile info and for some reason the article was removed and the domain ended up in the blacklist. So, can someone please explain what was the wrongdoing, as all references and sources used were from official websites? --Jeditom (talk) 14:01, 3 January 2020 (UTC)

@Jeditom: The domain was link spammed outside of the criteria for link addition of external sites, a reasonable read would be w:WP:EL, and it is not generally a site that we would be using as a reliable source. We get hammered with gambling, betting and slots site spam, and simply have a very low tolerance for it. I have left some links about paid editing on your user talk and would encourage you to use Wikipedia:Teahouse as the open place to get guidance about information about notability and editing. If the notability threshold is met, they will assist you with getting a short term whitelisting of an /about/ type url to add.  — billinghurst sDrewth

billinghurst thank you for the feedback. Believe me the brand -and the company behind it- are legit and there is no problem for them to check if any 3rd party validations are needed. Their domain is clean from spam, has no ads (not even participating in Google ads) and it has all the proper Casino licences. As a reminder, that wiki page was created just for branding. Not for affiliates, not for ads or promotions. I would like also to ask you why the paid editing should be an option for this brand? They don't plan to do any paid advertising, as they are not after doing any promotion. Can you please provide a clear guidance of which links were creating the problem (as I am sure that the content writer used only official sources) and what other actions should the company take, in order to prepare better their content? Also, just as a suggestion, can it be that you have flagged wrongly some links as spammy? In that case, I would be glad to research and give you my feedback for the links that you believe are responsible for the blacklisting. In any case, I am available to perform any checks that are required.

This is nothing about legitimacy, this is not about punishment, this is a technical restriction and control where links have been abused, and have potential for further abuse. For how to get a link added to an English Wikipedia, you will need to discuss that process at English Wikipedia. Links were provided and requirements explained as previously stated.  — billinghurst sDrewth 09:57, 7 January 2020 (UTC)

billinghurst Yet you provide no actual reasoning behind the blacklisting or what exactly was the mistake/problem. If this is not about a punishment, then I don't really know what it is about. And what you are saying is that you are acting as if a domain is guilty in advance, just because you may feel like it could be guilty in the future.--Jeditom (talk) 14:12, 7 January 2020 (UTC)

I disagree with your assessment, as I have provided reasoning and explanation. I have provided you with means to move forward. There are links provided on this page (Spam blacklist/About), and I have provided links to assist you to understand, and to comply. I will not provide commentary about your attempted additions, or your lack of compliance with the terms and conditions at English Wikipedia; that is an enWP discussion, though it guides my actions and response to you.

There is a community that is here watching this page and willing to make comment about the actions of anyone who makes additions and removals to the list. Their commentary will guide me on your request.  — billinghurst sDrewth 22:10, 7 January 2020 (UTC)

I can see no real proof of the "link spammed" claim or any other indication of what could be wrong with the posting of that page. No specific examples from that page are used to show what is or could be wrong, even if it has been requested repeatedly. Therefore everything in this discussion is just theoretical with no evidence of any wrong doing. How can I request the whitelisting of that page and the review of the content from another 3rd party? --Jeditom (talk) 11:40, 8 January 2020 (UTC)

@Jeditom: Thank you for your unbiased opinion. Not. All available information that you wish to have has been supplied or is available by those links.

I am again directing you to the terms of use and now instructing you to comply with these to be able to edit at this site. Please also note the FAQs linked from that page, and also note the results of failing to comply with the terms.  — billinghurst sDrewth 21:39, 8 January 2020 (UTC)

Protected edit request on 19 January 2020



\becuadorianhands\.com\b 45.225.47.94 19:47, 19 January 2020 (UTC)

Why do you want to remove that row? Could you do a proper remove request at #Proposed removals? --Martin Urbanec (talk) 19:57, 19 January 2020 (UTC)
Please read instructions at the top of this page, and ask in the correct subsection with a proper rationale. --Dirk Beetstra T C (en: U, T) 05:26, 20 January 2020 (UTC)
  Not done see advice and resubmit  — billinghurst sDrewth 11:39, 27 January 2020 (UTC)

buatkaosmurahdemak.wordpress.com, linkedin.com

  • Link/text requested to be blacklisted: buatkaosmurahdemak.wordpress.com
  • Link/text requested to be blacklisted: linkedin.com

Please see--Turkmen talk 20:22, 13 January 2020 (UTC)

@Turkmen: I think that we have zero chance of blacklisting linkedin, it will be widely added, there will be some authoritative matter there, so would need a clear example of problematic abuse, and a broad consensus of community, following consultation with the big wikis. The first mentioned domain we get COIBot to monitor, which may be more relevant as it has a low abuse rate, and may simply be more appropriate to locally block. Anyway, let us see what the reports show.  — billinghurst sDrewth 22:57, 13 January 2020 (UTC)
@Turkmen: if there is significant abuse, we can however blacklist the specific LinkedIn pages (or facebooks, twitters, etc.). --Dirk Beetstra T C (en: U, T) 13:05, 14 January 2020 (UTC)
Billinghurst and @Beetstra: Thank you for your comments. Billinghurst I think he's right about Linkedin.--Turkmen talk 17:51, 14 January 2020 (UTC)
@Turkmen: If you do wish to submit a problematic url, please utilise Template:BLRequestLink and remove the {{LinkSummary}} templates. That will allow us to progress.  — billinghurst sDrewth 03:02, 15 January 2020 (UTC)
@Billinghurst: I changed the template.--Turkmen talk 13:10, 15 January 2020 (UTC)

dailybuff.ru



Hello, many users of the site dailybuff, who participate in discussions and in improving wikipedia, noticed that the site is in the black list. Is it possible to remove it from this list? —The preceding unsigned comment was added by 185.64.229.45 (talk)

Not without "links to the articles they are used in or useful to, and arguments in favour of unlisting", and, ideally, from a high-volume user wanting to use that link in an article. --Martin Urbanec (talk) 18:42, 11 January 2020 (UTC)
What exactly do I need to provide? I cannot go looking for the people who noticed that the site is blacklisted. I do not add links to Wikipedia myself. I know that the site has a lot of gaming news that is distributed through Google News, Yandex News and Rambler News, and which someone could cite as a source in Wikipedia.—The preceding unsigned comment was added by 185.64.229.45 (talk)
Why would we want to have this domain being added to the Wikipedias? Is the site authoritative? Is the site a reliable source? Does the removal meet any of the guidance provided at the top of the page about why we would remove? That not one editor adding the site appears to be a registered user does not lend credence that it is a required credible source.  — billinghurst sDrewth 23:30, 13 January 2020 (UTC)
Because someone abused them on purpose, at least because it should be removed from the spam sheet. And a non-authoritative resource would not go to Google News.—The preceding unsigned comment was added by 185.64.229.45 (talk)
Someone abused them on purpose .. that is the reason it is blacklisted. That other sites are using them is fine, here apparently there are no requests from established editors that the site is of use to them. All we do here is stop abuse of the site. --Dirk Beetstra T C (en: U, T) 06:43, 16 January 2020 (UTC)
Can you remove a site from the spam list?

WD and blacklisted links

There are now several threads on this page regarding the removal/whitelisting of links that are needed on WD. Time to write out things in a more general way so we can then discuss.

First: YES, I fully agree that any subject on WD should have a link to their official website, regardless of whether it is blacklisted. The problem is that it will result in many technical problems as we are talking here about the global blacklist. This blacklist is used by 800+ Wikimedia wikis and thousands of non-Wikimedia wikis. I do not know whether outside of the 800+ Wikimedia wikis any other wikis use WD for data, so I will keep it to 800+ possible cases of 'disruption' per allowed WikiData item (noting that some property calls do display WD data of one item on multiple pages on one wikipedia).

  1. whitelisting the item on WD enables WD to save the item. However, it can disrupt any page that uses the WD item (e.g. if facebook.com was blacklisted on meta and one would whitelist en:Donald Trump's facebook on WD and add it to his properties, you might disrupt editing of that one page on 207 wikis (if all of them use the WD data); for en:Pornhub (which is globally blacklisted) it could disrupt editing of that one page on the 47 wikis that currently connect to the item; for some subjects it may be several hundreds of wikis). All wikis that use the WD data will have to individually whitelist the same link, which then allows that link to be used on any page on that wiki, and hence negates the global blacklisting (for PornHub that was the problem, as it is for many spam top domains). For those with the incentive (which spammers have: it pays their bills) and the technical know-how, this can be (and has been/is) abused to link to any link anywhere on any wiki that followed WD's suit in whitelisting.
  2. excluding the top domain here allows both WD and any local wiki to save the data, and it would not disrupt any of the other wikis. However that allows all wikis to use that link everywhere. Again, that negates the reason for blacklisting the top domain (for those with the incentive (which spammers have: it pays their bills) and the technical know-how: this can be (and has been/is) abused to link to any link to that domain anywhere on any wiki).
  3. whitelisting or excluding a neutral landing page (/about page e.g.) does give a reasonable way to stop random abuse (the random school kid will not add pornhub's /about as it is a) not as fun, and b) less obvious). The local whitelisting on WD needs also whitelisting everywhere else, a problem that is not reflected with the global exclusion of /about (but that requires a large adaptation to the meta spam-blacklist). Of course, linking to the /about is 'not correct' for WD (it is not the homepage), while that is less problematic for the other wikis (it is a representative page of the company; it is the standard practice on en.wikipedia).
  4. Excluding WD from using the global spam-blacklist (or overriding the global spam-blacklist by a blanket whitelist) would enable WD to do whatever they want; it would, however, result in the same disruption as described in 1 and 2. Moreover, WD would also get all the spam they do not want except if they then .. blacklist everything that is spammed locally (and yes, some spammers start at WD nowadays, as you might spam multiple wikis with one edit).

These methods are all available, but across all the possible wikis and possible pages they would likely disable editing on hundreds to thousands of pages per wiki, and prevent the use of WD data on all those that do not use it yet (though the latter cannot add the link locally at the moment either).
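
Editorial illustration: the interaction described in option 1 can be modelled with a toy Python sketch. Everything here is hypothetical: the domain blocked.example, the wiki names, and the simplification that a link may be saved when it matches a local whitelist entry or does not match the global blacklist. The real extension's processing order and matching rules may differ.

```python
import re

# Hypothetical global blacklist entry shared by every wiki.
GLOBAL_BLACKLIST = [re.compile(r"\bblocked\.example\b")]

# Hypothetical per-wiki whitelists: only Wikidata has whitelisted the front page.
LOCAL_WHITELISTS = {
    "wikidatawiki": [re.compile(r"\bblocked\.example/?$")],
    "enwiki": [],
}

def can_save(wiki, url):
    """Toy model: a whitelisted link is always allowed, otherwise the blacklist decides."""
    blacklisted = any(p.search(url) for p in GLOBAL_BLACKLIST)
    whitelisted = any(p.search(url) for p in LOCAL_WHITELISTS.get(wiki, []))
    return whitelisted or not blacklisted

official = "https://blocked.example"

on_wikidata = can_save("wikidatawiki", official)  # the item can store the link
on_enwiki = can_save("enwiki", official)          # a client page pulling it cannot be saved

# Once the client wiki copies the whitelist entry, the link is allowed on any of its pages.
LOCAL_WHITELISTS["enwiki"].append(re.compile(r"\bblocked\.example/?$"))
on_enwiki_after = can_save("enwiki", official)

print(on_wikidata, on_enwiki, on_enwiki_after)
```

In this model the whitelist is per wiki rather than per page, which is why copying the entry to a client wiki reopens the link for every page there.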

I can imagine a solution where data on WD can be set correctly, but be blocked from being used by client wikis. That would however need a separate flag to be defined for each WD item, which needs a phabricator ticket and implementation. But I am open to other solutions, as this does need a proper solution. --Dirk Beetstra T C (en: U, T) 12:03, 14 January 2020 (UTC)

First of all, thank you for summarizing this issue.
A minor note: option (1) is not as disruptive as it may seem, as it is possible to edit a page which actually imports blacklisted URLs from Wikidata. Still, whitelisting on WD is indeed far from being a satisfactory solution.
A technical solution has come into my mind. It does not require a new flag for each item, but still requires much work to implement. The idea is that every WD item has its own whitelist which is editable only by sysops (similarly to editnotices) and affects only that WD item and the pages on other wikis which are associated with that item. --colt_browning (talk) 13:10, 17 January 2020 (UTC)
@Beetstra and Colt browning: Another possible way is to ask sysadmins, e.g. @Reedy (WMF):, to run an SQL *ALTER TABLE* command to force-add a P856-related field, force-set its value(s), and lock editing of that field (Lock=EXCLUSIVE?). But this means that afterwards we have to live with this malpractice every day: there is a P856 value on an item which cannot be edited normally; we cannot add further P856 values, add qualifiers to it, or remove it (perhaps it would have no edit button, or a greyed-out edit button that cannot be clicked?). --Liuxinyu970226 (talk) 04:43, 18 January 2020 (UTC)
@Colt browning: I don’t think it works. For en.wikipedia the next edit will try to put the link in the db for that page which is disallowed due to the blacklist. I’ll try to test that.
@Liuxinyu970226: that still gives the problem as described for any wiki that uses it, you are basically doing option 1 in a different way. —Dirk Beetstra T C (en: U, T) 11:42, 18 January 2020 (UTC)
@Beetstra:[citation needed] --Liuxinyu970226 (talk) 10:29, 22 January 2020 (UTC)
@Liuxinyu970226: What needs a citation? --Dirk Beetstra T C (en: U, T) 10:40, 22 January 2020 (UTC)
Do you want a demonstration? Go ahead, get '\bpornhub\.com\b' whitelisted on wikidata (even just for the demonstration, I can even de-list it here for the sake of demonstration) and add it as official website to d:Pornhub. Then follow up on en.wikipedia to try and add '{{Official website}}' (without the nowikis) on en.wikipedia (or call the property transclusion directly) on all those 47 wikis and see what you get. You can even do it now by adding '{{Official website}}' (without the nowikis) to en:Cloud mining (no, you did not try that but I have suggested that earlier). Now do that for every item in WD that has a globally blacklisted official website and see how many pages will face troubles over our 800s of wikis. --Dirk Beetstra T C (en: U, T) 10:56, 22 January 2020 (UTC)
@Beetstra: So what do you think of my proposed technical solution? Should I write it up as a phab ticket? I understand that an idea has very little value, only implementation is valuable, but still. Also, this was a great proposal; if you are going to propose it again in 2020, please let me know, I'll call for votes in my home wiki. --colt_browning (talk) 09:08, 26 January 2020 (UTC)
@Colt browning: I would add it as an option to the phab-ticket I created. It will then be up to the developers to see what is most feasible. --Dirk Beetstra T C (en: U, T) 10:16, 26 January 2020 (UTC)
  Comment I wonder whether Wikidata's interface (i.e. Wikibase) could be developed to store SNI information. That way: 1. users would not have to type any domain for such "sensitive properties" as official website, but could enter some kind of checksum instead; 2. every Wikipedia would need a gadget to reverse-populate the domain by checking the SNI; 3. and, simply, every website would therefore have to support https in order to be stored in Wikidata, and should preferably use EV SSL certificates. --Liuxinyu970226 (talk) 10:57, 22 January 2020 (UTC)
That would still require the local wiki to decode the checksum and have the external link in the final product, which still means that it got 'added' to en.wikipedia, which is the part that is prohibited. Blacklisted means blacklisted. Links that are blacklisted cannot be displayed. You can easily test this as well: add 'moc.buhnrop//:sptth' in a template, and write a second template that inverts the text so it displays https://pornhub.com, and you will see that it will refuse to save your inversion code using the template. Do you really think that it is this easy, that you can think of tricks to circumvent the blacklist that have not yet been tried by spammers who get paid to have their links spread all over? It has been made technically impossible to link to blacklisted links. --Dirk Beetstra T C (en: U, T) 11:27, 22 January 2020 (UTC)
(by the way, with this you get closer to the abuse trick that I allude to above) --Dirk Beetstra T C (en: U, T) 11:47, 22 January 2020 (UTC)
Wikipedias can be spammed just because "it's shown"? Then this discussion can be closed right now. If solutions based on ESNI, TLS v3.0, QUIC, or even post-quantum encryption (which are all, de facto, kinds of Google's encryption mechanisms) can't even solve your concern, then you're pointing at a Fermat-like issue, which is why I'll cease my efforts in this discussion. --Liuxinyu970226 (talk) 13:14, 22 January 2020 (UTC)
@Liuxinyu970226: I am sorry, but either I don't understand what you are trying to do, or indeed, links that are blacklisted cannot be 'shown' on-wiki. As far as I understand, you want to be able to have (just as an example) d:Q936394 to have https://pornhub.com as the value for the property 'official website', right? --Dirk Beetstra T C (en: U, T) 13:22, 22 January 2020 (UTC) (I tried that: but failed, obviously --Dirk Beetstra T C (en: U, T) 13:24, 22 January 2020 (UTC))
@Beetstra: Yes, but not only this one. I'm asking for a valid way to bypass this blacklist, which unfairly restricts even the administrators of Wikidata. --Liuxinyu970226 (talk) 02:42, 23 January 2020 (UTC)
@Liuxinyu970226: Yes, I understand that it is for literally all links that are the official website of a subject. They are a serious problem (and not only on WikiData, also e.g. en.wikipedia would like to be able to link to the official website of a subject. It is a small but constant stream of (both de-blacklist and whitelist) requests on en.wikipedia).
The deeper problem is that the spam-blacklist extension is black-and-white. Things blacklisted are disallowed everywhere, any namespace, any page, by anyone. And where it involves this blacklist (the global one) it then also applies on all 800+ wikimedia projects and the thousands of projects outside that use this blacklist. Most of the sites on this list are utterly useless material (viagra spam, etc.), but a reasonable number are of some, albeit very, very limited, use on our projects (again, back to pornhub: about 50 pages throughout our wikis have that as the official website, noting that, barring a very few exceptions, the other tens of millions of pages on all those wikis combined do not need the link).
We can allow holes in that system through local whitelisting, but that needs to be done with care. For some pages one could easily allow only the top domain (whitelist '\bexample\.com$', or exclude the top domain here '\bexample\.com.' or more complex) but that means that the top domain can be added anywhere (again back to pornhub: up to 10 hits each day are there because that top domain link is used to replace a school website).
The blacklist being this black-and-white means that we do have a problem. That was already recognized 14 (!!!) years ago (task T6459), and I personally have been trying now for a couple of years to get the spam blacklist overhauled (see Community_Wishlist_Survey_2017/Miscellaneous/Overhaul_spam-blacklist and Community Wishlist Survey 2019/Admins and patrollers/Overhaul spam-blacklist). The spam blacklist breaks stuff, it is too crude. I totally agree that we need a solution, but as I see it currently, I see no real workable solution (maybe except for excluding a neutral non-top domain landing page on the global spam blacklist - it is not really what WD would want but it is currently as close as we can get - anything else allows for spam (or wider: abuse)). It would require a major rewrite of many rules here on the spam blacklist (which can be done more easily if we adapt our script) but anything else needs a serious phab ticket to overhaul the spam blacklist (which I do not see WMF doing ...). --Dirk Beetstra T C (en: U, T) 05:27, 23 January 2020 (UTC)
Also, even under the current blacklist settings, it looks like terrorism-related things, e.g. [1], can't be prohibited. How can we think that such edits aren't "spam"? --Liuxinyu970226 (talk) 03:43, 23 January 2020 (UTC)
@Liuxinyu970226: The blacklist has nothing to do with terrorism, that is totally out of its scope. The spam blacklist is about links to websites. That may be something for the AbuseFilter, which is more suited for that. --Dirk Beetstra T C (en: U, T) 05:27, 23 January 2020 (UTC)
The problem for WD is that the additions there can be abused at the sister wikis. If you need something that links outwards, can't you craft something within your system? If we are looking at something that is inward facing, then implement something that is not an active url. The drive for WD perfection seems to come at a cost to everyone else. I am already seeing enough abuse of WD by the same publicity spamhauses that are trying to invade enWP, and there are fewer defences at WD, and at this stage I see that the removal from the blacklists is just going to worsen the situation.  — billinghurst sDrewth 07:45, 25 January 2020 (UTC)

Temporary solution

As I don't see that task T243484 will be solved anytime soon (it will need technical changes, testing, etc. etc.) there is currently only one workable solution that we could implement but which will need all parties to be willing to deviate from the 'perfect' solution:

  • We open here a special request section where requests can be posted to exclude a neutral landing page from the global spam-blacklist. Those pages are generally /about or /information pages, not the top domain. When requested by established editors this pretty much defaults to support to change the rule (but with understanding that there will be (rare) exceptions). Please do not request top domains (even while technically possible), it will not be granted.

This excludes a working link that can be used on any wiki, and will hence not result in problems when WD data is being re-used on other wikis. Any other solution that I currently see will result in editing problems on all client wikis. It will require quite some work by admins here (adapting rules), and willingness from WD to have non-perfect data in their fields (at least until task T243484 is solved), but the alternative will be the current status quo. --Dirk Beetstra T C (en: U, T) 07:11, 26 January 2020 (UTC)
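A minimal sketch of what "excluding a neutral landing page" from a rule could look like, assuming a rule is applied as a case-insensitive regex search over the URL. The domain example.com, the /about path and the helper name is_blocked are hypothetical, and this is not the extension's actual code; it only illustrates a negative lookahead that leaves one exact page unblocked.

```python
import re

# Hypothetical adapted rule: block example.com everywhere,
# except when the URL ends exactly in /about (assumed neutral landing page).
RULE = re.compile(r"\bexample\.com\b(?!/about$)", re.IGNORECASE)


def is_blocked(url: str) -> bool:
    """Return True if the (hypothetical) adapted rule matches the URL."""
    return RULE.search(url) is not None


if __name__ == "__main__":
    for url in (
        "https://example.com/",            # top domain: still blocked
        "https://example.com/about",       # neutral landing page: excluded
        "https://example.com/about/team",  # deeper page: still blocked
    ):
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Because the exclusion is part of the global rule itself, the same link would work on every client wiki without a local whitelist entry, which is the point of the proposal above.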

Comment
  • What if there is no "neutral landing page" except for the front page (which, however, has no improper content)? --colt_browning (talk) 09:00, 26 January 2020 (UTC)
    • @Colt browning: That is rather exceptional (whitelisting neutral landing pages is common practice on en.wikipedia), but that can then likely be catered for. Note that 'improper content' is not really the reason that we don't allow the frontpage (the frontpage of the organisation is hardly ever 'inappropriate', it is that the frontpage is the abused page). Being 'inappropriate' is also generally not the reason that we blacklist, it is that the page is spammed / abused. --Dirk Beetstra T C (en: U, T) 10:03, 26 January 2020 (UTC)
  • and in the case of no "neutral landing page" (as in the case of Sci-Hub). --colt_browning (talk) 08:45, 27 January 2020 (UTC)
    • @Colt browning: sci-hub.ren/#about would do ... it is really a rare exception. --Dirk Beetstra T C (en: U, T) 10:02, 27 January 2020 (UTC)
      • Why not just sci-hub.si/#? Works for other websites as well. --colt_browning (talk) 10:48, 27 January 2020 (UTC)
        because as a generic solution it is inexact, and doesn't point to about pages at many sites; it is also equally abusable for Beetstra's previous examples.  — billinghurst sDrewth 11:42, 27 January 2020 (UTC)
        I'd agree that it is inexact, and that I would prefer it to be specific (and sometimes a neutral landing page is just the better place to send people to - schoolkids are smart, you can just wait for them to figure out that <porn-site.com>/# is working and has the same 'shock' effect as the top domain; many of the notable (and more 'decent') porn sites do have a non-shock SFW page somewhere). It may be a good one for the odd case where there really is no neutral point of landing. Anyway, I do not disagree with the exclusion in your vote, and we can see that on a case-by-case basis. --Dirk Beetstra T C (en: U, T) 11:56, 27 January 2020 (UTC)
  • This would solve the problem of the official website being blocked by the spam blacklist, but what about other properties that link to external URLs, such as official blogs, terms of service URL, privacy policy URL or website account on (with a URL qualifier)? --Trade (talk) 22:29, 27 January 2020 (UTC)
    • @Trade: as stated in the header, this is supposed to be a temporary solution; task T243484 is supposed to result in a proper solution. The main problem is the official websites, as these are regularly re-used on practically all connected wikis (which may in some cases go up to 200 client wikis). I agree that there are other properties that hold website URLs, but those (for now) could generally just be whitelisted on WikiData. If that data is (heavily) re-used then I would make it fall under this same case (there is no reason that we cannot whitelist '\bexample\.com\/(FAQ$|About$)'). --Dirk Beetstra T C (en: U, T) 05:25, 28 January 2020 (UTC)
  • TBH, the reason why I didn't give an opinion on whether to support or not is, in general, about the SSL KeyIDs: the SSL certificate combinations of one site may differ between its servers (afaik one Japanese example of this is Pixiv, where they use one KeyID for their main domain, and some other KeyIDs for Pawoo, Pixiv Sketch, Pixiv Comics, Pixiv company IR info, etc.). Under such circumstances, the entered URLs may not work as-is (they may or may not return a 304 redirect, or only show an "under construction" page or similar placeholders). --Liuxinyu970226 (talk) 04:46, 3 February 2020 (UTC)
    • @Liuxinyu970226: I am sorry, but I still don't understand what this has to do with KeyIDs. This is about having a speedy procedure here to exclude a neutral landing page for blacklisted domains so that it can be used as the 'official website' in items that need an official website. What you seem to be talking about is at the moment far outside of the capabilities of the (global) software regarding the spam-blacklist. --Dirk Beetstra T C (en: U, T) 05:29, 3 February 2020 (UTC)
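The whitelist pattern mentioned in the reply to Trade above, '\bexample\.com\/(FAQ$|About$)', can be illustrated with a short sketch. It assumes that a link matching any whitelist entry is let through and that everything else is checked against the blacklist; the list contents and the helper name link_allowed are hypothetical, not the extension's real interface.

```python
import re

# Hypothetical rule sets, one regex fragment per line as on the wiki pages.
BLACKLIST = [r"\bexample\.com\b"]
WHITELIST = [r"\bexample\.com\/(FAQ$|About$)"]


def _compile(rules):
    """Compile each rule as a case-insensitive pattern (an assumption)."""
    return [re.compile(rule, re.IGNORECASE) for rule in rules]


def link_allowed(url, blacklist=BLACKLIST, whitelist=WHITELIST):
    """Allow whitelisted links; otherwise allow only if no blacklist rule hits."""
    if any(p.search(url) for p in _compile(whitelist)):
        return True
    return not any(p.search(url) for p in _compile(blacklist))


if __name__ == "__main__":
    for url in (
        "https://example.com/About",
        "https://example.com/FAQ",
        "https://example.com/",
        "https://example.com/About/more",
    ):
        print(url, "allowed" if link_allowed(url) else "blocked")
```

The `$` anchors are what keep the hole small: only the two exact pages pass, while the top domain and any deeper path stay blacklisted.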
Support
  •  Y  — billinghurst sDrewth 11:34, 26 January 2020 (UTC)
  •  Y (as proposer) --Dirk Beetstra T C (en: U, T) 12:00, 26 January 2020 (UTC)
  •  Y if and only if the main page is whitelisted whenever there is no "neutral landing page" (as in the case of Sci-Hub). --colt_browning (talk) 08:45, 27 January 2020 (UTC)
  •  Y --Trade (talk) 19:50, 9 February 2020 (UTC)
  •  Y Should be fine, but per colt browning's point. --Camouflaged Mirage (talk) 12:37, 23 February 2020 (UTC)
Not support