Talk:Spam blacklist/Archives/2018-01


Proposed additions

  This section is for completed requests that a website be blacklisted

wordpress subdomains + wikidot







Following the removal of files.wordpress.com, pushing through some alternative solutions.  — billinghurst sDrewth 01:34, 7 January 2018 (UTC)

  Added  — billinghurst sDrewth 01:35, 7 January 2018 (UTC)

businessbroadband.com.hk



New URL of the now globally blacklisted broadbandhk.com. Matthew hk (talk) 15:46, 10 January 2018 (UTC)

@Matthew hk:   Added to Spam blacklist. --Dirk Beetstra T C (en: U, T) 18:27, 13 January 2018 (UTC)

shortwww.com



Target link for spam-bots. -- Tegel (Talk) 16:16, 17 January 2018 (UTC)

@Tegel:   Added to Spam blacklist. --Tegel (Talk) 16:19, 17 January 2018 (UTC)

film2018.info



Target link for spam-bots. -- Tegel (Talk) 14:51, 21 January 2018 (UTC)

@Tegel:   Added to Spam blacklist. --Tegel (Talk) 14:52, 21 January 2018 (UTC)

antoniogenna.net



This is an amateur site, used mainly on it.wiki as a source for the biographies of the voice performers who dub foreign films into Italian. The problem is not only that it is an amateur site lacking any authority, but also that it is full of advertising and spam, and most of its traffic is generated through Wikipedia, since it is linked from 8,000+ articles on it.wiki. -- Blackcat (talk) 20:48, 2 January 2018 (UTC)

Hello @Billinghurst:, has this website been inspected? -- Blackcat (talk) 15:39, 7 January 2018 (UTC)
@Blackcat: not by me. Having a look at it now, it is not something that we would touch from afar with respect to the global blacklist. It is used about 400 to 500 times, and there is no blacklisting. Generally we are looking for xwiki abuse and spambots, and not to interfere with the qualitative decisions that wikis should make. If the communities banded together in a discussion here to blacklist it, then we would consider that by community consensus. Currently used at frWP, enWP, deWP. If it is truly rubbish, flag it to those communities for removal, and point them here for a conversation.  — billinghurst sDrewth 13:56, 8 January 2018 (UTC)
@Blackcat:   Declined  — billinghurst sDrewth 00:21, 25 January 2018 (UTC)
Ok. -- Blackcat (talk) 09:54, 25 January 2018 (UTC)

buyv1agra.com



spam  — billinghurst sDrewth 00:22, 25 January 2018 (UTC)

  Added as buyv.agra.com  — billinghurst sDrewth 00:32, 25 January 2018 (UTC)
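
The entry above replaces the digit with a `.`, which in a regular expression matches any single character, so one entry covers several spellings of the domain. A minimal Python sketch of that effect follows. It assumes blacklist entries are applied as regex fragments after a scheme and optional subdomains; the wrapper pattern and the helper name `url_is_blocked` are illustrative assumptions, not the extension's actual code.

```python
import re

# The blacklist entry as recorded in the discussion. Each unescaped "." is a
# regex wildcard, so the entry matches more than its literal text suggests.
ENTRY = r"buyv.agra.com"

# Assumed, simplified wrapper: scheme, optional subdomains, then the entry.
PATTERN = re.compile(r"https?://[a-z0-9_.\-]*" + ENTRY, re.IGNORECASE)


def url_is_blocked(url: str) -> bool:
    """Return True if the URL contains a host matching the entry."""
    return PATTERN.search(url) is not None
```

Because the second `.` (before `com`) is also unescaped, a host such as `buyv1agra-com.example` would match as well. Escaping it as `buyv.agra\.com` would narrow the entry to the intended domain family.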


Proposed removals

  This section is for archiving proposals that a website be unlisted.

files.wordpress.com



As I was importing scans of Ancient Chinese banknotes to Wikimedia Commons, the “abuse” filter disallowed an upload because a link was considered to be “spam”. I request that this website be removed from the blacklist so I can source more scans. I have never been paid by Wordpress, I have no relation with Wordpress and I will not receive any financial compensation for uploading images from Wordpress to Wikimedia Commons. --Donald Trung (Talk 🤳🏻) (My global lock 🔒) (My global unlock 🔓) 10:02, 2 January 2018 (UTC)

  Declined. At this stage this site is problematic due to spambots. If you have a specific subdomain then request its whitelisting at Commons. You should consider that wordpress is generally not an authoritative domain, and you should look to the underlying authoritative source.  — billinghurst sDrewth 11:06, 2 January 2018 (UTC)
Where can I request whitelisting at Commons? --Donald Trung (Talk 🤳🏻) (My global lock 🔒) (My global unlock 🔓) 11:17, 2 January 2018 (UTC)
C:MediaWiki talk:Spam-whitelist. Matiia (talk) 01:44, 4 January 2018 (UTC)
It seems the issue is that the media files used/stored by "specific subdomains", which may be whitelisted, have the more generic files.wordpress.com path, and are thus now being identified as blacklisted. Adavidb (talk) 14:43, 6 January 2018 (UTC)

A noticeable number of articles on the English Wikipedia have been tagged as containing blacklisted links due to this. Is the spambot problem spread across different wikis? If not, then it might be best to blacklist locally. Emir of Wikipedia (talk) 14:28, 6 January 2018 (UTC)

I am also seeing many articles on my watchlist get tagged for non-problematic links (example: en:Jiří Matoušek (mathematician), where a link of this form is used for the author copy of a journal paper used as a reference). I think the blacklist is too broad. —David Eppstein (talk) 16:17, 6 January 2018 (UTC)
Same problem here. A self-archived copy of a research paper linked from the author's Wordpress blog, which is otherwise only available behind a paywall. We want this kind of link. I'm not sure, though, that there is a better solution than whitelisting as needed. Paradoctor (talk) 16:26, 6 January 2018 (UTC)
That might involve whitelisting thousands or tens of thousands of links. And most Wikipedia editors are unlikely to find instructions for global whitelisting easily. —David Eppstein (talk) 16:42, 6 January 2018 (UTC)
Speaking for myself only, I had no trouble. The bot places a nice big warning box, the rest is merely following orders. Paradoctor (talk) 17:51, 6 January 2018 (UTC)
At the time of writing this W:MediaWiki talk:Spam-whitelist already has 10 requests for whitelisting. I can imagine more will come soon. Emir of Wikipedia (talk) 16:53, 6 January 2018 (UTC)
I have at least three articles similarly marked as barred, but the links are to reputable university sources. For example en:Joan Kerr's trigger indicates that the Journal of Art Historiography from the University of Birmingham is barred; en:Nelly Beltrán's hit indicates that a paper from the University of Buenos Aires is banned; and en:Tillie Hardwick's resulted from an article from the Indigenous Law and Policy Center of Michigan State University's College of Law. This seems to be far too broad a listing of banned links and needs to be modified, or I fear it will affect hundreds if not thousands of articles. SusunW (talk) 16:57, 6 January 2018 (UTC)
The English Wikipedia currently has 4,346 articles containing this site name. I suspect the vast majority of these are non-problematic. —David Eppstein (talk) 17:08, 6 January 2018 (UTC)
Just had two links tagged on en:Cotehele which are academic papers, albeit from the pretty minor journal Regional Furniture. Was about to request whitelisting until I saw Emir's note on previous requests. Jonathan A Jones (talk) 17:20, 6 January 2018 (UTC)
And yet two more: the en:Ana Vásquez-Bronfman link ties to an academic paper from the Université Paris Sorbonne, and both en:Timeline of women in aviation and en:Cheryl Pickering-Moore tie to a publication from the Guyanese Cultural Association of New York. As is pointed out below, they are all archived in Wayback, thus not spam. SusunW (talk) 20:06, 6 January 2018 (UTC)
Blacklisting files.wordpress.com is too broad. I noticed today that an academic paper on the origin of the Tibetan headless script referenced in Dunhuang manuscripts has been tagged as a blacklisted link, which is really not helpful. BabelStone (talk) 16:58, 6 January 2018 (UTC)
  • Add me to the chorus pointing out that this filter is absurdly broad and should be removed. James (talk/contribs) 19:01, 6 January 2018 (UTC)
  • I have no idea of how this works, but this is where I ended up. I have used the reference at https://cognitasresearch.files.wordpress.com/2012/08/human-factors-in-sport-diving-incidents.pdf on several articles. It is a reliable source and has been accepted on GAs and an FA without adverse comment, now after several years in use it is globally blacklisted without explanation. Please do something about it. · · · Peter (Southwood) (talk): 19:28, 6 January 2018 (UTC)
    Ditto Peter's comment; in my case https://climateaudit.files.wordpress.com/2007/03/berknerisland2002annglac.pdf was used in en:List of ice cores; it's an archive of a scientific article and should be allowed. Mike Christie (talk) 20:01, 6 January 2018 (UTC)
  • I come from Wikipedia-land and noticed this got blacklisted. I was linking a file of a rulebook to use as a primary source to verify the existence of an item (which is allowed). It seems this blacklist is too broad and should be reviewed in depth. Leitmotiv (talk) 20:58, 6 January 2018 (UTC)
  • 4,346 articles (according to the comment above). If even just half of those are false positives... and that's just one language Wikipedia. Let's just think about that for a second. -- KTC (talk) 21:52, 6 January 2018 (UTC)
  • Ditto everyone above. Block too broad, please remove. Renata3 (talk) 22:06, 6 January 2018 (UTC)
  • See this edit, which indicates that the pattern has now been whitelisted at the English Wikipedia. Jonathan A Jones (talk) 22:09, 6 January 2018 (UTC)
  • http://metofficenews.files.wordpress.com/2010/12/min-temps.jpg is from the UK Met Office. This new rule is insane. Do we really think we're now so powerful we can stop the world using Wordpress? Will Wordpress now ban mediawiki sites? --Northernhenge (talk) 22:47, 6 January 2018 (UTC)
  • Agree. This blacklisting is too broad. Because it is an easy and free/cheap place to create a web presence, it is used by a whole host of individuals and organisations. The blacklist should target the subdomains of independent people and organisations that are shown to be unreliable. In the articles I am seeing with the blacklist warning, it is a local history organisation whose publications about the local history of the area are being blacklisted. Local history organisations generally have no money and are generally quite authoritative about the history of their area; most cannot publish books any more (other than vanity press which they can't afford either) as publishers don't think the market is big enough to warrant it. As it says in English Wikipedia, context matters when evaluating reliable sources. This blanket blacklisting is not considering context; that should be left to contributors to individual articles to assess. By all means blacklist individual sites within wordpress where appropriate, but not all automatically. 23:25, 6 January 2018 (UTC)
  • Yes, yes. I am another one! Good quality, non-problematic, and highly desirable references are suddenly suspect because of this blacklisting. It is far too broad. WordPress is a highly popular internet tool, and it is just madness to set ourselves up against it. Timothy Titus (talk) 00:22, 7 January 2018 (UTC)
  Removed  — billinghurst sDrewth 00:53, 7 January 2018 (UTC)
English Wikipedia could have resolved this by whitelisting the domain, or by managing subdomains, so the issue was completely manageable at enWP. Both Commons and enWP administrators were made aware of this matter. The spambot abuse seems to have stopped, so I have removed the block and we can look to other means to manage this sort of matter into the future.  — billinghurst sDrewth 00:59, 7 January 2018 (UTC)
Will the bot be removing the banners it left or do we have to do this by hand? If the latter, where do we find a listing of all the damage done? · · · Peter (Southwood) (talk): 02:47, 7 January 2018 (UTC)
I'm hoping the bot maintainer(s) will take it upon themselves to remove the banners. Up to 500 such taggings can be viewed at a time via this link to the bot's contributions. Eight-plus pages of such contributions are documented between 12:03 and 13:57 on 6 January (UTC), though I didn't verify that every one of these was triggered by 'files.wordpress.com'. Adavidb (talk) 04:53, 7 January 2018 (UTC)
There is no global or meta bot creating any banners, or doing any labelling, so it must be a local matter. You will need to talk to whomever is running the bot and get them to manage it.  — billinghurst sDrewth 07:30, 7 January 2018 (UTC)
That bot already removes the relevant template from pages which no longer contain any blacklisted links (whether through link removal or blacklist or whitelist modification). It just doesn't run 24/7. -- KTC (talk) 11:05, 7 January 2018 (UTC)
Thanks for the clarification KTC, I will wait a few days and see if they go away. · · · Peter (Southwood) (talk): 16:00, 7 January 2018 (UTC)
  • @Billinghurst: Your blacklisting had blocked 2024 existing URLs across 997 sub-domains of files.wordpress.com on several thousand articles. Whitelisting each sub-domain to get around that would have been insane. We should probably simply consider blacklisting more specific sub-domains of this sub-domain to keep the collateral damage contained.—CYBERPOWER (Chat) 14:47, 7 January 2018 (UTC)
  • @Cyberpower678: Now we have that data, we could easily have compiled a handful of regexes that would have blocked all files.wordpress.com sites except for these 997 domains. That would have resolved a lot of issues ..
  • Sigh .. if only WMF would use their developers to finally rewrite the spam-blacklist extension and make something that could have global whitelisting, and/or be managed by user level (only give problems to non-autoconfirmed editors and IPs). The current system is just WAY too crude for this type of thing, and we are left with the abuse. --Dirk Beetstra T C (en: U, T) 17:56, 7 January 2018 (UTC)
  • @Cyberpower678: Putting 10+ sub-domains of files.wordpress.com a day into the blacklist was not the way that I wished to manage spam and additions to the global blacklist, especially as one has to manually sort through the various abuse filters to identify and then register them (count may not be accurate, but it seemed that way). I am sure you understand the difficulty and tedium of using abuselog entries one by one as a means to identify and manage spambots. It can be argued that the better way to manage this is still to blacklist and allow the sites to whitelist, either wholly or partially, especially as many of the wikis do not, and would not, utilise "files.wordpress.com" as sources for their works.

    While providing information here covers our arses and does inform the community, it doesn't get to the broader communities. What we could do better is to look to build a better means to alert administrators at wikis of significant administrative changes that are being made, and to notify admins where biting changes are made. Predominantly the work done here is done by a small group of people, with little feedback from the broader community, trying to identify and block the total detritus that the spambots and the xwiki spammers throw at us. So in that regard we somewhat work in the grey zone or the dark until we significantly crash into something of value and cause a ruckus.  — billinghurst sDrewth 03:18, 8 January 2018 (UTC)

    @Billinghurst: I never meant to imply that your addition was made in bad faith or was misguided, but considering that this is considered something of value, I would consider this a significant crash. Beetstra's enwiki talk page has a list of sub-domains. We should figure out how to whitelist those 997 sub-domains first without blowing up the Whitelist or edit save times. Then we can add back the blacklist entry as then the collateral damage would be almost completely eliminated.—CYBERPOWER (Chat) 03:51, 8 January 2018 (UTC)
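
One way to read the idea of blocking files.wordpress.com while sparing the sub-domains already in legitimate use is a single pattern with a negative lookahead. The Python sketch below is only an illustration under assumptions: the three exempt names are taken from URLs quoted earlier in this thread, the full list of 997 is not reproduced, and `build_pattern` and `is_blocked` are hypothetical helpers. Whether the real blacklist could carry such a pattern without hurting edit save times is exactly the open question raised above.

```python
import re

# Sample of sub-domains quoted in this discussion; the real exemption list
# had 997 entries and is not reproduced here.
EXEMPT_SUBDOMAINS = ["cognitasresearch", "climateaudit", "metofficenews"]


def build_pattern(exempt):
    """Block <name>.files.wordpress.com unless <name> is in the exempt list."""
    alternatives = "|".join(re.escape(name) for name in exempt)
    # (?!...) is a negative lookahead: the match is rejected when the host
    # begins with an exempt name followed by ".files.wordpress.com".
    return re.compile(
        r"https?://(?!(?:" + alternatives + r")\.files\.wordpress\.com)"
        r"[a-z0-9\-]+\.files\.wordpress\.com",
        re.IGNORECASE,
    )


PATTERN = build_pattern(EXEMPT_SUBDOMAINS)


def is_blocked(url):
    """Return True if the URL points at a non-exempt sub-domain."""
    return PATTERN.search(url) is not None
```

With several hundred alternatives the pattern grows long, so a real deployment would likely need to split it or check the exemption list separately.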
    @Cyberpower678: Yep, I know that, if you were going to be critical it would be direct and off-piste. I was explaining the difficulties, with you as the interlocutor, and hoping that the community comes up with some useful ideas on how we can do this better. We don't have one wiki as many sites do, and our nesting solution is okay, though it is still imperfect. The real "elephant in the room" is that the bots get through the captcha so easily, and that we don't have the tools to deal with this well. On top of that, so few see the overarching problem that it never reaches the mass numbers needed for a resourced solution (and some throw their time and efforts in to help manage). This problem causes burnout. — billinghurst sDrewth 06:21, 8 January 2018 (UTC)
    @Billinghurst and Cyberpower678: Is the Blacklist global, but the Whitelist local, then? Anyways, this case suggests to me the need for Graylisting: any match to the graylist can be forced through (after a warning maybe, so editors don't add these blindly without being aware of the potential problems) by any autoconfirmed user (or whatever "in good standing"-type flag makes sense). An optional enhancement on top of that is some way for a URL thus accepted (vouched for by an autoconfirmed user) to find its way to the permanent Whitelist (maybe by way of some log somewhere, or a filter/tag on global changelog, etc.). This distributes the decision on whether to "trust" a given URL to individual editors, presumably familiar with the specific URL in question (much much more fine grained), and to local judgement on each Wiki (e.g. enwiki and nowiki and enWS and Commons could all potentially have divergent policies or guidelines applicable to the same URL); meanwhile, normal blacklisting would still apply for any untrusted user and IP who cannot be assumed to be acting in good faith (in this context) or with any particular level of familiarity with our policies. I'd file a Phabricator ticket but I'm not sufficiently familiar with the existing Blacklist/Whitelist functionality to have any faith I'd be able to string together something parseable for the MW developers.—The preceding unsigned comment was added by Xover (talk)
    @Xover: That phabricator ticket has, in a way, already existed for several years. The lack of WMF engagement in these issues is the problem, the captcha is too easy to circumvent - and appropriate means to stop abuse are limited (AbuseFilter too heavy for this, blacklist too harsh, ... ). There is NO adequate way to handle spam. --Dirk Beetstra T C (en: U, T) 13:24, 8 January 2018 (UTC)
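
The graylisting proposal in this thread can be sketched as a three-way decision. The Python below is a hypothetical illustration only: no such feature exists in the discussion's tooling, and the names `Decision`, `LinkLists` and `check_link`, as well as the order in which the lists are consulted, are assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"    # save goes through after the editor confirms a warning
    BLOCK = "block"


@dataclass
class LinkLists:
    """Regex fragments per list; all three lists are hypothetical here."""
    blacklist: List[str] = field(default_factory=list)
    graylist: List[str] = field(default_factory=list)
    whitelist: List[str] = field(default_factory=list)


def _matches(patterns: List[str], url: str) -> bool:
    return any(re.search(p, url, re.IGNORECASE) for p in patterns)


def check_link(url: str, autoconfirmed: bool, lists: LinkLists) -> Decision:
    """Whitelist wins, then blacklist, then graylist by user standing."""
    if _matches(lists.whitelist, url):
        return Decision.ALLOW
    if _matches(lists.blacklist, url):
        return Decision.BLOCK
    if _matches(lists.graylist, url):
        # Trusted editors may force the link through after a warning;
        # everyone else is treated as if the link were blacklisted.
        return Decision.WARN if autoconfirmed else Decision.BLOCK
    return Decision.ALLOW
```

Under this sketch a broad entry such as `files\.wordpress\.com` could sit on the graylist: spambots and new accounts would still be stopped, while established editors would see a warning rather than a hard block.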

storz-bickel.com



Please remove from the blacklist, so that the link can be used in the following articles.

and if it will not be deleted also

--2001:16B8:24F7:5800:7569:3032:D6EF:1C57 00:41, 4 January 2018 (UTC)

    • I would ask any admin here to deny this request, as there has been no change since the previous three times this was requested. --Wiki13 talk 01:24, 4 January 2018 (UTC)

  Declined. If you want that link added to those pages, please request local whitelisting. That page still has no use for Wikimedia wikis. Matiia (talk) 01:41, 4 January 2018 (UTC)

This section was archived on a request by:  — billinghurst sDrewth 12:00, 4 January 2018 (UTC)

stylecnc.com



The domain is on the global blacklist, and I don't think it should be. It seems that it was blacklisted in January 2018 as a spammer. The site has many useful articles and reviews for CNC users. I think it is a good website. —The preceding unsigned comment was added by Jiangzaiyuan and 183.90.191.207 (talk) 11:12, 8 January 2018 (UTC)

@183.90.191.207:   Declined, not blacklisted here. Dirk Beetstra T C (en: U, T) 11:15, 9 January 2018 (UTC)

banknotes.com



At first I wanted to only request local whitelisting for attribution to PD-scans found on this website, but it actually contains a lot of detailed articles on the history of banknotes and is a great educational source. --Donald Trung (Talk 🤳🏻) (My global lock 🔒) (My global unlock 🔓) 12:04, 7 January 2018 (UTC)

@Donald Trung:   Declined, there are other places to get them, and the owner seems to be quite restrictive about the use of the images: https://en.wikipedia.org/w/index.php?title=User_talk:12.214.210.51&diff=prev&oldid=55681652. (Actually, reading that remark, I am not sure whether the images are PD.) Note that this was spammed by a large number of IPs on several wikis. If no alternatives exist, then whitelisting can solve the issue, and for attribution of images we don't need a working link, just leave off the http://. --Dirk Beetstra T C (en: U, T) 18:03, 7 January 2018 (UTC)
@Donald Trung: That website is not a reliable source. That we see you here so often with complaints about blacklisting should set off your internal alarms about what you are missing, not that we have got it wrong.  — billinghurst sDrewth 06:30, 8 January 2018 (UTC)

(above revived from archive to continue discussion --Dirk Beetstra T C (en: U, T) 12:40, 11 January 2018 (UTC))


Apparently this was archived while History-of-China.com and other discussions weren't, but here's my response.

@Beetstra: I don’t think that you understand how either copyright or sourcing on Wikimedia projects works. Scans of banknotes cannot be considered to be “original works” and the owner of the website should have no right to blacklist his/her website on that alone. Banknotes are 2D objects, and if their original copyright has expired then a scan can't be considered to be copyrighted (photographs from certain angles can, but this website only hosts scans). Further, if you upload an image to Wikimedia Commons without adding a proper source it can't be verified to have actually come from there and will get deleted. Even if the owner wants their website to be blacklisted that isn't a justification to withhold proper attribution from the information about banknotes hosted on the website. Since when does “courtesy blacklisting” exist? This website is useful for most Wikimedia projects and just because the owner doesn't like it when we share their knowledge for free doesn't mean that they have the right to prevent anyone from using their website as a source. --Donald Trung (Talk 🤳🏻) (My global lock 🔒) (My global unlock 🔓) 11:00, 11 January 2018 (UTC)

'I don't think that you understand how either copyright or sourcing on Wikimedia projects works.' .. I don't know what that assumption has to do with my remark. My remark was, and still is, 'Note that this was spammed by a large number of IPs on several wikis. If no alternatives exist, then whitelisting can solve the issue, and for attribution of images we don't need a working link, just leave off the http://.' Moreover the owner, as mentioned, was in the past quite restrictive - whether that is rightful or not, it is his right; if he doesn't want it then fine, we'll get it somewhere else.
Continuing, IF the copyright on the original banknotes is expired, then there is no problem to get your own scan or get it anywhere.
In short, this was blacklisted because it was spammed; a link is not needed for attribution in the file namespace - in fact, live links are never needed - and given the remark of the owner I don't know whether these images are PD. And for those that are, whitelisting will still solve it.   Declined. --Dirk Beetstra T C (en: U, T) 12:49, 11 January 2018 (UTC)

iln.co.uk



Per https://en.wikipedia.org/w/index.php?title=User_talk:Iridescent&oldid=821329639#Collecting_references_to_news_pictures_for_articles. I blacklisted it in 2011 but I can't remember the details. Apparently this is not appropriate now. —MarcoAurelio (talk) 20:27, 19 January 2018 (UTC)

@MarcoAurelio, Carcharoth, Iridescent, and Jo-Jo Eumerus:   Removed from Spam blacklist. Dirk Beetstra T C (en: U, T) 07:35, 21 January 2018 (UTC)

Troubleshooting and problems

  This section is for archiving Troubleshooting and problems.

Discussion

  This section is for archiving Discussions.

fiverr



Hi all. There is an article on ar.wiki about the website, and because the site is listed in the spam blacklist it is not possible to edit the article now, which hinders improving its content. Any suggestions to solve this issue?--مصعب (talk) 23:00, 27 January 2018 (UTC)

@Billinghurst: Hi. Can you help?--مصعب (talk) 00:29, 28 January 2018 (UTC)
@مصعب: add the exact problem URL, or a suitable regex of it, to w:ar:Mediawiki:spam-whitelist, either temporarily (whilst it is saved and embedded into the article) or permanently (if needed elsewhere). Note that MediaWiki is meant to recognise a previously saved URL and not prevent saving of the article, so it will have been a significant change to have that hit the blacklist again.  — billinghurst sDrewth 04:17, 28 January 2018 (UTC)
This section was archived on a request by:  — billinghurst sDrewth 23:04, 13 February 2018 (UTC)
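
The advice above rests on two behaviours: a local whitelist entry overrides a blacklist match, and a URL already saved in the page is not meant to block later edits. A minimal Python sketch of both follows. It is an assumption-laden illustration, not MediaWiki's code: the helper names are invented, and the pattern `fiverr\.com` is assumed for the example since the discussion names the site only as "fiverr".

```python
import re


def newly_added_links(old_links, new_links):
    """Links present in the new revision but not in the saved one."""
    return set(new_links) - set(old_links)


def blocked_links(old_links, new_links, blacklist, whitelist):
    """Return the added links that hit the blacklist and are not whitelisted."""
    hits = []
    for url in sorted(newly_added_links(old_links, new_links)):
        if any(re.search(p, url, re.IGNORECASE) for p in whitelist):
            continue  # a local whitelist entry overrides the blacklist
        if any(re.search(p, url, re.IGNORECASE) for p in blacklist):
            hits.append(url)
    return hits
```

Under these assumptions, an edit that leaves an existing blacklisted link untouched produces no hits, while adding a new variant of the URL does unless a whitelist pattern covers it.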