
Proposed merger of SpamBlacklist and AbuseFilter

Please comment at phab:T337431. Suffusion of Yellow ( talk) 19:19, 2 June 2023 (UTC) reply

gkrocket.in

gkrocket.in: Repeated additions despite several warnings. Please see Special:Contributions/2409:40F2:103F:6F11:8000:0:0:0. Maliner ( talk) 14:24, 4 June 2023 (UTC) reply

Maliner, is there any other IP address or account that has spammed this link before? ~ ToBeFree ( talk) 14:43, 4 June 2023 (UTC) reply
@ ToBeFree: Not sure; the actual reason to report this link here was to prevent repeated spamming by that IP range despite several warnings. Please see the diffs and the talk page of the spammer. Maliner ( talk) 14:57, 4 June 2023 (UTC) reply
They're blocked. Please re-report if different IP addresses start adding the same link. ~ ToBeFree ( talk) 15:05, 4 June 2023 (UTC) reply
@ ToBeFree got it, thanks. Maliner ( talk) 15:06, 4 June 2023 (UTC) reply

avetruthbooks.com

Large-scale addition of a source that promotes Armenian genocide denial and hosts copyright-infringing materials. See also this ANI report. – LaundryPizza03 ( d ) 02:56, 4 June 2023 (UTC) reply

LaundryPizza03, have any other users ever linked to that website, indicating that a block (or the proposed and seemingly passing ban) is insufficient to deal with the issue? ~ ToBeFree ( talk) 14:48, 4 June 2023 (UTC) reply
I do not know, but WP:COPYLINK is clearly established here. – LaundryPizza03 ( d ) 14:52, 4 June 2023 (UTC) reply
Possibly; that policy section describes a user conduct issue. The word "blacklist" or "blocklist" doesn't appear once in that policy though. Links are usually not blacklisted if blocking (or here, banning) a user is sufficient to deal with the issue. ~ ToBeFree ( talk) 15:10, 4 June 2023 (UTC) reply

gulfnp.com, ceoww.com

Recent spam campaign from UK Vodafone ranges. plus Added to MediaWiki:Spam-blacklist. OhNoitsJamie Talk 20:18, 7 June 2023 (UTC) reply

digitaljunkies.com.au

Fairly constant spam over the past couple of weeks. The link surprisingly belongs to a digital marketing agency... Apparition11 Complaints/ Mistakes 21:45, 9 June 2023 (UTC) reply

@ Apparition11: plus Added to MediaWiki:Spam-blacklist. -- OhNoitsJamie Talk 22:16, 9 June 2023 (UTC) reply

lockmanage.com

Persistent spamming from IP editors, not responsive to talk page warnings. - MrOllie ( talk) 12:34, 10 June 2023 (UTC) reply

@ MrOllie: plus Added to MediaWiki:Spam-blacklist. -- OhNoitsJamie Talk 15:22, 12 June 2023 (UTC) reply

Northerntransmissions.com

This site was block-listed a decade ago due to a link-spamming issue. As near as I can tell, that is no longer a problem, and the discussion of Northern Transmissions as a reliable source on WT:ALBUM considers it a good source to use for album reviews. The site does not contain malware or any other issues that would make it dangerous for viewers, so I'm not seeing a reason to keep it block-listed. See Wikipedia:Reliable_sources/Noticeboard/Archive_390#Northern_Transmissions and Wikipedia_talk:WikiProject_Albums/Archive_67#Proposed_reliable_sources_for_Wikipedia:WikiProject_Albums/Sources. ― Justin (koavf)TCM 14:22, 13 June 2023 (UTC) reply

Manning and Manning

This website is still blacklisted, as I found when I tried again today to add a reference. Should I wait some time, or what can I do? Thanks for any help. Richard Nowell ( talk) 09:35, 15 June 2023 (UTC) reply

Aircheckdownloads.com

This website features audio clips of UK and Irish radio stations and is a good source for references of station openings and for when presenters were at specific stations. I am puzzled as to why such a useful resource is blacklisted and would request its removal from the Wikipedia blacklist. Rillington ( talk) 01:38, 18 June 2023 (UTC) reply

Per this entry, it was blacklisted due to WP:COPYRIGHT concerns. OhNoitsJamie Talk 02:29, 18 June 2023 (UTC) reply


Filmcompanion.in

This RSN discussion may be of interest to editors active on this board. Abecedare ( talk) 18:48, 22 June 2023 (UTC) reply

chemicalbook.com

chemicalbook.com: This is very widely used in articles dealing with elements, chemical compounds and chemistry in general; a rough count found c. 200 articles using it as a reference. The site is a commercial marketplace for chemicals and reagents but shows no evidence of any editorial oversight. This appears to me to be the very definition of link spam. In addition, the site appears to violate copyright by constructing its content from other sources: for example, at Photographic film, this reference is a direct copyvio from this 1977 paper by Meredith and from a 1968 book by Meredith here. Any source that itself contains copyright violations cannot, in my view, be an acceptable source for Wikipedia. In all cases that I have searched there are readily available RSs for the information referenced, with nearly all available at PubChem, much of whose content has simply been "borrowed" by chemicalbook.com. There are c. 193 articles citing this as a reliable source. I have raised this at WT:RSN.   Velella   Velella Talk   13:13, 23 June 2023 (UTC) reply

iasscore.in

Gsscore07 ( talk · contribs)
Seobyshivangi ( talk · contribs)
IASSCORE ( talk · contribs)

Three strikes. plus Added to MediaWiki:Spam-blacklist. OhNoitsJamie Talk 16:20, 23 June 2023 (UTC) reply

AssamJobz.com

This website serves the people of Assam (a state in India) with job news and educational news. It is one of the popular websites of the state, with 3 lakh (300,000) visitors every month. The site has a good reputation; I strongly believe this website is on the spam list by mistake, or a competitor has done something wrong. I request that Wikipedia admins please review my website and reconsider it. — Preceding unsigned comment added by 223.176.13.60 ( talk) 16:17, 25 June 2023 (UTC) reply

no Declined Please see the very large, bold text at the beginning of this section: "Requests from site owners or anyone with a conflict of interest will be declined." The blacklisting of this site was not a mistake. Sam Kuru (talk) 16:25, 25 June 2023 (UTC) reply

dummy-tickets.com

Posting to use the “make my life easy” script. Courcelles ( talk) 05:08, 26 June 2023 (UTC) reply

@ Courcelles: plus Added to MediaWiki:Spam-blacklist. -- Courcelles ( talk) 05:09, 26 June 2023 (UTC) reply

hugedomains.com

This is a domain squatter, which usurps domains and breaks dead links. There are currently 602 articles containing links to this domain, which are added by well-meaning bots and scripts because they can't tell whether a link actually works or just redirects to a useless garbage page. I don't think editors intentionally spam links to hugedomains, but maybe if they can be technically prevented from adding links to it, fewer of our links will get broken before people have a chance to find an archive or an alternative source or somehow deal healthily with the link rot. I've never made a request here before, but this feels nonstandard, so I'm placing this edit here instead of the above section, in the hopes that I can be educated if this is actually a terrible idea. Folly Mox ( talk) 09:42, 18 June 2023 (UTC) reply

I think the first course of action would be to fix the scripts that are adding the domain. Blacklisting it now means that suddenly 602 articles can't be edited until someone manually removes the link; it would also result in numerous scripts polluting the spam blacklist log with fruitless attempts to re-add it. Until then, I think blacklisting is premature. OhNoitsJamie Talk 17:56, 18 June 2023 (UTC) reply
OhNoitsJamie - note that GreenC has a bot solution to obviate the concern about removal of a blacklisted source causing problems with article editing, described here for the Healthline issue. Quote from GreenC: the correction bot "eliminates the entire reference between ref tags including the ref tags, links in external links, etc.. everything related to this source including named refs like <ref name="example" /> disappears. The text the cite sources would stay in place." Zefr ( talk) 18:22, 18 June 2023 (UTC) reply
As I feared, these are getting added by reFill ( Example). The community is unwilling to block it, and the programmers responsible for it mostly don't fix bug reports, for lack of skills or time. This standoff has been ongoing for 5 years or more. So I am hesitant to do work that will later get undone by reFill.
This is a usurpation situation. The citations stay in place; we only need to remove the offending URLs and convert the url-status to usurped. It won't stop reFill, though, as it will convert the source URL back to hugedomains.com, since that is where the 301 redirect goes. At best we might try to outfox it by usurping the citations, then quickly adding a spam blacklist entry to block reFill. I have the tool to do the usurpations. Ping me with a plan of action if you want to proceed. -- Green C 18:51, 18 June 2023 (UTC) reply
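For illustration, a minimal sketch (in Python, not GreenC's actual tooling) of the url-status conversion described here; the domain and the naive template matching are assumptions of the sketch, and GreenC notes further down that this approach ultimately lost too much URL information to be used:
    import re

    SQUATTED = "hugedomains.com"  # target domain for this sketch

    def mark_usurped(wikitext: str) -> str:
        """Set |url-status=usurped in any {{cite ...}} template that mentions the squatted domain."""
        def fix(match):
            cite = match.group(0)
            if SQUATTED not in cite:
                return cite
            if re.search(r"\|\s*url-status\s*=", cite):
                return re.sub(r"(\|\s*url-status\s*=\s*)[^|}]*", r"\1usurped", cite)
            return cite[:-2] + " |url-status=usurped}}"
        # naive: assumes the citation template contains no nested templates
        return re.sub(r"\{\{\s*[Cc]ite[^{}]*\}\}", fix, wikitext)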
If refill is only run manually, I'm not bothered with your proposed usurp-then-blacklist scenario; maybe it will encourage someone to fix it. OhNoitsJamie Talk 21:53, 18 June 2023 (UTC) reply
Yes, these usurped domains are being added by multiple scripts, all reliant on Citoid. Neither Citoid nor any of the scripts that use it seems to do any error checking on its output; they frequently parse usurped domains and 404 pages, returning them as well-formatted but worthless citations. There's a big problem with ownership and competing priorities all up and down the stack. I don't see too much of a problem with 602 pages being uneditable until the offending citation is deleted in favour of a {{ cn}} tag, an archived version is found, or the link rot is otherwise addressed in a constructive way.
I don't think there are any bots that perform this kind of task, despite my earlier inclusion of them. I was thinking of Citation bot, but I'm not actually sure of its behaviour, and I think it's only run manually as well. Folly Mox ( talk) 14:51, 19 June 2023 (UTC) reply
It's 411 pages in terms of URLs; the rest is probably metadata. My bot can delete citations, but it requires oversight of every edit due to the error rate. It's doable with 411 pages. It turns out that usurping the cites as I mentioned above (e.g. converting url-status to usurped) will not work because of the loss of URL information. The cites need to be entirely deleted in most cases. Working on it now. It will be a hairy job with manual editing involved.
Detecting 404s and soft-404s is hard and error-prone. If Citoid has trouble, that is understandable; all bots have the same issues. The question is whether they check at all, or just pass results through and expect the end-user application to check, because that's a recipe for no one taking responsibility. -- Green C 15:31, 19 June 2023 (UTC) reply
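A rough illustration of why this is only ever a heuristic: the sketch below (not what Citoid or IABot actually does) follows redirects and treats a link that lands on another host, or on a known parking host, as usurped; soft-404s served from the original host would still slip through. The parking-host list is invented for the example.
    import requests
    from urllib.parse import urlparse

    PARKING_HOSTS = {"hugedomains.com", "www.hugedomains.com"}  # hypothetical list for this sketch

    def link_status(url: str) -> str:
        """Crude classifier: 'dead', 'usurped', or 'ok'."""
        try:
            resp = requests.get(url, timeout=15, allow_redirects=True,
                                headers={"User-Agent": "link-check-sketch/0.1"})
        except requests.RequestException:
            return "dead"
        if resp.status_code >= 400:
            return "dead"
        final_host = urlparse(resp.url).netloc.lower()
        if final_host in PARKING_HOSTS or final_host != urlparse(url).netloc.lower():
            return "usurped"  # redirected off the original host, e.g. a 301 to a domain squatter
        return "ok"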
I've found while cleaning up after ReferenceExpander (at this project) that the best solution is to find the diff where the usurped URL replaced the old dead one (which can be guessed from the page history, since the scripts call themselves out by name), and then to find an archive of the dead URL, find another source for the claim, or delete it entirely and tag {{ cn}}. Unfortunately this takes about ten minutes each time.
I haven't looked at ReFill's code, or Citoid's, so I don't know if they do any error checking at all, but I've looked at ReferenceExpander and it just pastes in Citoid's output verbatim. I feel like the situation we have at present is that the script devs trust the users to double-check their scripts' output, and the users trust the scripts to function accurately in most cases. I have seen cases where an editor will decline an algorithmically generated citation or self-revert in the next edit, but there's an excess of trusting others and a deficit of due diligence. Folly Mox ( talk) 16:12, 19 June 2023 (UTC) reply
Yeah, well, manual solutions don't scale well; in this case it's 68+ hours according to your estimate, it won't get done by drive-by editors, and we won't be able to add blacklists etc. The best solution is to nuke these refs quickly and get the blacklist added to stop any more from being added; then if someone wants to do that manual work, go ahead, as it doesn't require hugedomains.com to be live in the article. -- Green C 17:14, 19 June 2023 (UTC) reply
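A minimal sketch of what "nuking" such refs could look like at the wikitext level, assuming the simple case where the reference body contains the domain; this is not GreenC's bot, which as noted above requires oversight of every edit:
    import re

    DOMAIN = "hugedomains.com"  # the squatted domain targeted in this sketch

    def nuke_refs(wikitext: str) -> str:
        """Remove <ref>...</ref> blocks citing DOMAIN, then strip now-orphaned reuses of their names."""
        names = []
        def drop(match):
            body = match.group(0)
            if DOMAIN not in body:
                return body
            name = re.search(r'<ref[^>]*\bname\s*=\s*"?([^">]+)"?', body)
            if name:
                names.append(name.group(1).strip())
            return ""
        wikitext = re.sub(r"<ref[^>/]*>.*?</ref>", drop, wikitext, flags=re.S | re.I)
        # second pass: remove self-closing reuses like <ref name="example" />
        for n in names:
            wikitext = re.sub(r'<ref\s+name\s*=\s*"?' + re.escape(n) + r'"?\s*/>', "", wikitext, flags=re.I)
        return wikitext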
I agree. Feel free to nuke the refs and we'll deal with it post facto. Any way you could make a permanent list of affected pages for people to go over later? Folly Mox ( talk) 19:51, 19 June 2023 (UTC) reply
User:Ohnoitsjamie: hugedomains.com is removed from EnWiki: [1] It exists in the IABot cache and elsewhere and will probably get re-posted soon without a block. It might even with the SBL, if the new blacklist bot-override right is working; we'll see. -- Green C 20:23, 20 June 2023 (UTC) reply
plus Added OhNoitsJamie Talk 21:39, 26 June 2023 (UTC) reply
Couldn't one method be to look at the URL's registration date and, if it is newer than the cite's date, 'comment out' the URL until it can be edited to point to an internet archive website with an old snapshot? CaribDigita ( talk) 03:40, 27 June 2023 (UTC) reply
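A sketch of that idea, assuming the third-party python-whois package and a known access date to compare against; WHOIS data is patchy, so at best this would be a heuristic for flagging candidates rather than an automatic edit:
    from datetime import datetime
    import whois  # third-party "python-whois" package, assumed available for this sketch

    def registered_after(domain: str, cite_access_date: datetime) -> bool:
        """Heuristic: True if the domain's current registration began after the citation's
        access date, suggesting the original site lapsed and the domain was usurped."""
        record = whois.whois(domain)
        created = record.creation_date
        if isinstance(created, list):       # some registries return several dates
            created = min(created)
        if not isinstance(created, datetime):
            return False                     # no usable WHOIS data; don't flag anything
        return created > cite_access_date

    # e.g. registered_after("example.com", datetime(2015, 3, 1)) -> candidate for commenting out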

Migration to MediaWiki:BlockedExternalDomains.json

Hi, please replace the content of this page with User:Ladsgroup/MediaWiki:Spam-blacklist and the content of MediaWiki:BlockedExternalDomains.json with User:Ladsgroup/MediaWiki:BlockedExternalDomains.json. That would make editing faster for everyone and the lists generally easier to maintain. I explained more in Special:Diff/1160884499. Ladsgroup overleg 20:35, 26 June 2023 (UTC) reply

Is there a version of whitelist support for this yet? I would think migration would be premature until that arrives. MrOllie ( talk) 21:33, 26 June 2023 (UTC) reply
@ MrOllie Mostly it's that someone needs to do it (I'm doing this in my volunteer capacity), but also most cases don't need a whitelist entry (e.g. the .xyz ones are only needed if we block .xyz, and we can simply not do that right now). So we can start by migrating the big bulk of them and then see what's left. Ladsgroup overleg 08:30, 27 June 2023 (UTC) reply
You are requesting that domains that currently have whitelisted URLs be moved to your new system. That should not be done until the new system has whitelist support. Even those that don't need whitelist support now might need it in the future - and moving domains back and forth from one system to the other is not ideal. MrOllie ( talk) 11:11, 27 June 2023 (UTC) reply
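For background on the "faster" claim: Spam-blacklist entries are regexes that each have to be tested against added links, whereas BlockedExternalDomains.json is, as I understand it, a plain list of domain entries that can be matched with a single host lookup. A rough sketch of the difference, with invented example entries and an assumed JSON shape:
    import json
    import re
    from urllib.parse import urlparse

    # invented example data for this sketch
    SPAM_BLACKLIST_REGEXES = [re.compile(r"\bexample-spam\.com\b"), re.compile(r"\bbadsite\d+\.net\b")]
    BLOCKED_DOMAINS = {entry["domain"] for entry in json.loads(
        '[{"domain": "example-spam.com", "notes": "spam campaign"}]'
    )}  # assumed shape of BlockedExternalDomains.json: a list of {"domain": ..., "notes": ...} objects

    def blocked_by_regex_list(url: str) -> bool:
        """Old model: every regex in the blacklist is run against the added link."""
        return any(rx.search(url) for rx in SPAM_BLACKLIST_REGEXES)

    def blocked_by_domain_list(url: str) -> bool:
        """New model: one host check, also catching subdomains of a blocked domain."""
        host = urlparse(url).netloc.lower()
        return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)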

healthofchildren.com

Mainly because it is based upon the Gale Encyclopedia. I understand they use advertising, but this is not in itself a reason to blacklist serious encyclopedias. tgeorgescu ( talk) 22:16, 27 June 2023 (UTC) reply

 Defer to Whitelist. Advameg sites were blacklisted for good reason. OhNoitsJamie Talk 22:22, 27 June 2023 (UTC) reply

happymod.co.in, etc.

Spam links to pirated software/APK sites. plus Added to MediaWiki:Spam-blacklist. OhNoitsJamie Talk 13:44, 30 June 2023 (UTC) reply

paybis.com

Turned up for the first time on the user page of a 'new' user [2]; it can never have any encyclopaedic value and is likely a scam. — Trey Maturin 16:54, 27 June 2023 (UTC) reply

plus Added to MediaWiki:Spam-blacklist. -- OhNoitsJamie Talk 13:45, 30 June 2023 (UTC) reply