From Wikipedia, the free encyclopedia

This page is for requesting modifications to URLs, such as marking dead or changing to a new domain. Some bots are designed to fix link rot; they can be notified here. These include InternetArchiveBot and WaybackMedic. This page can be monitored by bot operators from other language wikis since URL changes are universally applicable.

Big Cartoon DataBase

Per Wikipedia:Templates for discussion/Log/2024 January 16#Big Cartoon DataBase, Template:Bcdb and Template:BCDB title are being deleted. However, there are many other non-templated links to that website that aren't working (see for example the second reference at Tod Carter or the external link at Knight-mare Hare). Reporting here as I don't think anything is currently being done with these (archived, marked as dead, or removed). Gonnym (talk) 14:04, 23 January 2024 (UTC)

Gonnym, I see about 1,000 instances of the templates, and another 1,400 links. The site has been "excluded from the Wayback Machine". But, the first one I checked is available at archive.today. There are a number of options:
  • Convert the 1,000 templates to normal square links, then convert those plus the 1,400 to archive.today, where available, or add a {{dead link}} if not. That way if the site is ever un-excluded from the Wayback in future those archives could get added.
  • Nuclear option: completely eliminate all citations and links to this site.
  • Some other combo, like nuking the 1,000 but trying to save the 1,400, and if any of those don't archive, then nuke those, etc.
Both options are a bit of work. Nuking is not clean: it's semi-automated, and each edit has to be visually verified to confirm it didn't mangle things, but I have done it before and the quantity isn't too high. The conversion and archiving is more automated. My suggestion: if you think the site is completely unreliable and should be eliminated even when it has archives, take the nuclear option; otherwise, take the first option. -- Green C 14:40, 23 January 2024 (UTC)
I have no real opinion here as I hadn't participated in that discussion, but I'll ping others here who did. @Snowmanonahoe @TechnoSquirrel69 @WikiPediaAid. Gonnym (talk) 14:47, 23 January 2024 (UTC)
The site is a wiki... I'm impressed it managed to amass 1,400 citations. I say nuke it, because again, it's a wiki. Snowmanonahoe (talk · contribs · typos) 15:57, 23 January 2024 (UTC)
Thanks for the ping, Gonnym! The links being generated by the template are already being removed by a bot since the TfD closed as delete, so we don't need to worry about those. I would rather not indiscriminately delete the other links in citations, just add the archive URL along with a |url-status=dead if applicable. TechnoSquirrel69 (sigh) 15:00, 23 January 2024 (UTC)
Sounds like that bot is not only eliminating the template, but also the entire citation to BCD. That sounds like a limitation of the bot: it can only delete templates, without the option to convert them to square links. That's unfortunate because TfD should concern removing templates, not removing citations, which is more the domain of WP:RSN. This is a common scenario with a mix of templates and links, and we end up with this inconsistency: some cites are completely deleted because of the template, others are kept because they are square links. It's random. Anyway, this is not directly related to BCD, just an observation. I can try to archive what is left, no problem. -- Green C 15:25, 23 January 2024 (UTC)
I don't think the bot is removing citations, just the links generated by the {{bcdb}} template. All of the cite links should still be around. TechnoSquirrel69 (sigh) 15:45, 23 January 2024 (UTC)
For now, I'll retain the citations and treat the links as dead. There is no clear consensus to nuke cites entirely. -- Green C 01:47, 24 January 2024 (UTC)
Thanks, GreenC! TechnoSquirrel69 (sigh) 23:06, 24 January 2024 (UTC)
I made the following edits:
  • Remove pre-existing Wayback links since they don't work
  • Add archive.today links when available (1,025)
  • Add {{dead link}} for the rest (697)
  • Update iabot.org so changes can propagate to 300+ other language wikis
If the restriction on Wayback is lifted in the future, the bots should be able to convert the dead links. -- Green C 02:59, 25 January 2024 (UTC)
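The edits listed above amount to a small per-citation decision. A minimal sketch in Python, assuming a hypothetical function name and hypothetical action strings (this is not WaybackMedic's actual code, only an illustration of the rule "drop the Wayback link, prefer archive.today, else tag as dead"):

```python
from typing import List, Optional


def plan_citation_edit(existing_archive_url: Optional[str],
                       archive_today_url: Optional[str]) -> List[str]:
    """Return the list of edits to apply to one citation (hypothetical action names)."""
    actions = []
    # Existing Wayback links no longer work for an excluded site, so drop them.
    if existing_archive_url and "web.archive.org" in existing_archive_url:
        actions.append("remove wayback archive")
    if archive_today_url:
        # An archive.today snapshot exists: use it as the archive URL.
        actions.append("add archive-url=" + archive_today_url)
    else:
        # No snapshot anywhere: tag the link so a later bot run can retry.
        actions.append("add {{dead link}}")
    return actions
```

Tagging with {{dead link}} rather than deleting keeps the citation available for a later conversion if the Wayback exclusion is ever lifted.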

Gemini, Apollo, Shuttle Mission "Chronology of Wake-up Calls"

This PDF (https://history.nasa.gov/wakeup%20calls.pdf) is used as a secondary source across a large number of articles for the Gemini, Apollo, and especially Space Shuttle missions. It recently got 404'd, but a very recent archived link is available here (https://web.archive.org/web/20231220093919/https://history.nasa.gov/wakeup%20calls.pdf). It would be great if y'all can add this archive link to the queue. SpacePod9 (talk) 00:54, 24 January 2024 (UTC)

I submitted an IABot job to process the 56 pages where it's located. -- Green C 01:51, 24 January 2024 (UTC)
Thanks for the help! SpacePod9 (talk) 03:43, 24 January 2024 (UTC)

Canoe.ca

It appears that canoe.ca was once a news website that is referenced in quite a few articles, but it has since been usurped by another gambling website. Unfortunately, the new owners have also blocked the Wayback Machine, and only some of the pages I've seen are in archive.today. However, some of the links appear to be salvageable by changing "canoe.ca" to "canoe.com" and then looking them up in the Wayback Machine. Is this something that the bots can help with? Thanks! Jay8g [VTE] 23:32, 27 January 2024 (UTC)

That was probably a little confusing. There are basically three ways that existing canoe.ca links can be archived:
  • Archive.today might have a direct archive of the canoe.ca URL
  • The Wayback Machine might have an archive of the same page with "canoe.ca" replaced with "canoe.com"
  • Archive.today might have an archive of the same page with "canoe.ca" replaced with "canoe.com"
As far as I can tell, the canoe.ca and canoe.com pages were completely identical, but all of the links I've checked seem to be dead on both domains. Unfortunately, there are over 10,000 of these links according to Special:LinkSearch, which is too many for me to deal with manually. There are also quite a few dead links to canoe.com itself, but at least those aren't usurped and can be found in the Wayback Machine normally. Jay8g [VTE] 23:45, 27 January 2024 (UTC)
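The three lookups described above can be sketched as a list of (archive, URL) pairs to try in order. This is a minimal Python sketch with hypothetical function names. It only builds the candidates and does not query any archive, and it assumes the host carries no port number:

```python
from typing import List, Tuple
from urllib.parse import urlsplit, urlunsplit


def swap_canoe_domain(url: str) -> str:
    """Rewrite a canoe.ca URL to the same path on canoe.com (other hosts are left alone)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "canoe.ca" or host.endswith(".canoe.ca"):
        host = host[:-len("ca")] + "com"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def archive_candidates(url: str) -> List[Tuple[str, str]]:
    """List the lookups from the thread, in the order they were described."""
    com_url = swap_canoe_domain(url)
    return [
        ("archive.today", url),      # direct archive of the canoe.ca URL
        ("wayback", com_url),        # Wayback Machine copy under canoe.com
        ("archive.today", com_url),  # archive.today copy under canoe.com
    ]
```

For example, `swap_canoe_domain("http://www.canoe.ca/SlamWrestlingArchive/feb24_rocky.html")` gives the canoe.com form of the Dwayne Johnson reference mentioned later in this thread.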
Notes for canoe.ca (i.e. canoe.com):
Proposal for canoe.ca in five runs of WaybackMedic:
  1. Pass 1a (canoe1): Remove all Wayback links. Done: removed 391 archives.
  2. Pass 1b (canoe3 & canoe4): Remove all WebCite links (SSL errors and unstable). Done: removed 329 archives.
  3. Pass 2 (canoe2): Attempt conversion to archive.today; otherwise add {{dead link}}. Done: added 8,353 archive.today links and 633 {{dead link}} (total including existing), and changed 578 |url-status=live to dead.
  4. Pass 3a (canoe5): For canoe.ca links with a {{dead link}}, check the API for whether a Wayback link would exist if the URL were converted to canoe.com. If so, change the source link to canoe.com, set it to live status, and remove the {{dead link}}. Done: 157 URLs converted to canoe.com.
  5. Pass 3b (canoe6): Check the canoe.com links from Pass 3a for link rot; if found, convert to Wayback or archive.today links. Done: 294 Wayback URLs added to canoe.com URLs in the same set of articles processed during Pass 3a (the excess is due to pre-existing canoe.com links that were dead).
  6. Pass 3c (canoe7): Make a list of citations with {{dead link}}. Done: 406 cites listed at Wikipedia:Link rot/cases/canoe.ca.
  7. Pass 4 (judi14a and judi14b): Convert canoe.ca to a usurped citation per the steps at WP:USURPURL. This will include completely deleting citations that have no archive URL. Done: edited approximately 6,000 pages.
Proposal for canoe.com:
  1. Pass 5 (canoecom): Check for dead links and soft-404s as normal. Done: edited 1,132 articles out of 1,953 checked, added 1,820 archive URLs, and changed 371 |url-status=live to dead.
----
User:Jay8g, see the proposal above. Each pass of the bot has different settings enabled. When done in this order, it should work. The "Pass 3" might result in a lot of deleted citations; I'll let you know before running that one. This will require at least 4 runs of the bot of 6k pages each, plus some manual steps, so it will take a while. -- Green C 01:39, 28 January 2024 (UTC)
That all sounds good to me! Thanks! Jay8g [VTE] 04:01, 28 January 2024 (UTC)
I just thought of one issue with pass 4: Because canoe.ca was a news aggregator, some of the citations that currently link to it can be found on other, unrelated websites. For example, the reference in Dwayne Johnson (the first link that comes up for me in the 6,148-page search) points to http://www.canoe.ca/SlamWrestlingArchive/feb24_rocky.html on canoe.ca, but the same article can be found at https://slamwrestling.net/index.php/1998/02/24/a-piece-of-the-rock/ on Slam Wrestling's own website. That exact article is also available using the Wayback Machine with canoe.com, but if it was not available there, replacing it with the slamwrestling.net URL would be better than deleting it. Of course, there's no way to do that without manual work, and anything that's just a bare URL is gone for good.
I will be interested to see how many canoe.ca links are left after steps 1-4, to see whether it makes sense to remove those links entirely or try to find the same articles posted elsewhere first. I'm not sure if this is a situation that has come up before with usurped URLs like this or what the standard practice is. Jay8g [VTE] 04:18, 28 January 2024 (UTC)
For the rocky example, there is no map to know where the canoe.ca link should go. And since canoe.ca is now a usurped vice site, we are supposed to hide it from view and, if no archive is available, delete it. Let's wait and see how many there are after Pass 3. One solution, rather than deleting the entire cite, is to convert it to {{citation}}, which doesn't require a URL, change the |work= to Slam Wrestling, and remove the canoe.ca URL. This kind of work is laborious because people use so many permutations of citation templates and argument combinations that nothing is consistent. There are also the square and bare links that don't use templates. -- Green C 16:54, 28 January 2024 (UTC)
Yes, there's no automatic way to fix that. I'm also not sure how many of the links would even be able to be manually fixed, since some might not be able to be easily found on other domains. I agree with waiting to see what is left after the bot tries to find archive links to see if it's worth me trying to fix the leftovers manually. Jay8g [VTE] 22:05, 28 January 2024 (UTC)
User:Jay8g: Here are the remaining 406 citations with {{dead link}}: Wikipedia:Link rot/cases/canoe.ca. There are over 11,000 in total on enwiki, so the archival success rate was about 96%, which is very good. Something still needs to be done with the 406. One option is to nuke the citation, which is the only choice for square links. Another is to convert to {{cite news}} and remove the |url=; this is normally done when the cite can be found offline, like microfiche of a newspaper. Of course, there is also manual work, where anything is possible. In the meantime, I'll start processing the rest of the canoe.com links; many appear inoperable. -- Green C 14:36, 30 January 2024 (UTC)
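The success rate quoted above can be checked with a quick calculation. This Python sketch takes 11,000 as the total, although the comment only says "over 11,000", so the result is approximate; the function name is hypothetical:

```python
def archival_success_rate(total_links: int, unresolved: int) -> float:
    """Percentage of links that did not end up tagged with {{dead link}}."""
    return 100.0 * (total_links - unresolved) / total_links


# 11,000 is an assumed round figure; 406 is the count of remaining dead links.
rate = archival_success_rate(11_000, 406)
print(f"{rate:.1f}%")
```

With a larger true total the percentage would be slightly higher, which is consistent with "about 96%".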
I spot-checked several of the remaining 406 dead links and was unable to find alternative links for any of them, so I think we should be good to remove the remaining links. Thanks for all your help on this -- I'm impressed by how many links were able to be fixed! Jay8g [VTE] 21:50, 30 January 2024 (UTC)
User:Jay8g sounds good. I'll be working on this over the next few days and will post when done. Thanks for bringing this to attention. I've been aware of Canoe, but didn't know it was usurped and excluded from Wayback; that's a new scenario (plus the canoe.com twist). It basically required every feature my bot has and then some, and I have never made so many passes. This was a good learning experience about what the bot can do and how. -- Green C 02:14, 31 January 2024 (UTC)
As noted above, this is all done finally. -- Green C 02:34, 5 February 2024 (UTC)
Most of the content on canoe.ca was from the Sun Media newspapers, so many of these articles can probably be found in Canadian newspaper archives (Web archives like https://web.archive.org/web/*;type=text/torontosun.com/* or newspaper archives like NewspaperARCHIVE.com). It looks like the URLs with "-cp" were Canadian Press stories, and a bunch of them list The Canadian Press as the author, publisher, agency, etc., and the URLs with "-ap" were Associated Press stories. Articles from those agencies should be available in a variety of places. Finding them is the challenge.
The wrestling articles could probably all be found on Slam Wrestling if someone is willing to do the work. I didn't see any equivalent partner sites for other sports or categories. -- Jahalive (talk) 02:22, 2 February 2024 (UTC)
I'm guessing you're not interested in customizing a bot to pull the news agency and date from the URLs of those CP and AP stories. -- Jahalive (talk) 00:38, 13 February 2024 (UTC)
User:Jahalive, your idea is a good one. I'm going to pass because there is more work than I have time for. I want to use the bot and my time where it has the most impact, fixing link rot; that's really the bot's specialty. Your idea could probably be done by other bot writers. Could try BOTREQ or AWBREQ. -- Green C 01:19, 13 February 2024 (UTC)
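For anyone taking the idea to BOTREQ, the agency part could start from something like the Python sketch below. It assumes the "-cp" or "-ap" marker sits at the end of the last path segment, just before an optional file extension. The thread does not say where in the URL the marker appears, so that placement, the function name, and the example URLs are all hypothetical. Date extraction is left out because the URL date format is not described here.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Marker-to-agency table taken from the observation in the thread.
_AGENCIES = {"cp": "The Canadian Press", "ap": "Associated Press"}

# Assumption: the marker ends the path, optionally followed by an extension.
_MARKER = re.compile(r"-(cp|ap)(?:\.[a-z0-9]+)?$", re.IGNORECASE)


def guess_agency(url: str) -> Optional[str]:
    """Return the wire agency suggested by the URL, or None if there is no marker."""
    path = urlsplit(url).path.rstrip("/")
    match = _MARKER.search(path)
    return _AGENCIES[match.group(1).lower()] if match else None
```

A result from this function would only be a hint; each citation would still need a check against an archived copy before its agency field is changed.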

Warren Abstract Machine citations

Some citations at Warren Abstract Machine are broken, including this one: http://wambook.sourceforge.net/ 185.151.251.58 (talk) 08:54, 31 January 2024 (UTC)

I ran IABot on the page, but it might take a few tries before the bot decides a link is dead. -- Green C 02:19, 2 February 2024 (UTC)
It was a soft-404. I set it to dead at iabot.org and reran the bot. -- Green C 03:50, 13 February 2024 (UTC)
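A soft-404 is a page that answers with a success status even though the requested content is gone, which is why a status check alone can miss it. The Python sketch below shows one common heuristic, assuming the caller has already fetched the page. The function name and phrase list are hypothetical, and this is not how IABot itself decides:

```python
from urllib.parse import urlsplit

# Hypothetical phrases that often appear on "missing page" placeholders.
NOT_FOUND_PHRASES = ("page not found", "no longer available")


def looks_like_soft_404(status: int, requested_url: str,
                        final_url: str, body: str) -> bool:
    """Guess whether a 200 response is really a missing page."""
    if status != 200:
        return False  # a real error status is a hard failure, not a soft-404
    requested_path = urlsplit(requested_url).path
    final_path = urlsplit(final_url).path
    # A deep link that silently lands on the site root is a typical sign.
    redirected_to_home = requested_path not in ("", "/") and final_path in ("", "/")
    text = body.lower()
    has_error_text = any(phrase in text for phrase in NOT_FOUND_PHRASES)
    return redirected_to_home or has_error_text
```

Heuristics like this produce false positives and negatives, which is one reason a manual override such as setting the link to dead at iabot.org exists.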

bibliotecadigital.ciren.cl

This Chilean digital library seems to have reformatted its URLs and is used in numerous articles as a source. Here's a list - it seems like they still host most if not all articles but under different URLs. Jo-Jo Eumerus (talk) 13:52, 31 January 2024 (UTC)

User:Jo-Jo_Eumerus, is there an example of old to new? Most likely, if it's not obvious how to change them, there is nothing we can do other than treat the old links as dead and add archives. -- Green C 02:15, 2 February 2024 (UTC)
It seems like they still share the titles: https://bibliotecadigital.ciren.cl/server/api/core/bitstreams/72bd0a55-5f0d-4ea6-98c4-116797dce09e/content becomes https://bibliotecadigital.ciren.cl/items/96666f36-9fc4-4833-8a95-0e85c6fd98ce Jo-Jo Eumerus (talk) 11:13, 3 February 2024 (UTC)
Jo-Jo Eumerus: It looks like https://bibliotecadigital.ciren.cl/server/api/core/bitstreams/72bd0a55-5f0d-4ea6-98c4-116797dce09e/content is working. Maybe they had time to repair it. But most of them are still not working. Without a map of old to new, I suggest only checking whether they are dead and, if so, adding an archive URL. For example, https://bibliotecadigital.ciren.cl/handle/123456789/7049 becomes https://web.archive.org/web/20160629061606/https://bibliotecadigital.ciren.cl/handle/123456789/7049. I think the new page would be https://bibliotecadigital.ciren.cl/items/96666f36-9fc4-4833-8a95-0e85c6fd98ce, but it looks different. -- Green C 00:27, 12 February 2024 (UTC)
Aye, same content but a slightly different looking platform. Jo-Jo Eumerus (talk) 12:22, 12 February 2024 (UTC)
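The archive URL in the example above follows the usual Wayback Machine pattern: a 14-digit UTC timestamp followed by the original URL. A minimal Python sketch of building one, with a hypothetical function name. It only formats the string and does not check that a snapshot exists at that time:

```python
from datetime import datetime, timezone


def wayback_url(original_url: str, snapshot: datetime) -> str:
    """Format a Wayback Machine URL for a timezone-aware snapshot time."""
    stamp = snapshot.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"https://web.archive.org/web/{stamp}/{original_url}"


# Reproduces the archive URL quoted in the thread.
print(wayback_url(
    "https://bibliotecadigital.ciren.cl/handle/123456789/7049",
    datetime(2016, 6, 29, 6, 16, 6, tzinfo=timezone.utc),
))
```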

Jo-Jo Eumerus: The bot ran on 25 pages. It added 10 archive URLs, and 9 {{dead link}}. The pages with {{dead link}}. -- Green C 04:01, 13 February 2024 (UTC)

cnnphilippines.com

CNN Philippines has ceased operations as of January 31, 2024. As of now, https://cnnphilippines.com feeds back a 503. We'll need IABot to comb through the roughly 2,200 pages (~3,000 links total) it's linked on and add archives to those citations. Relevant discussion at WT:TAMBAY#Archiving news articles of CNN Philippines. Chlod (say hi!) 17:17, 31 January 2024 (UTC)

Submitted to IABot. -- Green C 02:12, 2 February 2024 (UTC)
I don't know why, but IABot missed over 1,000 links, so I reran it with WaybackMedic and got the rest. -- Green C 02:36, 5 February 2024 (UTC)
Many thanks, @GreenC! Chlod (say hi!) 12:48, 5 February 2024 (UTC)

themessenger.com

themessenger.com has shut down [1]. We have around 186 uses per themessenger.com (HTTPS links, HTTP links). All of the news articles are now linking to a blank page (e.g. [2]). Hemiauchenia (talk) 19:46, 1 February 2024 (UTC)

Submitted to IABot. -- Green C 02:17, 2 February 2024 (UTC)
User:Hemiauchenia IABot processed this domain, but I had to run it a second time through WaybackMedic. The problem is that IABot is missing a lot for reasons I don't understand. Of the 184 articles that contain this domain, after IABot processed it, Medic edited an additional 101 pages, adding archive URLs, and converted 43 instances of |url-status=live to dead. -- Green C 15:29, 13 February 2024 (UTC)

Wst.tv

Hi, with a heavy heart, the World Snooker Tour has changed its website and changed how all of their links work, and has no real naming convention for most links from wst.tv.

For instance: https://wst.tv/players/jimmy-white/ is now at https://www.wst.tv/players/6100064a-0ea4-4a0c-b8ee-0e2ddaa3def4

News articles and other items have also moved. If there is a smart way for this to be fixed, let me know, but I'm assuming we'd need to archive/mark as dead for the remainder. Lee Vilenski (talk • contribs) 19:39, 2 February 2024 (UTC)

User:Lee Vilenski I don't see a way to migrate the links without redirect information. If some links have a redirect, the bot will pick it up automatically. Otherwise it will add an archive URL or {{dead link}}. Looks like 379 pages. -- Green C 05:57, 3 February 2024 (UTC)
All of the news articles have moved, for example from https://wst.tv/murphy-takes-season-opener/ to https://www.wst.tv/news/2023/july/21/murphy-takes-season-opener/
It's a mess, I certainly don't see a way to fix it. Lee Vilenski (talk • contribs) 09:04, 3 February 2024 (UTC)
It's surprisingly common for websites to migrate to a new platform and not leave redirects. If you want, contact them to ask if they plan to leave redirects and mention Wikipedia as an example. For now I can still add the archives, and if in the future they add redirects, the bot can undo the archives, make the links live again, and migrate to the new redirected URL. Either way it's basically flipping a switch in the bot. -- Green C 14:12, 3 February 2024 (UTC)
Regarding contacting WST: My experience is that they do not respond. It might be better to try to convince their software suppliers to provide redirects. It would appear that there are two companies involved. One is https://urbanzoo.io/ and the other is https://www.imgarena.com/. Alan (talk) 12:42, 4 February 2024 (UTC)
It looks like content was not migrated. For example, for the old-site page https://wst.tv/white-completes-epic-comeback/, a search at the new site for "White Completes Epic Comeback" in the news tab returns no result. Likewise with Google (https://www.google.com/search?client=firefox-b-1-lm&q=%22White+Completes+Epic+Comeback%22+site%3Awst.tv). It looks like a complete resetting of the site, and any matches found, like with the /players, could be happenstance. -- Green C 17:39, 4 February 2024 (UTC)

I was able to build a preliminary map of the player pages by loading https://www.wst.tv/players/ in a headless browser and reformatting the HTML into this table, making a best guess on the left column. If the bot encounters a URL in the left column, it will replace it with the right column. -- Green C 17:14, 4 February 2024 (UTC)
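The left-to-right replacement described above can be sketched as a dictionary lookup over "old -- new" lines in the same shape as the table in the extended content. This is a minimal Python sketch with hypothetical function names, not the bot's own code. It ignores a trailing slash on the old URL and leaves unknown URLs untouched:

```python
from typing import Dict


def parse_url_map(table_text: str) -> Dict[str, str]:
    """Parse lines of the form 'old_url -- new_url' into a lookup table."""
    mapping = {}
    for line in table_text.splitlines():
        old, sep, new = line.partition(" -- ")
        if sep:  # skip lines that do not contain the separator
            mapping[old.strip().rstrip("/")] = new.strip()
    return mapping


def migrate(url: str, mapping: Dict[str, str]) -> str:
    """Return the mapped URL if the old one is known, else the URL unchanged."""
    return mapping.get(url.rstrip("/"), url)


table = (
    "https://wst.tv/players/jimmy-white --  "
    "https://www.wst.tv/players/6100064a-0ea4-4a0c-b8ee-0e2ddaa3def4"
)
print(migrate("https://wst.tv/players/jimmy-white/", parse_url_map(table)))
```

Because the left column is a best guess, an exact-match lookup like this is safer than any pattern-based rewrite: a URL that is not in the table is simply left for the normal dead-link handling.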

I think it is much more complex than that. The old site had pages for many more players than are currently included in https://www.wst.tv/players which only has current players. Look at https://web.archive.org/web/20221126125804/https://wst.tv/player_category_taxonomy/other-players/. Most of these are gone completely, and many are referred to in our articles. Alan (talk) 10:12, 5 February 2024 (UTC)
...for instance: if you search in https://www.wst.tv/players for "Davis", you will only get Mark Davis. The old site included Steve Davis, Joe Davis and Fred Davis, who were significant players, apparently now forgotten by WST. Alan (talk) 10:27, 5 February 2024 (UTC)
OK, I was afraid of that; it didn't seem like many players. It does appear the old site and content were completely abandoned. The new site has some overlap, but that is happenstance, and a page can't be assumed to contain the same content even if a match can be made. They didn't do a site migration. In this case, for citation verification purposes, the correct action is to treat everything from the old site as a dead link and hope there are archives available. -- Green C 14:40, 5 February 2024 (UTC)
That's pretty much what we've been doing. If you look at the List of snooker players you'll see that all the references have working archives. Alan (talk) 15:14, 5 February 2024 (UTC)
Extended content
awk -ilibrary 'BEGIN{f=readfile("snook1.html"); for(i=1;i<=splitn(f,a,i);i++) {j++; if(j == 5) {j = 1; print "https://wst.tv/players/" tolower(fname) "-" tolower(lname) " --  https://www.wst.tv/" subs("href=\"/","",id) }; if(j == 1) {match(a[i], /href=["]\/players\/[^"]+[^"]/, d); id=d[0]}; if(j == 2) {fname=strip(a[i])}; if(j==4){lname=strip(a[i])}  }  }'

https://wst.tv/players/mark-allen --  https://www.wst.tv/players/c37aba27-5b12-4fae-8a8b-9e749c7a25f3
https://wst.tv/players/zhang-anda --  https://www.wst.tv/players/0512f55a-faea-48df-a8fc-895fbcaef511
https://wst.tv/players/muhammad-asif --  https://www.wst.tv/players/3f7a3e33-3889-4c3f-91e3-a6d876c8b999
https://wst.tv/players/john-astley --  https://www.wst.tv/players/49e85842-53d7-4fdb-b69b-4a0db92ff06d
https://wst.tv/players/stuart-bingham --  https://www.wst.tv/players/ac932300-dacb-4e91-803b-99a03fa20853
https://wst.tv/players/luca-brecel --  https://www.wst.tv/players/cd124662-9d97-413c-9609-5051d002ab3b
https://wst.tv/players/jordan-brown --  https://www.wst.tv/players/c49e98bc-101d-419a-81aa-ff2caedb1734
https://wst.tv/players/oliver-brown --  https://www.wst.tv/players/fe7732cc-435e-4ba8-84bf-25f771f0f376
https://wst.tv/players/alfie-burden --  https://www.wst.tv/players/b6350368-74fc-4adf-92c8-ff9126e90541
https://wst.tv/players/ian-burns --  https://www.wst.tv/players/80c5ce19-2c01-48a4-85e4-c0304ac1ea4a
https://wst.tv/players/james-cahill --  https://www.wst.tv/players/4b7b307c-8ec8-4b53-b46e-6817081b95c4
https://wst.tv/players/stuart-carrington --  https://www.wst.tv/players/37a87bd0-792f-46ae-9377-56df3bef9034
https://wst.tv/players/ali-carter --  https://www.wst.tv/players/c796b82d-1040-422d-b27d-9249310b99a3
https://wst.tv/players/ashley-carty --  https://www.wst.tv/players/32dedd2f-0e09-4c03-bed3-679646da516b
https://wst.tv/players/jamie-clarke --  https://www.wst.tv/players/b29c7ae2-4f1c-413c-92bb-01ce78d99b08
https://wst.tv/players/sam-craigie --  https://www.wst.tv/players/edcdfdad-8c65-48fb-94f0-b9b3ac9ad04d
https://wst.tv/players/dominic-dale --  https://www.wst.tv/players/86fd8e51-3964-497c-97c3-729cef44b1f0
https://wst.tv/players/mark-davis --  https://www.wst.tv/players/0398e6dc-dcbf-4ff0-9ff2-7515212bc818
https://wst.tv/players/ryan-day --  https://www.wst.tv/players/5d419487-e341-4301-a4f5-e493a2a78754
https://wst.tv/players/ken-doherty --  https://www.wst.tv/players/e9c5eddd-e493-473e-b688-a3a2ea861800
https://wst.tv/players/scott-donaldson --  https://www.wst.tv/players/ff710b2f-cf05-45d6-840e-e10a7dc9f921
https://wst.tv/players/mostafa-dorgham --  https://www.wst.tv/players/14243478-1def-4ce2-a9a0-80a2858abe32
https://wst.tv/players/graeme-dott --  https://www.wst.tv/players/e0f5c435-470e-4ac3-8406-5ccd39fd475c
https://wst.tv/players/adam-duffy --  https://www.wst.tv/players/2fc33800-aaf8-4e7f-9af0-afc58df79ed2
https://wst.tv/players/ahmed aly-elsayed --  https://www.wst.tv/players/f65d2c9a-513a-458b-9c8b-edfc3aebbce6
https://wst.tv/players/dylan-emery --  https://www.wst.tv/players/0106063a-5a37-47c3-9cbf-67a891012a5e
https://wst.tv/players/reanne-evans --  https://www.wst.tv/players/bc4020ad-76c2-42a4-8994-dd0f756d0b6a
https://wst.tv/players/tom-ford --  https://www.wst.tv/players/69df4145-0b26-4a1e-9afb-c9ae74fa3fd1
https://wst.tv/players/marco-fu --  https://www.wst.tv/players/5012642c-60cc-4ab3-a41b-b152370562eb
https://wst.tv/players/david-gilbert --  https://www.wst.tv/players/9b2532c1-a189-4573-8320-f254d2f9bfde
https://wst.tv/players/martin-gould --  https://www.wst.tv/players/2a0e2004-856c-4f0b-ae3e-54dded6141f8
https://wst.tv/players/david-grace --  https://www.wst.tv/players/ad650d94-b08b-4dc5-9c5f-1653dc909127
https://wst.tv/players/liam-graham --  https://www.wst.tv/players/75baf94d-2c63-42dc-8acb-4e7a5a7bcb09
https://wst.tv/players/xiao-guodong --  https://www.wst.tv/players/c3d39c08-92fd-471b-8901-903a4bd22027
https://wst.tv/players/he-guoqiang --  https://www.wst.tv/players/5587fb4d-8517-4572-918e-65ff83b71d74
https://wst.tv/players/ma-hailong --  https://www.wst.tv/players/a2dbb55d-a612-4aef-9a1c-b9401232eac5
https://wst.tv/players/anthony-hamilton --  https://www.wst.tv/players/a3789843-3f0c-4161-b68a-b770fff83f96
https://wst.tv/players/lyu-haotian --  https://www.wst.tv/players/022c7a82-72c5-4fb5-a748-eb9b249d33fb
https://wst.tv/players/barry-hawkins --  https://www.wst.tv/players/ec561f17-e982-43b3-8807-82fc76adbe75
https://wst.tv/players/louis-heathcote --  https://www.wst.tv/players/e8d25a73-348b-40cd-b4e8-f757250d8900
https://wst.tv/players/stephen-hendry --  https://www.wst.tv/players/8ef2e9be-1769-40e9-8235-a143c9ed5951
https://wst.tv/players/andy-hicks --  https://www.wst.tv/players/66dd278a-0996-41ce-a3c4-3213fda0693c
https://wst.tv/players/john-higgins --  https://www.wst.tv/players/a5eecca1-8302-4739-84fc-6721627baa43
https://wst.tv/players/andrew-higginson --  https://www.wst.tv/players/83deba83-12f0-446d-ab47-e43f5b8ab09e
https://wst.tv/players/liam-highfield --  https://www.wst.tv/players/15860676-6802-4c5d-a06e-ce1356e8cdb7
https://wst.tv/players/aaron-hill --  https://www.wst.tv/players/be51ee14-4b28-4932-8d3d-af8011dc9201
https://wst.tv/players/liu-hongyu --  https://www.wst.tv/players/b614e094-3724-419a-a052-13261ace5b05
https://wst.tv/players/ashley-hugill --  https://www.wst.tv/players/6be559fd-aaac-45af-bd53-5eaa54b22553
https://wst.tv/players/mohamed-ibrahim --  https://www.wst.tv/players/1aa06013-1544-4fd7-b3e7-e8682676acd5
https://wst.tv/players/asjad-iqbal --  https://www.wst.tv/players/b765daf4-6bf6-41e5-b298-50769ed0d841
https://wst.tv/players/himanshu-jain --  https://www.wst.tv/players/218661d8-4ebe-4700-9907-0d0e2af0aeeb
https://wst.tv/players/si-jiahui --  https://www.wst.tv/players/f3c7e0cf-7cb6-405e-9ba1-4d02716a20c3
https://wst.tv/players/jak-jones --  https://www.wst.tv/players/036bc430-6c51-4d63-a366-a6ca218f7f39
https://wst.tv/players/jamie-jones --  https://www.wst.tv/players/a85bdd17-6038-43c8-9cec-d492e4a8a2df
https://wst.tv/players/mark-joyce --  https://www.wst.tv/players/710a2723-9694-4cca-8827-64ee50386179
https://wst.tv/players/jiang-jun --  https://www.wst.tv/players/cf6b1e24-e90e-4420-8290-1c1b0f9ea97e
https://wst.tv/players/ding-junhui --  https://www.wst.tv/players/3ff06750-8c3c-456c-8fac-58209b6f679e
https://wst.tv/players/pang-junxu --  https://www.wst.tv/players/9c842985-9f09-4bd0-aa6a-dafe523b40ee
https://wst.tv/players/anton-kazakov --  https://www.wst.tv/players/cbe2d832-5b47-4b91-bf4e-1e482c875825
https://wst.tv/players/jenson-kendrick --  https://www.wst.tv/players/17e59e8f-42b0-4332-bfaa-452366af8280
https://wst.tv/players/rebecca-kenna --  https://www.wst.tv/players/36672a61-a02f-428b-94a1-d42323bccbb3
https://wst.tv/players/lukas-kleckers --  https://www.wst.tv/players/ccd2b587-4c53-40a5-8b4a-e90b7663ce56
https://wst.tv/players/sanderson-lam --  https://www.wst.tv/players/52ba4e5c-fea6-426c-8ab0-7ca6828d13d5
https://wst.tv/players/rod-lawler --  https://www.wst.tv/players/c9a6633d-a5f9-4302-aacd-c2869fe9259b
https://wst.tv/players/julien-leclercq --  https://www.wst.tv/players/690dc31c-2392-4dd0-8dd9-52e5825cab46
https://wst.tv/players/andy-lee --  https://www.wst.tv/players/d758aa70-d8b1-446a-8284-b2a1ace120bb
https://wst.tv/players/david-lilley --  https://www.wst.tv/players/6757b432-8dc6-4c8d-a345-dac8eb58edf5
https://wst.tv/players/oliver-lines --  https://www.wst.tv/players/c7c75376-75ce-4e4b-ba26-d6c8a098ec9b
https://wst.tv/players/jack-lisowski --  https://www.wst.tv/players/d56f02ab-f2df-41ca-b9a4-24167aded141
https://wst.tv/players/stephen-maguire --  https://www.wst.tv/players/c07238de-bca9-4067-9749-00841bd06d28
https://wst.tv/players/anthony-mcgill --  https://www.wst.tv/players/ac8407bc-1cbf-4642-86a3-1e3cacbaeb62
https://wst.tv/players/ben-mertens --  https://www.wst.tv/players/e9a8f8aa-aa8c-4e64-baa4-3fcfd07ebb26
https://wst.tv/players/hammad-miah --  https://www.wst.tv/players/0ffdae01-5fad-40c8-8b9f-8eb3a942ecac
https://wst.tv/players/robert-milkins --  https://www.wst.tv/players/95eec847-2905-491f-abbe-92ff39038bda
https://wst.tv/players/stan-moody --  https://www.wst.tv/players/a65d6cc8-05fa-4827-8294-a1da17c975f6
https://wst.tv/players/ross-muir --  https://www.wst.tv/players/8051730e-7460-4773-b262-9188f2166f61
https://wst.tv/players/shaun-murphy --  https://www.wst.tv/players/03fe92d3-ad85-434c-bc17-5fe02a496187
https://wst.tv/players/mink-nutcharut --  https://www.wst.tv/players/ae9dffcf-4e09-472a-848e-21bf165f975e
https://wst.tv/players/fergal-o'brien --  https://www.wst.tv/players/cefe88f9-89da-4460-9ed6-6e04ec69cec3
https://wst.tv/players/joe-o'connor --  https://www.wst.tv/players/c2809815-3bd0-41fa-b727-458e22c98070
https://wst.tv/players/martin-o'donnell --  https://www.wst.tv/players/8195961a-a4b7-4ba7-960b-08ab4778dbd3
https://wst.tv/players/sean-o'sullivan --  https://www.wst.tv/players/50da4361-072d-418d-a2a0-721866983d02
https://wst.tv/players/ronnie-o'sullivan --  https://www.wst.tv/players/226c7294-655e-4925-bcde-17330ddfc438
https://wst.tv/players/jackson-page --  https://www.wst.tv/players/19ce247e-1824-4f94-8fe3-c94ce4056802
https://wst.tv/players/andrew-pagett --  https://www.wst.tv/players/d338eb63-5268-427e-a60c-52cb55a56625
https://wst.tv/players/tian-pengfei --  https://www.wst.tv/players/4b168b1a-298b-4c0a-adf6-e3190e36caff
https://wst.tv/players/joe-perry --  https://www.wst.tv/players/a33b80af-7f17-4bb1-8c5d-d36e45eb801c
https://wst.tv/players/andres-petrov --  https://www.wst.tv/players/fc2f8de1-4d6a-40a1-84d2-faea2c5fdb8d
https://wst.tv/players/manasawin-phetmalaikul --  https://www.wst.tv/players/b95907dd-e602-4448-9c78-00c865f4bcd5
https://wst.tv/players/liam-pullen --  https://www.wst.tv/players/44b09a9f-4ded-4b51-80f5-dbd28eb86274
https://wst.tv/players/jimmy-robertson --  https://www.wst.tv/players/4e7f33e8-925d-4442-b8f7-6023cd920d9e
https://wst.tv/players/neil-robertson --  https://www.wst.tv/players/8b83133a-4c15-4275-811e-bdf2cb02702f
https://wst.tv/players/noppon-saengkham --  https://www.wst.tv/players/aaf6c342-11f7-4d03-86b3-1144a4fd92f8
https://wst.tv/players/victor-sarkis --  https://www.wst.tv/players/a91dbb92-a44c-4076-8694-5c08cd40c534
https://wst.tv/players/mark-selby --  https://www.wst.tv/players/ba7831b4-ab75-4435-946a-c6f02e4e2d4b
https://wst.tv/players/matthew-selt --  https://www.wst.tv/players/c1ac359d-8359-405b-9879-74dd9b4a5b2c
https://wst.tv/players/xu-si --  https://www.wst.tv/players/f5586d0e-89f5-434e-8723-65046b1d6fe9
https://wst.tv/players/yuan-sijun --  https://www.wst.tv/players/734865fe-9ee2-4a3e-b4d1-035bf819aff2
https://wst.tv/players/ishpreet-singh chadha --  https://www.wst.tv/players/cc2c8bf7-0c67-4751-9e36-7b86718164b1
https://wst.tv/players/baipat-siripaporn --  https://www.wst.tv/players/53cd277e-28fe-48ed-a0ce-4d5d9745c85f
https://wst.tv/players/elliot-slessor --  https://www.wst.tv/players/b1239913-b987-4bae-a7f6-ff4eb481f503
https://wst.tv/players/matthew-stevens --  https://www.wst.tv/players/af1c65bd-d676-4bfc-8e93-65e34adf93c7
https://wst.tv/players/zak-surety --  https://www.wst.tv/players/24564b03-cfd6-474c-a653-0268241d632f
https://wst.tv/players/allan-taylor --  https://www.wst.tv/players/d1cf990f-e5b8-4584-acce-2bd9b534fcb5
https://wst.tv/players/ryan-thomerson --  https://www.wst.tv/players/1227cfd1-3132-405f-a672-4bdf64538df3
https://wst.tv/players/rory-thor --  https://www.wst.tv/players/9d43b39f-b17f-415f-b779-eebc550cd265
https://wst.tv/players/judd-trump --  https://www.wst.tv/players/e2f3cfe7-6138-4ce6-b1dc-77dcc1d0a65f
https://wst.tv/players/thepchaiya-un-nooh --  https://www.wst.tv/players/67203224-1d66-4c1e-b655-150f4f835aba
https://wst.tv/players/alexander-ursenbacher --  https://www.wst.tv/players/12be0769-d225-4c97-b687-4753e3c1bc26
https://wst.tv/players/hossein-vafaei --  https://www.wst.tv/players/99019ac8-ad6a-4927-9f93-1935ea43ca55
https://wst.tv/players/chris-wakelin --  https://www.wst.tv/players/a1beeb4b-2493-476c-9682-1900eb83c2d5
https://wst.tv/players/ricky-walden --  https://www.wst.tv/players/80b7e0a3-61eb-4a12-b4c4-9d6da83d5b24
https://wst.tv/players/daniel-wells --  https://www.wst.tv/players/a458950b-c644-4f16-b89a-543ccfccc61c
https://wst.tv/players/jimmy-white --  https://www.wst.tv/players/6100064a-0ea4-4a0c-b8ee-0e2ddaa3def4
https://wst.tv/players/michael-white --  https://www.wst.tv/players/9728dd54-b60e-4bf5-9149-cecb93b530ee
https://wst.tv/players/robbie-williams --  https://www.wst.tv/players/8954fbf2-3b42-4af9-981b-333ec1cd8b03
https://wst.tv/players/mark-williams --  https://www.wst.tv/players/6aaddcbb-345c-474a-9069-e7757e155729
https://wst.tv/players/gary-wilson --  https://www.wst.tv/players/e5f4377c-5119-4c0a-9a88-e42eb8e48677
https://wst.tv/players/kyren-wilson --  https://www.wst.tv/players/a8c0d3a6-706b-4bf0-8dce-9cde97fe88c4
https://wst.tv/players/ben-woollaston --  https://www.wst.tv/players/8ad4ff3f-9f92-44ba-a884-6c8a8e0dcf08
https://wst.tv/players/peng-yisong --  https://www.wst.tv/players/78c09fb8-3382-4cb0-a3e8-d0f041f23389
https://wst.tv/players/wu-yize --  https://www.wst.tv/players/d935d534-e696-4292-b773-e9b8efee1ea7
https://wst.tv/players/dean-young --  https://www.wst.tv/players/2354ac0b-0b04-4965-8ae3-1f135713005c
https://wst.tv/players/zhou-yuelong --  https://www.wst.tv/players/960cd1e6-2bb4-4229-aefe-447646412bf2
https://wst.tv/players/cao-yupeng --  https://www.wst.tv/players/3a9eca87-f640-4942-a9a7-74a47f40c562
https://wst.tv/players/long-zehuang --  https://www.wst.tv/players/40859ee8-e438-4062-aa9b-84e4e8e22bac
https://wst.tv/players/fan-zhengyi --  https://www.wst.tv/players/8cbf82f6-c417-421c-ae39-17c8103284cd
  •  Done User:AlH42, the bot is done. It edited 371 articles. Added 1,267 archive URLs. Converted 1,248 cases of |url-status=live to dead. -- Green C 03:20, 6 February 2024 (UTC) reply
Good work! My poor, poor watchlist. Just need to work out what we can do with the remainder. Lee Vilenski ( talkcontribs) 08:07, 6 February 2024 (UTC) reply
User:AlH42: Not too bad, articles where the bot added a {{dead link}}
-- Green C 14:48, 6 February 2024 (UTC) reply
Thank you. I think we still have a lot to do though. And the WST player template is a problem.  Alan  ( talk) 15:10, 6 February 2024 (UTC) reply
The bot should have processed every link for the domain in mainspace. It might have missed some rare cases where it has trouble parsing the page. The template space I didn't do. There might be some in File space, I have not checked. Anyway if you think you need more bot help, let me know. -- Green C 15:44, 6 February 2024 (UTC) reply

Google cache

Apparently, the Google cache (webcache.googleusercontent.com) is about to be shut down. There are over 5,000 pages with these links, and many of them appear to already be broken. These should probably be replaced with the original URL and/or proper archive links if available, depending on how they are currently being used. :Jay8g [ VTE 00:59, 5 February 2024 (UTC) reply

I'll work on this.  Doing... - if you see this request brought up elsewhere point them here. The links are messy and so are placements within templates it will need some care. -- Green C 01:29, 5 February 2024 (UTC) reply
Would archive.org still have the info? If so we should try to get all of it so it is easily replaceable by regex. Geardona ( talk to me?) 15:29, 5 February 2024 (UTC) reply
Not all the now-dead original urls have archive.org links, is it possible to put google cache archive links into archive.org to 'save' the pages? Kingsif ( talk) 22:47, 8 February 2024 (UTC) reply
The bot is more sophisticated than blindly converting to archive.org links. It will take 4 different actions, depending on the status of the source URL (live or dead), and archive availability for 1) the source URL and 2) Google Cache URL (at archive.org). In terms of creating new archive.org pages from the GC page, that would only work if the GC is still working, which in most cases is not true; and when it is true, the source URL is usually live anyway, so there is no reason for either GC or archive.org -- Green C 17:25, 9 February 2024 (UTC) reply
  •  Done - Google Cache is eliminated from Enwiki. It was in about 5,000 pages. It was a significant undertaking for multiple reasons. There are still 834 inside archive.org pages. One of four actions was taken: 1) Original URL is live: simply remove the Google Cache and replace it with the original URL. 2) Original URL is dead and no archives are available: remove the Google Cache, replace it with the original URL, and add a {{dead link}}. 3) Original URL is dead but has an archive available at another provider. 4) Original URL is dead and the Google Cache URL has an archive at another archive provider (the 834 linked above). Option #1 was, surprisingly, the most common. For anyone wanting to do this elsewhere, I made a tool to convert Google Cache URLs to the original source URL: https://github.com/greencardamom/Googcacheparse -- Green C 16:19, 11 February 2024 (UTC) reply
    Thanks again for your work on this! :Jay8g [ VTE 22:32, 11 February 2024 (UTC) reply
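Note for anyone repeating this on another wiki: the conversion and the four-way decision described above can be sketched roughly as follows. This is not the Googcacheparse tool or the bot's code, only a minimal illustration. The function names, the action labels, the assumed 12-character cache ID prefix, and the preference for a source-URL archive over a Google Cache archive are all assumptions made for the example.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumption: an optional cache ID is a 12-character token followed by ":".
_CACHE_ID = re.compile(r"^[A-Za-z0-9_-]{12}:(.+)$")


def original_from_google_cache(cache_url: str) -> Optional[str]:
    """Return the source URL embedded in a Google Cache link, or None."""
    parts = urlsplit(cache_url)
    if (parts.hostname or "").lower() != "webcache.googleusercontent.com":
        return None
    q = parse_qs(parts.query).get("q", [""])[0]
    if not q.startswith("cache:"):
        return None
    # parse_qs decodes "+" to a space; anything after the first space
    # is treated as trailing search terms and dropped.
    target = q[len("cache:"):].strip().split(" ", 1)[0]
    if not target:
        return None
    if not target.lower().startswith(("http://", "https://")):
        match = _CACHE_ID.match(target)
        if match:
            target = match.group(1)
    if not target.lower().startswith(("http://", "https://")):
        target = "http://" + target  # scheme is unknown, default to http
    return target


def choose_action(source_live: bool,
                  source_archive: Optional[str],
                  cache_archive: Optional[str]) -> str:
    """Pick one of the four actions (labels are hypothetical)."""
    if source_live:
        return "replace_with_source"
    if source_archive:
        return "replace_with_source_archive"
    if cache_archive:
        return "replace_with_cache_archive"
    return "replace_with_source_and_tag_dead"
```

A source URL that contains an unencoded "&" would be cut short by the query parser, so real links would likely need extra care.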

linguistlist.org

This site is linked to by the linglist parameter in {{ Infobox language}}. Snowmanonahoe ( talk · contribs · typos) 23:19, 5 February 2024 (UTC) reply

User:Snowmanonahoe: I only see it on two pages: /info/en/?search=Special:LinkSearch?target=linguistlist.org%2Fmultitree --The site itself looks dead since 2008 or 2009. -- Green C 00:49, 6 February 2024 (UTC) reply
GreenC: try Special:LinkSearch/multitree.org/codes/. Those urls all redirect to linguistlist.org/multitree now. Snowmanonahoe ( talk · contribs · typos) 00:58, 6 February 2024 (UTC) reply
User:Snowmanonahoe: Ok. There are 75 pages. Compare results at Archive.today with WaybackMachine. I recommend a first pass using Archive.today, and any not available a second pass will use WaybackMachine. Sound alright? BTW the entire linguistlist.org site looks like it needs review 421 pages. They made a new website and the old inbound links are not working right. The new website links are working. -- Green C 02:30, 6 February 2024 (UTC) reply
I think Kwamikagami should weigh in on this first. Snowmanonahoe ( talk · contribs · typos) 03:08, 6 February 2024 (UTC) reply
I gave up on getting multitree links to work back when they were basically offline. I didn't know they were up again.
Multitree is generally not a RS. I would avoid using them except for extinct languages where Linglist maintains the description of the ISO code (like Ethnologue does for living languages); for classification trees of various authors (e.g. on our Austroasiatic article); and maybe a couple other things I'm not thinking of, but not as a general reference.
Is there something in particular you wanted me to weigh in on? I'd think we'd want to update the links when we use them, as I can't think of any reason we'd want to preserve or link to old versions of their pages. — kwami ( talk) 03:30, 6 February 2024 (UTC) reply
I would avoid using them except [some] .. OK my job is to save the dead links by adding an archive URL. It's only about 75 links. You can remove some citations and keep others as you prefer, once the archives are added, so you will be able to see what the content of the page is. -- Green C 14:54, 6 February 2024 (UTC) reply
That should work just fine. No need for you to evaluate the quality of the ref. — kwami ( talk) 15:25, 6 February 2024 (UTC) reply
For the 75 pages with multitree.org/codes URLs it is a multi-pass run:
  1. Pass 1 (multitree1): Remove existing archive.org links
  2. Pass 2 (multitree2): Add archive.today where available
  3. Pass 3 (multitree3): Add archive.org where available
User:Kwamikagami: 75 pages with multitree.org/codes - they should have either an archive URL or a {{dead link}}; otherwise the bot had trouble parsing the citation. -- Green C 04:00, 12 February 2024 (UTC) reply
Thanks. I'd only reviewed instances called from the info box. Will go thru them over the next few days. Looks like about half should be removed, as they're things that can be cited to RS's. — kwami ( talk) 08:11, 12 February 2024 (UTC) reply
  • linguistlist.org was also processed (about 450 pages) and many problems were found and repaired: Dead links, soft-404s, migrated links, Cloud Flare blocks. -- Green C 20:32, 12 February 2024 (UTC) reply
    Thanks for all the work with that. — kwami ( talk) 23:26, 12 February 2024 (UTC) reply
    Is there a better way to handle the 512 auto-generated refs at Category:Languages with Linglist code? Or would they all have to be done by hand? — kwami ( talk) 23:50, 12 February 2024 (UTC) reply
It is being generated by Template:Infobox_language/linguistlist. Are most multitree.org/codes URLs dead, or only some? Or not sure? -- Green C 00:05, 13 February 2024 (UTC) reply
It's also in Template:Infobox language/ref and Module:Infobox language. It looks like all of multitree.org is retired. What if we change the template to use a generic archive URL, and hope for the best: Special:Diff/1140877092/1206755696, Special:Diff/996938315/1206753611 and Special:Diff/1114901671/1206760361 - this is a stop-gap solution because archive.org won't have archives for all of the URLs. Ideally multitree.org would be removed from Template:Infobox_language and sub-templates, and individual archive URLs added to replace the auto-generated ones, at the same location where they were auto-generated. Somewhat difficult. -- Green C 01:45, 13 February 2024 (UTC) reply
Yeah, they appear to be defunct. But they are the official ISO repository for descriptions of languages extinct before ca. 1950, equivalent to Ethnologue for recent languages. We really should have a link to the official site. — kwami ( talk) 02:40, 13 February 2024 (UTC) reply
Maybe it's OK with generic archive URLs at the Infobox layer. If not enough, will need to remove the Infobox support, add the citations individually to each article, and run archive bots to add archive URLs. -- Green C 03:49, 13 February 2024 (UTC) reply

hobbes.nmsu.edu

OS/2 repository going offline in April. Only a few pages on enwiki. [3] -- Green C 15:32, 6 February 2024 (UTC) reply

 Done -- Green C 16:34, 13 February 2024 (UTC) reply

iltalehti.fi

I've noticed that some of the 1,222 Iltalehti URLs are dead but bots don't fix them:

All those pages give the Finnish-language text "Hakemaasi sivua ei valitettavasti löytynyt." (= "Unfortunately, the page you were looking for could not be found."). I tried to Google those URLs' headlines, but I couldn't find new URLs for them, so I think Iltalehti has removed those articles from their website completely. Could a bot go through Iltalehti URLs and set an archive link for the Iltalehti webpages that have that exact text on them? Also, if there's a way to fix these, can it be set that InternetArchiveBot fixes them eventually on other language wikis as well? Like GreenC did a month ago in the discussion above #Ilta-Sanomat to the Ilta-Sanomat URLs. For example, there are 10,070 Iltalehti URLs on fi.wikipedia. Thank you again. 85.76.13.79 ( talk) 15:35, 11 February 2024 (UTC) reply

I requested IABot to run on Maj-Len Grönholm and it fixed it Special:Diff/1194582882/1206244327. Probably IABot hasn't automatically processed the pages yet. I'll take a look at it though, because I know IABot has gaps in coverage of what it processes. I'll run it through WaybackMedic which will get them all, plus look for soft-404s like that "Unfortunately" string when the pages otherwise return status 200. Whatever it finds it will update the IABot database, and that should eventually propagate to the rest of the wikis. -- Green C 16:35, 11 February 2024 (UTC) reply
Thanks again. One thing I noticed though: If either blogit., m. or plus. preceded iltalehti.fi in the URL, the bot changed the URL to the main page URL https://www.iltalehti.fi. I found 12 edits in question with this search: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. Can the bot fix these or do we have to fix these by hand? 85.76.13.79 ( talk) 13:00, 14 February 2024 (UTC) reply
Oh sorry looks like I missed those, they are soft-404s. If you will manually restore them to the original URL, I can rerun the bot on those pages. It will add an archive URL instead of following the redirect to the homepage. -- Green C 14:21, 14 February 2024 (UTC) reply
Alternatively you can just revert the entire edit by the bot if there is no intervening edit, and the bot will redo the entire page, if that's easier. -- Green C 14:23, 14 February 2024 (UTC) reply
Done. 85.76.13.79 ( talk) 20:26, 15 February 2024 (UTC) reply
Also done. Special:Diff/1207818208/1207836829 -- Green C 21:20, 15 February 2024 (UTC) reply
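For reference, the two soft-404 patterns seen in this thread (a status 200 page carrying the "not found" text, and a deep link that lands on the bare homepage) could be detected along these lines. This is only a sketch, not WaybackMedic's actual logic; the function name and signature are made up for illustration, and the caller is assumed to have already fetched the page and followed redirects.

```python
from urllib.parse import urlsplit

# The marker is the Finnish "page not found" text quoted in this thread.
SOFT_404_MARKERS = ("Hakemaasi sivua ei valitettavasti löytynyt",)


def is_soft_404(requested_url: str, final_url: str, status: int, body: str) -> bool:
    """Return True when a 200 response looks like a missing page."""
    if status != 200:
        return False  # hard errors (404, 503, ...) are handled elsewhere
    if any(marker in body for marker in SOFT_404_MARKERS):
        return True
    requested = urlsplit(requested_url)
    final = urlsplit(final_url)
    # A deep link that ends up on the bare homepage is treated as dead.
    asked_for_page = requested.path.strip("/") != ""
    landed_on_home = final.path.strip("/") == "" and not final.query
    return asked_for_page and landed_on_home
```

With a check like this, a blogit., m. or plus. URL that redirects to https://www.iltalehti.fi would get an archive URL instead of being rewritten to the homepage.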

Normally I catch these. Output of the "l4s4" script (ie. show redirects with 4 or more cases):

mintbox:[] ./l4s4 
7 -  https://www.iltalehti.fi/politiikka/a/201712072200588364 
4 -  https://www.iltalehti.fi/perhe/a/200612185426589 
4 -  https://www.iltalehti.fi/popstars/a/200701145593138 
4 -  https://www.iltalehti.fi/uutiset/a/2016061121711142 
4 -  https://www.iltalehti.fi/viihdeuutiset/a/201801072200651274 
12 -  https://www.iltalehti.fi 
4 -  https://www.iltalehti.fi/viihde/a/2009073010005660 
8 -  https://www.iltalehti.fi/politiikka/a/201801182200679167 

ie. there were 12 pages with redirects to https://www.iltalehti.fi .. But I forgot to run the script before committing changes to wiki. -- Green C 21:20, 15 February 2024 (UTC) reply
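The "l4s4" script itself is not shown here, but the idea (list redirect targets that occur 4 or more times, so a mass redirect to a homepage stands out before committing) can be sketched as below. The function name and the input shape, a list of (source URL, redirect target) pairs, are assumptions for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def frequent_redirect_targets(redirects: Iterable[Tuple[str, str]],
                              threshold: int = 4) -> Dict[str, int]:
    """Count redirect targets and keep those seen at least `threshold` times."""
    counts = Counter(target for _source, target in redirects)
    return {target: n for target, n in counts.items() if n >= threshold}
```

Many different source URLs converging on one target, especially a bare domain, is a strong hint that the target is a soft-404 rather than a real migration.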

newindianexpress.com

Many old links don't redirect to their new ones, like this doesn't take us here. Better to tag the old ones as dead. Kailash29792 (talk) 13:09, 12 February 2024 (UTC) reply

 Doing... -- Green C 17:55, 16 February 2024 (UTC) reply
 Done - The domain exists in 15,261 pages. The bot made changes in 8,467 pages: added 5,240 new archive URLs, added 238 {{ dead link}} where no archive URL existed, changed 1,220 |url-status=live to dead, and did a bunch of other misc cleanup work. Changes are also uploaded in IABot so they will propagate to 300+ other wikis. User:Kailash29792 this was a much needed cleanup, thank you for bringing it to attention. -- Green C 18:13, 17 February 2024 (UTC) reply

crossrail.co.uk

All URLs under the crossrail.co.uk domain are now redirecting to https://web.archive.org/web/20221229005042/https://www.crossrail.co.uk/# with subpages just going to the same archive of the main page breaking links. All links therefore need to be marked as dead and pointed to an archive earlier than 29 December 2022. Thryduulf ( talk) 12:51, 14 February 2024 (UTC) reply

Interesting. Never seen that before (HTML redirect to archive.org for the entire site). I like it. The site appears to be mostly usable via the archive version. Simple solution for general purposes. Well, like you say, we can do better at enwiki. I'll add more specific archive URLs for each page. -- Green C 18:05, 16 February 2024 (UTC) reply

 Done - The bot checked 124 pages that have the domain. It edited 101 pages. Added 161 archive URLs. Converted 51 |url-status=live to dead. Added two {{dead link}}. Updated IABot with information so it propagates to 300+ other wikis. Thryduulf thank you for the notification. -- Green C 21:05, 18 February 2024 (UTC) reply

pomus.net

Sometimes it redirects to a pornsite and sometimes to different fake "I am not a bot" websites. There are many links to it, all of which require url-status=usurped - Altenmann >talk 07:11, 15 February 2024 (UTC) reply

Altenmann: Added to the WP:JUDI (usurpation) queue Special:Diff/1202023308/1207703597, thank you. -- Green C 13:46, 15 February 2024 (UTC) reply

royin.go.th

Several years ago, the Royal Institute of Thailand changed its name to the Royal Society of Thailand and most (but not all) of the content from its old website, under the domain www.royin.go.th, is now preserved under the subdomain legacy.orst.go.th . Can this be handled by a bot? -- Paul_012 ( talk) 10:12, 22 February 2024 (UTC) reply

58 pages. When I try the first one http://www.royin.go.th/th/knowledge/detail.php?ID=639 it doesn't work at http://legacy.orst.go.th/th/knowledge/detail.php?ID=639 rather wants to redirect to https://www.orst.go.th/?ID=639 however I can't read Thai and don't know if that is a soft-404 or legitimate page. -- Green C 15:17, 22 February 2024 (UTC) reply
It seems links like that one are too old and weren't preserved, and that they constitute more than a small minority. 58 isn't a lot; maybe I can check them manually and replace them with AWB. -- Paul_012 ( talk) 15:45, 22 February 2024 (UTC) reply
Thanks. It would be better if you can. -- Green C 16:17, 22 February 2024 (UTC) reply

Vice Media

Just wanted to flag that per Vice reporters on social media, there are concerns that the Vice Media website is about to be shut down (a la The Messenger (website)). Is there a way to make sure that all articles using it as a source have archive links? Thanks! Sariel Xilo ( talk) 21:16, 22 February 2024 (UTC) reply

Concerns about the total shuttering have just been picked up by Hollywood Reporter with the top editor unable to confirm if the website will be pulled down. Sariel Xilo ( talk) 21:23, 22 February 2024 (UTC) reply
Wow that's over 17,000 pages. No worries if they shut it down we'll add archives. Any link added to Wikipedia should be archived at Wayback automatically, and big sites like this are typically crawled entirely. Too bad if true they had a lot of good content. -- Green C 22:01, 22 February 2024 (UTC) reply
Confirmed that they'll stop publishing on Vice.com [4], but as to whether they'll leave the website up as a historic archive like Gawker was historically left or it will be taken down is anyone's guess. Hemiauchenia ( talk) 22:19, 22 February 2024 (UTC) reply

dcist.com

WAMU has shut down the DCist and if you go to the website, it shows a popup stating it will redirect you to "WAMU.org in 15 seconds" ( Washington Post mentions the redirect). It looks like this redirect popup is occurring on both the homepage and all of the articles so DCist links should be marked as dead. Sariel Xilo ( talk) 18:36, 23 February 2024 (UTC) reply

604 pages -- Green C 19:46, 23 February 2024 (UTC) reply
Sariel Xilo, the domain was set permanent dead by IABot.org in 2017. IABot has gaps in coverage so I rechecked with WaybackMedic and it added about 200 archive links and changes to |url-status=. -- Green C 20:31, 23 February 2024 (UTC) reply
Thanks! Let's hope that's it for a bit with news outlets dying. Sariel Xilo ( talk) 20:34, 23 February 2024 (UTC) reply
Just a note that this site is now live again, for at least the next year, per WAMU reporting. 19h00s ( talk) 22:00, 28 February 2024 (UTC) reply
19h00s & Sariel Xilo. This site is yo-yo. Shut down in 2017. Restored in 2018. Shut down in 2024. Restored in 2024. My bot has a feature "make live". I can do that. -- Green C 23:27, 28 February 2024 (UTC) reply
 Done made live again. -- Green C 01:49, 29 February 2024 (UTC) reply
Thank you! 19h00s ( talk) 01:55, 29 February 2024 (UTC) reply

theherald.com.au

Formerly the main domain of The Newcastle Herald, it now redirects to smh.com.au, which is a different newspaper. Most links can be resurrected by replacing theherald.com.au with newcastleherald.com.au. About 1500 links can be found in Special:LinkSearch/www.theherald.com.au and almost all can be dealt with by just replacing the domain, and http can be updated to https.

Tim Starling ( talk) 01:01, 24 February 2024 (UTC) reply

  •  Done edited nearly 1,000 articles. Migrated the links and/or |url-status=. -- Green C 04:46, 24 February 2024 (UTC) reply
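As a rough illustration of the migration requested above (swap the domain, upgrade to https), here is a minimal sketch. It is not the bot's code; the function name is made up, and it does not check that the rewritten URL actually resolves, which a real run would need to do.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_DOMAIN = "theherald.com.au"
NEW_DOMAIN = "newcastleherald.com.au"


def migrate_herald_url(url: str) -> str:
    """Rewrite theherald.com.au links to newcastleherald.com.au over https."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Match the bare domain or any subdomain, but not look-alike domains.
    if host != OLD_DOMAIN and not host.endswith("." + OLD_DOMAIN):
        return url
    netloc = parts.netloc.lower().replace(OLD_DOMAIN, NEW_DOMAIN)
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))
```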

bibalex.org

Deprecate web archive provider http://web.archive.bibalex.org and http://web.petabox.bibalex.org (on-hold pending verification site is permanently down)-- Green C 15:28, 3 March 2024 (UTC) reply

 Done around 250 pages converted to other archive providers or added {{dead link}} -- Green C 15:00, 16 March 2024 (UTC) reply

bhu.ac.in

Erstwhile simple links such as bhu.ac.in/history have been replaced with complex URLs such as bhu.ac.in/Site/Home/1_2_16_Main-Site, with no fixed pattern or correlation. Therefore, it is requested that all bhu.ac.in links (except those under bhu.ac.in/Site/***) be marked dead and that archived URLs be preferred for them. Thanks, Please feel free to ping/mention -- User4edits ( T) 05:33, 16 March 2024 (UTC) reply

Also iitbhu.ac.in has a lot of dead links and redirects. Doing those same time. -- Green C 17:00, 16 March 2024 (UTC) reply
 Done Checked 132 pages containing bhu.ac.in and iitbhu.ac.in - added 142 new archive URLs, 6 new {{ dead link}} templates, and migrated 57 links to a new redirect location (mostly http->https). Everything else appeared to be working or previously archived. Search -- Green C 17:48, 16 March 2024 (UTC) reply

finlex.fi

Finlex.fi URLs aren't dead but for some reason InternetArchiveBot keeps adding archived URLs for them. This was brought up at meta:User_talk:InternetArchiveBot#Finlex.fi_URLs_aren't_dead a month ago: Bot's edits: [5], [6], [7]. Some URLs it tagged as dead but are actually working: [8], [9], [10]. Those finlex.fi URLs that now have both a working URL and an archive URL should be tagged with the |url-status=live tag, and could someone try to tell IABot that Finlex is live? Thanks. 2001:14BA:9C94:9A00:E866:DADA:1085:E3D9 ( talk) 09:28, 17 March 2024 (UTC) reply

Just noticed that this same issue is being discussed at fi.wikipedia: fi:Wikipedia:Kahvihuone_(tekniikka)#Botti_hakee_arkistosta_kumottuja_lakeja 2001:14BA:9C94:9A00:E866:DADA:1085:E3D9 ( talk) 09:41, 17 March 2024 (UTC) reply
The site has a "Are you human?" check box (CloudFlare). This is causing the bot to think it's a dead site. I logged into iabot.org and changed the domain to "Subscription" status and that will cause the bot to avoid this domain, it won't set live or dead. My bot WaybackMedic has capabilities to bypass CloudFlare. I can try to process this domain and see what happens. My bot also has a feature "make live" ie. convert a citation from dead to live state. Unfortunately my bot only works on English Wikipedia. I'll let you know what happens. -- Green C 15:13, 17 March 2024 (UTC) reply
Unfortunately, this site has maximum security enabled, none of my tools can get through. It started happening in late January 2024. I don't know what to do because no bot is able to determine if a link is live or dead. And no archive service such as WaybackMachine is able to archive a page. Only humans can get through, and they need to solve a captcha. It might be worthwhile waiting to see if they relax security in the future, since this is a recent development. -- Green C 00:40, 19 March 2024 (UTC) reply

squashinfo.com

www.squashinfo.com is a standard reference in articles about squash players. The problem is that current links mostly have the form www.squashinfo.com/players/12345-playername. This leads to the alphabetical players list on squashinfo instead of to the individual player profile. The solution would be to change "players" to "player" without the s. I just did this for Hannah Chukwu where I changed the respective link from http://www.squashinfo.com/players/13679-hannah-chukwu to http://www.squashinfo.com/player/13679-hannah-chukwu. We currently have several hundred articles about squash players and most of them have squashinfo-links so this may require a bot to fix. Proofreader ( talk) 13:31, 17 March 2024 (UTC) reply

@ Proofreader: Looks like about 472 articles. I can do this, and some other cleanup like conversion to https and checking for dead links. Might be a few days before I get to it. -- Green C 15:21, 17 March 2024 (UTC) reply
Thanks a lot. -- Proofreader ( talk) 15:23, 17 March 2024 (UTC) reply
 Done Checked 779 articles. Edited 737 articles. Types of edits per above, http->https and conversion of /players/ to /player/. The site has bot blockers which made it difficult; it's inconsistent, with some URLs not blocked, some partially and others fully - no rhyme or reason. So I blindly did the conversions without verifying the URL actually works; spot checks suggest this is OK. -- Green C 19:51, 19 March 2024 (UTC) reply
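The /players/ to /player/ conversion described above can be sketched like this. It is only an illustration, not the bot's code; the function name is made up, and the requirement that the path segment start with a numeric ID followed by a hyphen is an assumption taken from the 13679-hannah-chukwu example, so that the real alphabetical /players list is left alone.

```python
import re

# Only profile-style paths (numeric ID, hyphen, name) are rewritten.
_PROFILE = re.compile(
    r"^https?://(?:www\.)?squashinfo\.com/players/(\d+-[^/?#\s]+)",
    re.IGNORECASE,
)


def fix_squashinfo_url(url: str) -> str:
    """Convert /players/ID-name to /player/ID-name and upgrade to https."""
    match = _PROFILE.match(url)
    if not match:
        return url
    # Keep anything after the profile slug (query string, fragment) as-is.
    return "https://www.squashinfo.com/player/" + match.group(1) + url[match.end():]
```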

RateTheRef.net

The website RateTheRef.net seems to have been usurped by a Thai gambling site. I don't know how many pages this affects, or whether the old content has been archived, but I figured someone ought to be told. DavidKVT ( talk) 21:21, 18 March 2024 (UTC) reply

User:DavidKVT: Thank you. Added to the JUDI list for a batch job later: Special:Diff/1207703597/1214769148 -- Green C 01:26, 21 March 2024 (UTC) reply

cinestaan.com

It looks like the site is dead as I cannot find it on Google search, and an article is error 503. Check this out too. Kailash29792 (talk) 11:10, 23 March 2024 (UTC) reply

2,243 pages. Offline since December 2023. I can do this. -- Green C 14:16, 23 March 2024 (UTC) reply
 Done Checked 2,243 pages. Edited 2,206 pages. Added 2,371 archive URLs all WaybackMachine. Added 312 {{dead link}} tags. Added 255 |url-status=dead for existing archive URLs previously set live. Updated IABot database so changes will propagate to 300+ other wiki language sites. -- Green C 16:54, 24 March 2024 (UTC) reply

educationengland.org.uk

The site isn't reliable, but as discovered by user Bendegúz Ács at RSN, it hosts some documents that are either not easily accessible or are behind paywalls. We agreed that these documents should still be referenced on Wikipedia, but through the Wayback Machine instead of linking directly to the site. This approach reduces the risk of losing access to the content and prevents any malicious or spammy content from being added to articles. It also minimizes the potential for spamming or manipulation of the information. Upon checking for insource:"educationengland.org.uk/documents", I found that there are around 148 articles with links to this site's document path, which need to be archived. Also, please note that educationengland.org.uk is redirecting to education-uk.org so they should be archived/replaced accordingly. I would like to know if this can be done through a bot or if it requires manual action. Thank you. GSS💬 18:04, 24 March 2024 (UTC) reply

Sure no problem. I'll run the entire domain, archive the /documents links, and the rest move to the new URLs (or archive if soft-404s). And update https and other fixes. -- Green C 20:19, 24 March 2024 (UTC) reply
 Done - Edits visible at Special:Contributions/GreenC_bot. -- Green C 23:39, 24 March 2024 (UTC) reply

police.it

The original official site for the Italian glasses brand Police (brand) has been usurped by a scam site.

The old link used to be https://police.it. I know it affects all the pages related to Police (brand) in all the languages this page was translated to.

The new link appears to be https://policelifestyle.com/. Erniwastaken ( talk) 00:42, 27 March 2024 (UTC) reply

I can't find any pages that have police.it: [11] -- Green C 13:47, 27 March 2024 (UTC) reply

policelookbeyond.com

The https://policelookbeyond.com domain, which used to be property of the Italian glasses brand Police (brand) has been recently usurped. I cannot find a new version of the site, would it be possible to look for an archived one?

The link is present at least in some versions of the Police (brand), and I don't know of other uses. Erniwastaken ( talk) 00:48, 27 March 2024 (UTC) reply

I can't find any pages that include policelookbeyond.com: [12] It does appear to have been fixed: Special:Diff/1186189568/1215771089 .. unfortunately my bot does not operate on other language wikis, and IABot is not programmed for this kind of work. I suppose we could set both domains to dead in IABot, so at least they are converted to archive URLs on the other wikis. -- Green C 13:52, 27 March 2024 (UTC) reply
Both domains are now "Permadead" and IABot will convert to archive URLs on the 300+ other wikis. -- Green C 13:54, 27 March 2024 (UTC) reply

theinsiter.org

The theinsiter.org domain, which used to be property of a Maltese student newspaper, has been usurped. Some more recent articles seem to be available at the same path but on the domain insite.mt, but not all of them. AlexandraAVX ( talk) 10:02, 27 March 2024 (UTC) reply

Thank you. Added to WP:JUDI for later batch processing conversion to usurped. Special:Diff/1214769148/1215848898 -- Green C 13:56, 27 March 2024 (UTC) reply

Texas Rose Festival

The main URL subtitle should be texasrosefestival.org (not com). The domain was lost in January apparently and sold. It’s a volunteer-based charity event anyway so it was changed to the org suffix. On some search engines the top rank result is the wiki page with the bad address. Thank you for reading this and for your help. Jckmlvny ( talk) 14:46, 29 March 2024 (UTC) reply

I could only find one case of texasrosefestival.com - in Tyler, Texas - and that link is dead (even when .org), it should remain as .com since the archive URL is also .com -- Green C 14:55, 29 March 2024 (UTC) reply

cfa-www.harvard.edu

URLs of form cfa-www.harvard.edu/iauc can be converted to cbat.eps.harvard.edu/iauc

http://cfa-www.harvard.edu/iauc/08500/08524.html -->
http://cbat.eps.harvard.edu/iauc/08500/08524.html

56 pages

-- Green C 13:55, 5 April 2024 (UTC) reply

 Done - converted 63 links: Example Special:Diff/1209696306/1218342515. All edits: [13] -- Green C 04:24, 11 April 2024 (UTC) reply

archive.thisislancashire.co.uk

Conversion:

http://archive.thisislancashire.co.uk/1998/5/8/801697.html -->
https://www.lancashiretelegraph.co.uk/archive/1998/5/8/801697.html/ (include trailing slash)

110 pages

-- Green C 14:50, 5 April 2024 (UTC) reply

 Not done - too many false positives, about 50%. Requires manual checks for each link (approx. 160). Contact me if interested in doing this work; I can provide the data. -- Green C 13:06, 11 April 2024 (UTC) reply

herbaria4.herb.berkeley.edu

Conversion:

http://herbaria4.herb.berkeley.edu/eflora_display.php?tid=21820 -->
https://ucjeps.berkeley.edu/eflora/eflora_display.php?tid=21820

218 pages

-- Green C 14:59, 5 April 2024 (UTC) reply

 Done - converted 232 links: Example Special:Diff/1165643139/1218392247. All edits: [14] -- Green C 13:22, 11 April 2024 (UTC) reply

fallingrain.com

Conversion:

http://www.fallingrain.com/world/PK/3/Toru.html -->
https://www.fallingrain.com/world/PK/03/Toru.html

1,318 pages -- Green C 04:19, 6 April 2024 (UTC) reply

 Done - converted 1,204 links: Example Special:Diff/1216003034/1218402425. All edits: [15] -- Green C 14:36, 11 April 2024 (UTC) reply
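Judging only from the single example above, the change is an upgrade to https plus zero-padding of a one-digit region segment (3 becomes 03). A minimal sketch under that assumption follows; the function name is made up, and the rule may not hold for every fallingrain.com path, which could be why fewer links were converted than pages were listed.

```python
import re

# Assumed layout: /world/<2-letter country>/<1-digit region>/<rest>
_ONE_DIGIT_REGION = re.compile(
    r"^https?://(?:www\.)?fallingrain\.com/world/([A-Z]{2})/(\d)/(.+)$"
)


def fix_fallingrain_url(url: str) -> str:
    """Zero-pad a single-digit region code and upgrade to https."""
    match = _ONE_DIGIT_REGION.match(url)
    if not match:
        return url
    country, region, rest = match.groups()
    return "https://www.fallingrain.com/world/%s/0%s/%s" % (country, region, rest)
```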

ilmbwww.gov.bc.ca

Conversion:

http://(wlap|srm|ilmb)www.gov.bc.ca/bcgn-bin/bcg10?name=5586 -->
https://apps.gov.bc.ca/pub/bcgnws/names/5586.html

73 pages -- Green C 04:25, 6 April 2024 (UTC) reply

 Done - converted 60 links: Example Special:Diff/1179972920/1218415470. All edits: [16] -- Green C 15:59, 11 April 2024 (UTC) reply
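For illustration, the host pattern and ID mapping above could be expressed roughly as follows. This is a sketch, not the bot's actual code; it assumes the `name` value is always numeric, and `convert_bcgn` is a hypothetical name:

```python
import re

# Matches the three legacy hosts listed in the request.
BCGN_OLD = re.compile(
    r"^https?://(?:wlap|srm|ilmb)www\.gov\.bc\.ca/bcgn-bin/bcg10\?name=(\d+)$"
)

def convert_bcgn(url):
    """Return the apps.gov.bc.ca URL, or None if the URL does not match."""
    m = BCGN_OLD.match(url)
    if m is None:
        return None
    return f"https://apps.gov.bc.ca/pub/bcgnws/names/{m.group(1)}.html"
```

Returning None for non-matching URLs leaves anything unexpected for manual review.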

quinzaine-realisateurs.com

Conversion:

http://www.quinzaine-realisateurs.com/qz_an/1998/ -->
http://www.quinzaine-cineastes.fr/fr/edition/1998

66 pages -- Green C 04:41, 6 April 2024 (UTC) reply

 Done - converted 49 links: Example Special:Diff/1112134327/1218507144. All edits: [17] -- Green C 03:31, 12 April 2024 (UTC) reply

sherdog.com

Conversion:

http://www.sherdog.com/news/press%20releases/Cage-Warriors-Announce-Line-Up-10246 -->
https://www.sherdog.com/news/pressreleases/Cage-Warriors-Announce-LineUp-10246

22 pages -- Green C 14:01, 6 April 2024 (UTC) reply

 Done - converted 24 links. Example Special:Diff/1193444370/1218565151. All edits: [18] -- Green C 13:43, 12 April 2024 (UTC) reply

organismnames.com

Many links are marked dead, but are actually live. Reprocess and reset.

135 pages

-- Green C 14:21, 6 April 2024 (UTC) reply

 Done - converted 68 citations to live status. Example Special:Diff/1190213071/1218577608. All edits: [19] -- Green C 15:14, 12 April 2024 (UTC) reply

fchd.info

  • Convert all to https
  • If the URL contains a long dash, convert it to a short dash, e.g.
https://www.fchd.info/cups/facup1951–52.htm --> https://www.fchd.info/cups/facup1951-52.htm

1,813 pages

-- Green C 14:36, 6 April 2024 (UTC) reply

 Done - converted 2,455 links to https. 329 switched from dead to live ( Special:Diff/1212622277/1218602461 & Special:Diff/1174652764/1218602000). Fix 5 with long-dash error: Special:Diff/1193226855/1218601104. All edits: [20] -- Green C 20:49, 12 April 2024 (UTC) reply
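The two rules for this domain can be sketched as below. The example in the request uses an en dash (U+2013); also handling the em dash (U+2014) is an assumption added here, and `convert_fchd` is a hypothetical name:

```python
def convert_fchd(url: str) -> str:
    """Switch to https and replace long dashes with a plain hyphen."""
    if url.startswith("http://"):
        url = "https://" + url[len("http://"):]
    # U+2013 is the en dash seen in the example; U+2014 is an assumed extra case.
    return url.replace("\u2013", "-").replace("\u2014", "-")
```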

beta.latimes.com

Conversion:

http://beta.latimes.com/world/africa/la-fg-zimbabwe-arrest-american-20171103-story.html -->
https://www.latimes.com/world/africa/la-fg-zimbabwe-arrest-american-20171103-story.html

96 pages -- Green C 15:16, 6 April 2024 (UTC) reply

 Done - Converted 101 links. Removed 47 {{dead link}}. Switched 12 |url-status=dead to live. All edits: [21] -- Green C 01:31, 13 April 2024 (UTC) reply

archive.ilmb.gov.bc.ca

Conversion:

http://archive.ilmb.gov.bc.ca/bcgn-bin/bcg10?name=1141
https://apps.gov.bc.ca/pub/bcgnws/names/1141.html

71 pages -- Green C 17:14, 6 April 2024 (UTC) reply

 Done - Converted 46 links. Removed 6 {{dead link}} templates. Added 22 {{ dead link}}. Switched 11 |url-status=dead to live. All edits: [22] -- Green C 01:55, 13 April 2024 (UTC) reply

www.hrc.org/blog

Conversion:

https://www.hrc.org/blog/hrc-endorses-u.s.-rep.-colin-allred-and-state-rep.-julie-johnson -->
https://www.hrc.org/news/hrc-endorses-u-s-rep-colin-allred-and-state-rep-julie-johnson

The "/news/" could also be "/press-releases/". Each "." converts to "-".

260 pages

 Done - Checked 258 pages and edited 168 pages. Converted 418 links. Switched 24 |url-status=live to dead. Added 54 archive URLs (50 Wayback). -- Green C 19:32, 13 April 2024 (UTC) reply
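A rough sketch of the /blog/ rule: it builds both candidate URLs, which would still need a live check to see which one resolves. Collapsing repeated hyphens is inferred from the example ("u.s.-rep." becomes "u-s-rep"), and `hrc_candidates` is a hypothetical name:

```python
import re
from urllib.parse import urlsplit, urlunsplit

def hrc_candidates(url):
    """Return candidate replacement URLs for an hrc.org /blog/ link."""
    parts = urlsplit(url)
    prefix = "/blog/"
    if not parts.path.startswith(prefix):
        return []
    slug = parts.path[len(prefix):]
    # "." -> "-", then collapse runs such as "--" left behind by ".-"
    slug = re.sub(r"-{2,}", "-", slug.replace(".", "-"))
    return [
        urlunsplit((parts.scheme, parts.netloc, f"/{section}/{slug}", "", ""))
        for section in ("news", "press-releases")
    ]
```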

Conversion:

https://www.hrc.org/press/hrc-endorses-kyrsten-sinema-for-u.s.-senate
https://www.hrc.org/press-releases/hrc-endorses-kyrsten-sinema-for-u.s.-senate

11 pages

-- Green C 17:27, 6 April 2024 (UTC) reply

 Done - Converted 12 links manually. -- Green C 18:16, 13 April 2024 (UTC) reply

arrs.run

Conversion:

http://www.arrs.run/ATM_Mara1984.htm
https://arrs.run/MaraRank/ATM_Mara1984.htm

Add "/MaraRank/", "https" and remove "www"

51 pages -- Green C 17:33, 6 April 2024 (UTC) reply

 Done - Checked 36 pages and edited 36 pages. Converted 36 links. Removed 23 {{dead link}} templates. -- Green C 20:30, 13 April 2024 (UTC) reply
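The three steps above (add "/MaraRank/", use https, drop "www") might look like this. It is a sketch based on the single example given, so whether every page lives under /MaraRank/ is an assumption; `convert_arrs` is a hypothetical name:

```python
from urllib.parse import urlsplit

def convert_arrs(url: str) -> str:
    """Apply the arrs.run rule: https, no www, path under /MaraRank/."""
    parts = urlsplit(url)
    host = parts.netloc
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path
    # Avoid doubling the prefix on links that were already converted.
    if not path.startswith("/MaraRank/"):
        path = "/MaraRank" + path
    return f"https://{host}{path}"
```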

algerie360.com/sport

Conversion:

https://www.algerie360.com/sport/division-1-division-2/hemani-lache-laso-pour-le-csc/
https://www.algerie360.com/hemani-lache-laso-pour-le-csc/

Remove everything in path but last element.

21 pages -- Green C 17:42, 6 April 2024 (UTC) reply

 Done - Checked 19 pages and edited 16 pages. Converted 14 links. Removed 3 {{dead link}} templates. Added 1 {{dead link}}. Switched 9 |url-status=dead to live. Added 2 archive URLs (2 Wayback). -- Green C 01:51, 14 April 2024 (UTC) reply
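"Remove everything in path but last element" could be sketched as follows. Always emitting a trailing slash is an assumption taken from the example, and `convert_algerie360` is a hypothetical name:

```python
from urllib.parse import urlsplit

def convert_algerie360(url: str) -> str:
    """Keep only the last non-empty path segment."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        return url  # bare domain, nothing to shorten
    return f"{parts.scheme}://{parts.netloc}/{segments[-1]}/"
```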

soccerbase.com

Conversion (players):

http://www.soccerbase.com/players_details.sd?playerid=63162
https://www.soccerbase.com/players/player.sd?player_id=63162

792 pages

 Done - Checked 791 pages and edited 785 pages. Converted 1345 links. Removed 14 {{dead link}} templates. Switched 342 |url-status=dead to live. Switched 1 |url-status=live to dead. Added 16 archive URLs (6 Wayback).

Conversion (managers):

http://www.soccerbase.com/managers2.sd?managerid=891
http://www.soccerbase.com/managers/manager.sd?manager_id=891

163 pages

 Done - Checked 162 pages and edited 160 pages. Converted 449 links. Switched 167 |url-status=dead to live.

Conversion (referees):

http://www.soccerbase.com/refs2.sd?refid=1042
http://www.soccerbase.com/referees/referee.sd?referee_id=1042

61 pages

 Done - Checked 60 pages and edited 60 pages. Converted 65 links. Removed 1 {{dead link}} templates. Added 2 archive URLs (0 Wayback).

Conversion (teams):

http://www.soccerbase.com/teams2.sd?teamid=2493
https://www.soccerbase.com/teams/team.sd?team_id=2493

86 pages

 Done - Checked 86 pages and edited 86 pages. Converted 95 links. Switched 10 |url-status=dead to live. Added 7 archive URLs (0 Wayback).

-- Green C 16:11, 14 April 2024 (UTC) reply
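The four soccerbase mappings above fit a small lookup table. This is a sketch only: it emits https for every type, whereas the examples above show http for managers and referees, so that choice is an assumption. `RULES` and `convert_soccerbase` are hypothetical names:

```python
from urllib.parse import urlsplit, parse_qs

# old path -> (old query key, new path, new query key)
RULES = {
    "/players_details.sd": ("playerid", "/players/player.sd", "player_id"),
    "/managers2.sd": ("managerid", "/managers/manager.sd", "manager_id"),
    "/refs2.sd": ("refid", "/referees/referee.sd", "referee_id"),
    "/teams2.sd": ("teamid", "/teams/team.sd", "team_id"),
}

def convert_soccerbase(url):
    """Return the new-style URL, or None if no rule applies."""
    parts = urlsplit(url)
    rule = RULES.get(parts.path)
    if rule is None:
        return None
    old_key, new_path, new_key = rule
    values = parse_qs(parts.query).get(old_key)
    if not values:
        return None
    return f"https://www.soccerbase.com{new_path}?{new_key}={values[0]}"
```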

boxingscene.com

Conversion:

https://www.boxingscene.com/%20/arum-fury-wilder-happen-even-2021-then-joshua-whyte--150822
https://www.boxingscene.com/arum-fury-wilder-happen-even-2021-then-joshua-whyte--150822

73 pages -- Green C 18:47, 6 April 2024 (UTC) reply

 Done - Checked 72 pages and edited 72 pages. Converted 93 links. Removed 7 {{dead link}} templates. Added 2 archive URLs (2 Wayback). -- Green C 16:57, 14 April 2024 (UTC) reply

nzfootball.co.nz

Conversion:

https://www.nzfootball.co.nz/newsarticle/77966?newsfeedId=569432
https://www.nzfootball.co.nz/newsarticle/77966

220 pages -- Green C 18:53, 6 April 2024 (UTC) reply

 Not done - nothing to do. Links have same content. -- Green C 17:02, 14 April 2024 (UTC) reply

wnbl.com.au

Conversion:

(old): http://wnbl.com.au/todhunter-re-signs-rangers/
(new): https://wnbl.basketball/blog/news/todhunter-re-signs-rangers/

(old) http://wnbl.com.au/bendigo-spirit-welcome-back-special-k/
(new) https://wnbl.basketball/blog/news/bendigo-spirit-welcome-back-special-k/

If the path does not contain "/", "?", "&" or "#", test the replacement URL at wnbl.basketball/blog/news

308 pages.

Conversion:

http://wnbl.com.au/bendigo_news/spirit-reaches-sky/
https://wnbl.basketball/bendigo/news/spirit-reaches-sky/

"/bendigo_news/" --> "/bendigo/news/"

 Done - Checked 310 pages and edited 185 pages. Converted 403 links. Removed 29 {{dead link}} templates. Added 2 {{dead link}}. Switched 29 |url-status=dead to live. Switched 1 |url-status=live to dead. Added 151 archive URLs (145 Wayback). -- Green C 15:14, 15 April 2024 (UTC) reply
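The two wnbl.com.au rules could be sketched as below. The function only proposes a candidate; per the request, the result still has to be tested against the live site. `wnbl_candidate` is a hypothetical name:

```python
from urllib.parse import urlsplit

def wnbl_candidate(url):
    """Return a candidate wnbl.basketball URL, or None if no rule applies."""
    parts = urlsplit(url)
    if parts.query or parts.fragment or "&" in parts.path:
        return None
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) == 1:
        # Single-slug paths are tried under /blog/news/
        return f"https://wnbl.basketball/blog/news/{segments[0]}/"
    if len(segments) == 2 and segments[0] == "bendigo_news":
        # "/bendigo_news/" --> "/bendigo/news/"
        return f"https://wnbl.basketball/bendigo/news/{segments[1]}/"
    return None
```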

unpo.org

Conversion:

http://www.unpo.org/news_detail.php?arg=11&par=3886
https://unpo.org/article/3886

6 pages -- Green C 21:51, 6 April 2024 (UTC) reply

 Done (manually) -- Green C 01:35, 8 April 2024 (UTC) reply

nonleaguescotland.org.uk

Conversion:

http://nonleaguescotland.org.uk/nairncounty.htm
http://nonleaguescotland.org.uk/nairncounty.html

94 pages

-- Green C 22:48, 6 April 2024 (UTC) reply

 Done - Checked 96 pages and edited 82 pages. Converted 591 links. Added 1 {{dead link}}. Switched 95 |url-status=dead to live. Added 30 archive URLs (30 Wayback). -- Green C 17:16, 15 April 2024 (UTC) reply

mediapost.com

Conversion:

http://www.mediapost.com/publications/index.cfm?fa=Articles.showArticle&art_aid=80921
https://www.mediapost.com/publications/article/80921/

29 pages

-- Green C 03:32, 7 April 2024 (UTC) reply

 Done - Checked 22 pages and edited 21 pages. Converted 20 links. Removed 3 {{dead link}} templates. Switched 15 |url-status=dead to live. -- Green C 19:18, 15 April 2024 (UTC) reply

thehill.com

Convert from http to https. Some http are 404 but https version is 200.

3,464 pages

-- Green C 03:42, 7 April 2024 (UTC) reply

 Done - Checked 3,465 pages and edited 3,344 pages. Converted 7,679 links. Removed 1 {{dead link}} templates. Added 9 {{dead link}}. Switched 105 |url-status=dead to live. Switched 27 |url-status=live to dead. Added 347 archive URLs (254 Wayback). -- Green C 14:24, 16 April 2024 (UTC) reply

rugbyleagueproject.org

Conversion:

http://www.rugbyleagueproject.org/competitions/NSWRL_1945.html
http://www.rugbyleagueproject.org/seasons/NSWRFL_1945/summary.html

26 pages

-- Green C 14:53, 7 April 2024 (UTC) reply

 Done - Checked 23 pages and edited 24 pages. Converted 20 links. Removed 2 {{dead link}} templates. Switched 1 |url-status=dead to live. -- Green C 15:45, 16 April 2024 (UTC) reply

projects.militarytimes.com

Conversion:

http://projects.militarytimes.com/citations-medals-awards/recipient.php?recipientid=1068
https://valor.militarytimes.com/hero/1068

570 pages

-- Green C 14:59, 7 April 2024 (UTC) reply

 Done - Checked 570 pages and edited 568 pages. Converted 647 links. Removed 3 {{dead link}} templates. Switched 449 |url-status=dead to live. Added 5 archive URLs (0 Wayback). -- Green C 16:50, 16 April 2024 (UTC) reply

bundesliga.com

Conversion:

https://www.bundesliga.com/en/bundesliga/news/noblsp-dfb-cup-final-live-blog-bayern-muenchen-borussia-dortmund.jsp
https://www.bundesliga.com/en/news/Bundesliga/noblsp-dfb-cup-final-live-blog-bayern-muenchen-borussia-dortmund.jsp

501 pages

-- Green C 15:12, 7 April 2024 (UTC) reply

 Done - Checked 515 pages and edited 116 pages. Converted 136 links. Removed 3 {{dead link}} templates. Added 0 {{dead link}}. Switched 7 |url-status=dead to live. Switched 0 |url-status=live to dead. Added 7 archive URLs (2 Wayback). -- Green C 21:21, 16 April 2024 (UTC) reply

plus.lesoir.be

Conversion:

http://plus.lesoir.be/90745/article/2017-04-20/agression-de-deux-policiers-schaerbeek-hicham-diop-sera-juge-en-correctionnelle
https://www.lesoir.be/90745/article/2017-04-20/agression-de-deux-policiers-schaerbeek-hicham-diop-sera-juge-en-correctionnelle

108 pages

-- Green C 15:50, 7 April 2024 (UTC) reply

 Done - Checked 107 pages and edited 106 pages. Converted 119 links. Removed 1 {{dead link}} templates. Switched 3 |url-status=dead to live. -- Green C 01:28, 17 April 2024 (UTC) reply

247sports.com

Conversion:

https://247sports.com/nfl/dallas-cowboys/Bolt/The-Dallas-Cowboys-2018-regular-season-schedule-117463461
https://247sports.com/nfl/dallas-cowboys/Article/Dallas-Cowboys-2018-regular-season-schedule-released-117463461

Follow redirects.

5,011 pages

-- Green C 16:19, 7 April 2024 (UTC) reply

ytfc.net

Conversion:

http://www.ytfc.net/news/article/2016-17/hedges-loan-cut-short-3545577.aspx
https://www.ytfc.net/hedges-loan-cut-short/

71 pages

-- Green C 17:09, 7 April 2024 (UTC) reply

uslpdl.com

Conversions:

http://www.uslpdl.com/news_article/show/759968?referrer_id=2313812
https://www.uslleaguetwo.com/news_article/show/759968

47 pages

-- Green C 20:24, 7 April 2024 (UTC) reply

geoelections.free.fr

Conversion:

http://geoelections.free.fr/USA/elec_comtes/1892bidw
http://geoelections.free.fr/USA/elec_comtes/1892bidw.htm

516 pages (of which 390 already have .htm)

-- Green C 20:29, 7 April 2024 (UTC) reply
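Since 390 pages already have .htm, the conversion needs to skip those. A minimal sketch, assuming any last path element that already contains a "." should be left alone (`add_htm` is a hypothetical name):

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

def add_htm(url: str) -> str:
    """Append .htm when the last path element has no extension."""
    parts = urlsplit(url)
    last = posixpath.basename(parts.path)
    if not last or "." in last:
        return url  # directory URL, or an extension is already present
    return urlunsplit(parts._replace(path=parts.path + ".htm"))
```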

m.pitchfork.com

Conversion:

http://m.pitchfork.com/news/63742-kanye-west-says-new-album-coming-this-summer/
https://pitchfork.com/news/63742-kanye-west-says-new-album-coming-this-summer/

33 pages

-- Green C 21:21, 7 April 2024 (UTC) reply

beta.latimes.com

Convert to www.latimes.com

96 pages

-- Green C 23:30, 7 April 2024 (UTC) reply

sundayobserver.lk

Conversion:

http://www.sundayobserver.lk/2009/07/05/mag04.asp
http://archives.sundayobserver.lk/2009/07/05/mag04.asp

1,572 pages

-- Green C 23:35, 7 April 2024 (UTC) reply

nation.com.pk

Conversion:

http://www.nation.com.pk/pakistan-news-newspaper-daily-english-online/Regional/Karachi/31-Dec-2009/Karachi-blast-mastermind-was-arrested-10-days-before-Ashura
https://www.nation.com.pk/31-Dec-2009/karachi-blast-mastermind-was-arrested-10-days-before-ashura

412 pages

-- Green C 23:38, 7 April 2024 (UTC) reply

goldbook.iupac.org

Conversion:

http://goldbook.iupac.org/goldbook/A00446.html
https://goldbook.iupac.org/terms/view/A00446

22 pages

-- Green C 00:42, 8 April 2024 (UTC) reply

timesofindia.com

Redirects to timesofindia.indiatimes.com. The site needs general work for 404s, soft-404s, https, conversion of m.timesofindia.com and so on.

7,149 pages

-- Green C 00:51, 8 April 2024 (UTC) reply

economictimes.com

Same as above..

https://www.economictimes.com/news/politics-and-nation/dilip-ghosh-makes-u-turn-says-not-in-favour-of-division-of-bengal/amp_articleshow/85587719.cms
https://economictimes.indiatimes.com/news/politics-and-nation/dilip-ghosh-makes-u-turn-says-not-in-favour-of-division-of-bengal/articleshow/85587719.cms

1,230 pages

-- Green C 21:49, 8 April 2024 (UTC) reply

rugby15.co.za

Conversion:

http://www.rugby15.co.za/2015/07/steval-pumas-announce-new-contracts/
https://www.rugby15.co.za/steval-pumas-announce-new-contracts/

121 pages

-- Green C 01:01, 8 April 2024 (UTC) reply
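Dropping the date prefix from the path might be done as below. The rugby15 example only has a year/month prefix; treating a day component as optional is an assumption added so the same pattern covers year/month/day paths. `DATE_PREFIX` and `convert_rugby15` are hypothetical names:

```python
import re
from urllib.parse import urlsplit

# /YYYY/MM/ or /YYYY/MM/DD/ at the start of the path, followed by a slug
DATE_PREFIX = re.compile(r"^/\d{4}/\d{2}(?:/\d{2})?/(?=[^/])")

def convert_rugby15(url: str) -> str:
    """Strip the leading date directories and switch to https."""
    parts = urlsplit(url)
    new_path = DATE_PREFIX.sub("/", parts.path)
    return f"https://{parts.netloc}{new_path}"
```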

sportskindle.com

https://www.sportskindle.com/2020/10/14/neufc-kwesi-appiah-signs-contract/
http://sportskindle.com/neufc-kwesi-appiah-signs-contract/

6 pages

-- Green C 22:28, 8 April 2024 (UTC) reply

ssl.ofdb.de

Conversion:

https://ssl.ofdb.de/film/192915
https://www.ofdb.de/film/192915

-- Green C 23:18, 8 April 2024 (UTC) reply

in.rbth.com

Conversion:

https://in.rbth.com/articles/2011/08/22/brahmos_sets_the_gold_standard_for_russian-indian_defence_projects_12899
https://www.rbth.com/articles/2011/08/22/brahmos_sets_the_gold_standard_for_russian-indian_defence_projects_12899

88 pages

-- Green C 00:22, 9 April 2024 (UTC) reply

beta.nydailynews.com

Conversion:

http://beta.nydailynews.com/news/politics/nys-reform-party-executive-committee-split-gov-candidate-article-1.3948595
https://www.nydailynews.com/news/politics/nys-reform-party-executive-committee-split-gov-candidate-article-1.3948595

17 pages -- Green C 00:27, 9 April 2024 (UTC) reply

www.yfmghana.com

Conversion:

https://www.yfmghana.com/2018/07/26/full-list-of-winners-jd-nightlife-awards-2018/
https://yfmghana.com/full-list-of-winners-jd-nightlife-awards-2018/

13 pages

-- Green C 00:32, 9 April 2024 (UTC) reply

FABLE0424

Test run of the WP:FABLE system. Permanently dead links have been identified by FABLE as having moved to a different URL. Changes manually verified beforehand. Changes committed to wiki by WP:WAYBACKMEDIC. Please report errors. -- Green C 14:49, 10 April 2024 (UTC) reply

There was an error in about 42 pages. They are reverted. If you find any not reverted please let me know. -- Green C 19:58, 10 April 2024 (UTC) reply

Network World

https://www.networkworld.com/article/2881467/application-security/secure-islands-protects-files-with-embedded-classification-encryption-and-usage-rights.html is dead. https://www.networkworld.com/article/2881467/secure-islands-protects-files-with-embedded-classification-encryption-and-usage-rights.html (deleting "/application-security") works and redirects to the new URL. Probably others of this format. * Pppery * it has begun... 15:13, 11 April 2024 (UTC) reply

It looks like when there is something between the number and the last path element, this can signify a problem, for example: http://www.networkworld.com/article/2220304/opensource-subnet/say-what--gnu-emacs-violates-the-gpl.html --> https://www.networkworld.com/article/2220304/opensource-subnet-say-what--gnu-emacs-violates-the-gpl.html .. in this case /opensource-subnet/ is made part of the last path element, in other cases it is deleted entirely. I can check for it.
385 pages. -- Green C 19:23, 13 April 2024 (UTC) reply
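The two observed outcomes (middle element deleted, or merged into the last element with a hyphen) can be generated as candidates and then checked against the live site. A sketch under that assumption; `networkworld_candidates` is a hypothetical name:

```python
import re

NW = re.compile(
    r"^https?://www\.networkworld\.com/article/(\d+)/([^/]+)/([^/]+)$"
)

def networkworld_candidates(url):
    """Return candidate URLs when something sits between the ID and the slug."""
    m = NW.match(url)
    if m is None:
        return []
    article_id, middle, last = m.groups()
    base = f"https://www.networkworld.com/article/{article_id}/"
    # First candidate drops the middle element; second joins it to the slug.
    return [base + last, base + middle + "-" + last]
```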

RFID journal

A potentially nasty usurped URL case I found: http://www.rfidjournal.com/article/articleview/9632/1/1 currently points to https://www.rfidjournal.com/gs1-releases-guidelines-for-rfid-based-electronic-article-surveillance, an article about GS1 guides, however per the Wayback Machine it previously pointed to https://web.archive.org/web/20180711120951/http://www.rfidjournal.com/articles/view?9632, an article about ScholarChip, which was the intended citation. Not sure there's anything that can be done about it here, but noting it for the record. * Pppery * it has begun... 15:55, 11 April 2024 (UTC) reply

What I can do is process the entire domain, log the source and redirect links, and look for patterns of repeating redirects. Sometimes that will surface soft404s like this. BTW they have some tight rate limiting as a freemium method, not sure how my bot will perform. -- Green C 19:02, 13 April 2024 (UTC) reply
96 pages -- Green C 19:28, 13 April 2024 (UTC) reply
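As a rough illustration of looking for repeating redirects: given a log of source URL to final URL, targets reached from many different sources are candidates for soft-404 review. The threshold value and the name `suspicious_targets` are hypothetical, and this is not the bot's actual logic:

```python
from collections import Counter

def suspicious_targets(redirects, threshold=3):
    """Return {target: count} for targets reached from >= threshold sources.

    `redirects` maps each source URL to the URL it finally resolved to.
    """
    counts = Counter(redirects.values())
    return {target: n for target, n in counts.items() if n >= threshold}
```

A high count does not prove a soft 404, so flagged targets would still need a manual look.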

juf.org

Many links might be soft-404 redirects to the home page. -- Green C 18:47, 13 April 2024 (UTC) reply

The site was apparently revamped, and many old links, even those published as recently as 2022 are no longer available. -- Kailash29792 (talk) 13:37, 17 April 2024 (UTC) reply