This page is an
archive. Do not edit the contents of this page. Please direct any additional comments to the
current main page.
whitehouse.gov
A lot of
whitehouse.gov links have died after the domain recently "changed owner". A rare occasion where many Wikipedians may be glad for sources dying. There is an archive at
https://trumpwhitehouse.archives.gov. Example of old broken and new working url:
There is a slim chance/risk that some of the broken links will work again in about four years. Some whitehouse.gov links are working and should not be changed. Can a bot sort it out?
PrimeHunter (
talk) 13:09, 25 February 2021 (UTC)
A bot could test every whitehouse.gov link to see whether it works now or at any of the archives.
PrimeHunter (
talk) 14:02, 25 February 2021 (UTC)
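A minimal sketch of that test, not the bot's actual code: try the original URL first, then the same path on each archive host. The `is_live` callback, the function names, and the `obamawhitehouse.archives.gov` entry are assumptions for illustration; only `trumpwhitehouse.archives.gov` is named in this thread.

```python
from typing import Callable, List, Optional
from urllib.parse import urlsplit, urlunsplit

# Archive hosts to try, in order. The first is named in the thread;
# the second is an assumption added for illustration.
ARCHIVE_HOSTS = ["trumpwhitehouse.archives.gov", "obamawhitehouse.archives.gov"]


def candidate_urls(url: str) -> List[str]:
    """Return the original URL followed by the same path on each archive host."""
    parts = urlsplit(url)
    candidates = [url]
    for host in ARCHIVE_HOSTS:
        candidates.append(urlunsplit(("https", host, parts.path, parts.query, "")))
    return candidates


def first_working(url: str, is_live: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate that is_live accepts, or None if none works."""
    for candidate in candidate_urls(url):
        if is_live(candidate):
            return candidate
    return None
```

In a real run `is_live` would make an HTTP request and inspect the response; it is passed in here so the selection logic can be tried without network access.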
OK, based on your research, I agree it's worth exploring to see how well it works. Will take a look. --
GreenC 14:25, 25 February 2021 (UTC)
Results: modified 8,263 URLs in 5,060 articles. Changed metadata such as |work=whitehouse.gov, plus other general fixes by WaybackMedic. As a matter of curiosity, 67% were found by the scanning method described above and the rest had working redirects in the header. Most of the working redirects were Obama-era; Trump-era links had a high proportion of 404s and no redirects, perhaps because they were poorly maintained and/or it was too soon after leaving office. Also, some pages (10%?) can't be archived by any web archive service: something in the page prevents archiving by third parties, but they still work at the National Archives. @
PrimeHunter: --
GreenC 16:46, 3 March 2021 (UTC)
@
GreenC: Thanks. That's a nice low number. I have fixed many of them by guessing or Googling, without finding a system. Some were clearly our own fault, with URLs that never would have worked. Should I remove the fixed ones from
Wikipedia:Link rot/cases/whitehouse.gov?
PrimeHunter (
talk) 02:21, 4 March 2021 (UTC)
Yes, about 0.5% of the whitehouse.gov URLs are explained by local data entry or remote site errors, which is probably better than one might expect. It's a good idea to check for them, and great that you were able to fix some. Use the page any way you like: mark up or delete entries. --
GreenC 03:12, 4 March 2021 (UTC)
Replace atimes.com links
Please replace all instances of atimes.com and its subdomains with asiatimes.com. The old website is replaced by an advertising site. ~
Ase1estecharge-paritytime 10:11, 28 February 2021 (UTC)
Also, if the corresponding page with the new domain is not found, not archived, and there is an archive with the old domain, then do not replace the URL, but add the archive link and mark the URL status as unfit. Thanks. ~
Ase1estecharge-paritytime 10:26, 28 February 2021 (UTC)
OK. It might take a couple of passes: first to move the domain where possible, and second to add the archives+unfit for the remainder. I'm still working on whitehouse.gov above, so it could be a few days at least. --
GreenC 15:46, 28 February 2021 (UTC)
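The request above can be read as a small decision rule. The sketch below is one interpretation, not the bot's actual logic: the function names and action labels are hypothetical, and the case where nothing is live or archived anywhere is not covered in the thread, so it is reported as unspecified.

```python
from urllib.parse import urlsplit, urlunsplit


def to_new_domain(url: str) -> str:
    """Swap atimes.com (or any subdomain of it) for asiatimes.com, keeping the path."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == "atimes.com" or host.endswith(".atimes.com"):
        return urlunsplit((parts.scheme, "asiatimes.com", parts.path, parts.query, parts.fragment))
    return url  # not an atimes.com link; leave untouched


def decide(new_live: bool, new_archived: bool, old_archived: bool) -> str:
    """Pick an action label (hypothetical names) for one atimes.com citation."""
    if new_live or new_archived:
        # The page exists, or is archived, under the new domain: move the link.
        return "replace-domain"
    if old_archived:
        # Only the old domain has an archive: keep the URL, add the archive,
        # and set |url-status=unfit.
        return "keep-url-add-archive-unfit"
    return "unspecified"  # the thread does not say what to do here
```

Matching on `.atimes.com` with the leading dot keeps `asiatimes.com` itself from being treated as a subdomain of the old site.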
I found
many broken links to www.observer.com: some (but not all) of these links no longer lead to the articles that were originally cited.
Jarble (
talk) 21:04, 13 February 2021 (UTC)
Since this is a mix of live and dead, it's probably better to leave it for IABot, which should be able to detect the dead ones. --
GreenC 03:19, 14 February 2021 (UTC)
@
GreenC: IABot won't detect them. I tried running IABot on
this page, but the link is still incorrect.
Jarble (
talk) 21:35, 11 March 2021 (UTC)
IABot won't work. It's pretty complex. First impression is anything "https" is OK. Anything "http" without a hostname is also OK. That narrows it down to about a
thousand possible trouble URLs. Of these, some work and some don't. Some are also redirecting to spam links needing |url-status=unfit. There are patterns, but also exceptions. I might need to make a dry run, log what it does, build rules to take into account the mistakes, then make a live run. Hard to say up front what the rules should be. It will take some time to figure out; there are a lot of variables. --
GreenC 01:45, 12 March 2021 (UTC)
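The first-impression triage above could be sketched as a filter that sets aside the links assumed fine and logs the rest for a dry run. This is only an illustration: the function names are hypothetical, and "without a hostname" is read here as the bare `observer.com` host with no `www.` prefix, which is an assumption about what was meant.

```python
from typing import List
from urllib.parse import urlsplit


def needs_check(url: str) -> bool:
    """Return True if the URL falls outside the two 'OK' groups and needs testing."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return False  # first impression: anything https is OK
    host = (parts.hostname or "").lower()
    if host == "observer.com":
        # Assumed reading of "http without a hostname": no www. prefix.
        return False
    return True


def dry_run(urls: List[str]) -> List[str]:
    """Log, without editing anything, the URLs that would need a closer look."""
    flagged = [u for u in urls if needs_check(u)]
    for u in flagged:
        print("would check:", u)
    return flagged
```

The flagged list is where the per-pattern rules and exceptions would then be worked out before any live run.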
The rest were already archived or still working or now tagged with {{
dead link}}. Once the soft404 redirects were identified it was not too difficult. If you see any problems let me know. @
Jarble: --
GreenC 21:39, 13 March 2021 (UTC)