This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the
current talk page.
Presumably you mean parameter values. And Martin's special case would be |author-sep=,, where the comma is the value, not some trailing cruft. ♦
J. Johnson (JJ) (
talk) 20:32, 2 November 2018 (UTC)
cs1|2 does not have or support |author-sep=, |author-name-separator=, or |separator=.
zenodo.org has been blacklisted on English Wikipedia for copyright reasons (at least for now). Please disable the addition of it (and allow other edits to be made; the bot currently fails on
Radon). See
Special:PermanentLink/867438103#zenodo.org. (Courtesy ping
Nemo bis and
JzG) (
t)
Josve05a (
c) 21:42, 5 November 2018 (UTC)
and yet researchgate is cool.
AManWithNoPlan (
talk) 22:20, 5 November 2018 (UTC)
i am glad we currently clean up those urls, so we convert pdf links to landing pages.
AManWithNoPlan (
talk) 22:22, 5 November 2018 (UTC)
Well, even cleaning up https://zenodo.org/record/1000677/files/article.pdf to https://zenodo.org/record/1000677 is blacklisted (/me mumbles something angrily) (
t)
Josve05a (
c) 22:34, 5 November 2018 (UTC)
Yes. A friend cannot upload her papers to ResearchGate but can upload them to Zenodo. I think that may be telling. Guy (
Help!) 23:40, 5 November 2018 (UTC)
Blocked and {{fixed}}. Also, a second pull is in place to turn it back off, if that is possible. You are correct, it is one thing to violate your own papers' copyright; but it is another thing to violate everyone's papers copyrights.
AManWithNoPlan (
talk) 18:25, 6 November 2018 (UTC)
Did you verify that there are not whites that got changed? i cannot look right now.
AManWithNoPlan (
talk) 15:44, 13 October 2018 (UTC)
I'm not sure what 'changing whites' would be here, but it did something similar in the previous edit
[3], where normally it removes publisher in cite journals. Headbomb {
t ·
c ·
p ·
b} 18:14, 13 October 2018 (UTC)
If it does, then that's another bug. It shouldn't remove the publisher parameter in cite journal templates unless the publisher value would be the same as the journal value. (And, actually, for optimal meta data it shouldn't even remove it then for as long as it is correct, so that both meta data entries journal and publisher can be populated. Instead, seemingly duplicate values should be detected in the cite template and one of the values suppressed in the output, but not in meta data.)
i got auto corrected. whitespaces not whites.
AManWithNoPlan (
talk) 22:37, 13 October 2018 (UTC)
it drops publisher then google books adds it back
AManWithNoPlan (
talk) 03:55, 14 October 2018 (UTC)
I just saw it remove a publisher from a "journal" that is really a newsletter whose publisher should not have been removed:
Special:Diff/866664956. For major well-established academic journals, removal of publisher may be a good thing, but blindly doing it to all journal citations is not. Citation bot absolutely should not be making this kind of decision, and should not even be suggesting it to human editors (as they too-often fail to exercise any judgement of their own). —
David Eppstein (
talk) 21:02, 31 October 2018 (UTC)
{{cite magazine}} seems like the proper template is what i am hearing from you.
AManWithNoPlan (
talk) 21:32, 31 October 2018 (UTC)
If
running the bot again, it removes it. Perhaps a "run bot multiple times, until no changes are attempted, before saving the edit" rule should be implemented. (
t)
Josve05a (
c) 22:01, 8 November 2018 (UTC)
That is a horrible idea. Although I too have considered it.
AManWithNoPlan (
talk) 23:06, 8 November 2018 (UTC)
Yeah (hence it being in small). I can just imagine the bot edit warring with itself back-and-forth... however, it logically feels as if "all possible edits should be made" before saving the change. (
t)
Josve05a (
c) 23:09, 8 November 2018 (UTC)
And the bot gets banned from database access for repeats and edits take two to three times longer.....
AManWithNoPlan (
talk) 23:11, 8 November 2018 (UTC)
And during periods of high use, big edits fail since the bot is too busy double checking itself....
AManWithNoPlan (
talk) 23:12, 8 November 2018 (UTC)
A (short) time-out for "second round" could be added, or only do it for "small" articles (i.e. if running it manually on a short section), or only run twice if there is not high-use (if that could be "tracked"). Not advocating this be implemented here, though. The issue at hand can (hopefully) be patched this time. Just a thought.(
t)
Josve05a (
c) 23:15, 8 November 2018 (UTC)
|journal=Bjpsych International |journal=Ieee Transactions on Computers |journal=Papers from the Workship Within the Framework of the XIII International Congress of Celtic Studies
What should happen
|journal=BJPsych International |journal=IEEE Transactions on Computers |journal=Papers from the Workship within the Framework of the XIII International Congress of Celtic Studies
OECD should always be capitalized. I've seen it both in |last1= and |publisher=. <ref>https://dx.doi.org/10.1787/9789264239012-en</ref> adds |last1=Oecd(
t)
Josve05a (
c) 10:47, 12 November 2018 (UTC)
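The capitalisation cases reported above (BJPsych, IEEE, OECD, and a lower-case "within" mid-title) could be handled with a small exception table. The following is only a sketch in Python, not the bot's actual PHP code; `CAPITALISATION_EXCEPTIONS` and `fix_capitalisation` are hypothetical names, and the table holds just the examples from this thread.

```python
import re

# Hypothetical exception table: lower-cased word -> preferred spelling.
# Only the examples reported in this thread are listed.
CAPITALISATION_EXCEPTIONS = {
    "bjpsych": "BJPsych",
    "ieee": "IEEE",
    "oecd": "OECD",
    "within": "within",
}


def fix_capitalisation(value: str) -> str:
    """Apply the exception table word by word, leaving other words alone."""

    def replace(match: re.Match) -> str:
        word = match.group(0)
        fixed = CAPITALISATION_EXCEPTIONS.get(word.lower(), word)
        # Do not lower-case the very first word of a title.
        if match.start() == 0 and fixed.islower():
            return word
        return fixed

    return re.sub(r"[A-Za-z]+", replace, value)
```

A table like this would need to be maintained by hand, which is presumably why specific journals have to be submitted individually.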
It has been that way for about a day. Sometimes you will get lucky and get a 500 error instead of timeout.
AManWithNoPlan (
talk) 16:56, 13 November 2018 (UTC)
Yup. Either, this is very annoying. Headbomb {
t ·
c ·
p ·
b} 17:04, 13 November 2018 (UTC)
Although frustrating, these very slow runs do often perform the requested edits even if they never return to display a result.
Lithopsian (
talk) 20:19, 13 November 2018 (UTC)
This could be generalized to anything that differs only by a leading 'the'. Headbomb {
t ·
c ·
p ·
b} 16:25, 14 November 2018 (UTC)
Good idea The Headbomb. I might want to create a case-insensitive str_is_basically_the_same() function.
AManWithNoPlan (
talk) 16:32, 14 November 2018 (UTC)
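A comparison of the kind suggested here might look like the sketch below. It is written in Python for illustration only; the function name is borrowed from the comment above, but the body is an assumption about what "basically the same" would mean (ignore case, extra whitespace, and one leading "the").

```python
def str_is_basically_the_same(a: str, b: str) -> bool:
    """Compare two names, ignoring case, extra spaces and a leading 'the'."""

    def normalise(text: str) -> str:
        # Collapse runs of whitespace and compare case-insensitively.
        text = " ".join(text.split()).casefold()
        if text.startswith("the "):
            text = text[len("the "):]
        return text

    return normalise(a) == normalise(b)
```

For example, "The Lancet" and "lancet" would compare equal, while "Nature" and "Nature Physics" would not.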
|doi=10.14288/1.0071732 and |url=
https://doi.library.ubc.ca/10.14288/1.0071732 both link to the same place. The URL has a recognized doi in its path and should be removed. We should not add such links.
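One way to spot a URL that merely repeats the DOI is to compare the URL path with the |doi= value. This is a rough Python sketch, not the bot's code; `url_duplicates_doi` is a hypothetical helper, and it assumes the DOI appears as the last part of the path.

```python
from urllib.parse import unquote, urlsplit


def url_duplicates_doi(url: str, doi: str) -> bool:
    """Return True if the URL path is, or ends with, the given DOI."""
    doi = doi.strip().casefold()
    if not doi:
        return False
    path = unquote(urlsplit(url).path).strip("/").casefold()
    return path == doi or path.endswith("/" + doi)
```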
Removes |publisher=Google for citations to Google Maps
Relevant diffs/links
Don't
We can't proceed until
Feedback from maintainers
* {{Citation | publisher = Google | url = https://maps.google.com/maps/ms?ie=UTF8&hl=en&msa=0&msid=210554752554258740073.00045675b996d14eb6c3a&ll=6.839971,28.205177&spn=170.959424,24.609375&z=1 | type = map (non-exhaustive) | title = Participatory budgeting initiatives around the world}}.
Don't add (identical) |series=Handbook of Development Economics if |title=Handbook of Development Economics already exists (without somehow removing one of them)
Obviously the type of template gets changed more than once as the bot does its thing. I think we can fix.
AManWithNoPlan (
talk) 20:50, 15 November 2018 (UTC)
Also added |journal= to a {{
cite book}}. That sounds odd...(
t)
Josve05a (
c) 21:00, 15 November 2018 (UTC)
The full message is "Alter: isbn. Removed accessdate with no specified URL". It covers both, but admittedly the amount of text changed appears to be inversely proportional to the length of the message text.
AManWithNoPlan (
talk) 22:22, 17 November 2018 (UTC)
What should be done with doi:10013/epic.10107.d001 and hdl:10013/epic.10107.d001? The rendered citation shows: {{cite journal}}: Check |doi= value (help); Cite journal requires |journal= (help); Missing or empty |title= (help). The links work, but they are not in an allowed format, and the bot does not expand from them. (
t)
Josve05a (
c) 21:15, 19 November 2018 (UTC)
Probably nothing? Let users fix the errors themselves? Headbomb {
t ·
c ·
p ·
b} 21:51, 19 November 2018 (UTC)
The doi link works since dx.doi.org will resolve non-doi hdl.
AManWithNoPlan (
talk) 22:55, 19 November 2018 (UTC)
Maybe so, but it's still not a valid DOI. Headbomb {
t ·
c ·
p ·
b} 23:02, 19 November 2018 (UTC)
What should be done is that the DOI should be fixed by a human to conform with the DOI specifications. DOI.org is under no obligation to support non-conforming DOI values, and they could remove their de facto support at any time. –
Jonesey95 (
talk) 05:42, 20 November 2018 (UTC)
Request: clean up google search so-called references
While I'm not sure why a citation to Google Search should ever appear in an article, they do quite a lot. It would be good if the bot would remove unnecessary parameters for such URLs as well, as it does with Google Books.
On VERY rare occasions they are valid (example: the term xyz is more popular/common than zyx on the Internet). Almost all the time, it would be more honest to just say <ref>Look it up yourself loser</ref>
AManWithNoPlan (
talk) 20:01, 7 October 2018 (UTC)
While I don't disagree with you (at all), I still feel we (read: the bot) should act as if they are all valid, and clean them, and hope that someone else comes along and finds (any) better references. (
t)
Josve05a (
c) 20:05, 7 October 2018 (UTC)
aqs=chrome..69i57j69i59.14823j0j7 Assisted Query Stats - used for logging purposes only
sourceid=chrome Where the search originated from - used for logging purposes only
ie=UTF-8 input encoding; default is UTF-8
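Stripping the parameters listed above from a Google Search URL could be sketched as follows. This is Python for illustration and is not how the bot (which is PHP) does it; `clean_google_search_url` is a hypothetical name, and the set of droppable parameters contains only the three described in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Logging-only parameters, per the descriptions above.
LOGGING_ONLY = {"aqs", "sourceid"}


def clean_google_search_url(url: str) -> str:
    """Drop logging-only parameters and a default ie=UTF-8 from a search URL."""
    parts = urlsplit(url)
    if parts.hostname not in ("google.com", "www.google.com") or parts.path != "/search":
        return url  # not a Google Search URL: leave untouched
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key in LOGGING_ONLY:
            continue
        if key == "ie" and value.upper() == "UTF-8":
            continue  # UTF-8 is the default input encoding
        kept.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(kept)))
```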
This is in many cases incorrect, despite Crossref stating this. E.g., in
this edit it should be "Middle East Review of International Affairs, Vol. 20, No. 1, pp. 35-59". Perhaps |journal=SSRN Electronic Journal should be forbidden, since there seems to be a lot of misattribution to the real source.
Please see
[7]. It seems extremely improbable that the issue number and page number would both be 061102.
JRSpriggs (
talk) 20:59, 21 November 2018 (UTC)
{{notabug}} the bot generates perfect output and leaves user input fields alone.
AManWithNoPlan (
talk) 21:22, 21 November 2018 (UTC)
{{cite paper}} is an alias of {{cite journal}} and should be supported in the same ways as {{cite journal}} is (only let the template name stay the same)
The style guides are very clear on not including publishers for Journals. 99% of the time the pdf links to publisher pdfs do not work, and even when they do, they often do not last for long. Anyway, it adds nothing that the doi already provides.
AManWithNoPlan (
talk) 16:30, 25 November 2018 (UTC)
the correct publisher is tandy anyway. As usual, it was wrong.
AManWithNoPlan (
talk) 16:40, 25 November 2018 (UTC)
I do not think this is fixable, since the only way is to maintain a list of 10,000 magazines. Also, the templates are actually exactly the same.
AManWithNoPlan (
talk) 15:40, 24 November 2018 (UTC)
Why did it convert from cite web to cite journal? --
GreenC 15:47, 24 November 2018 (UTC)
They are not exactly the same. The rendering of |issue= and |number= differs, and you cannot set |title=none in cite magazine (there may be other differences). --
Izno (
talk) 18:41, 24 November 2018 (UTC)
That's news to me. I see that this is a fairly new change.
AManWithNoPlan (
talk) 19:09, 24 November 2018 (UTC)
For some odd reason, the bot keeps removing all information from citations about the publisher of the source and the location of the publisher. I have noticed it doing this for a while now and have had to keep cleaning up after it. I do not know if these removals are intentional or accidental, but I see no reason why the bot should be removing publishers from citations, considering that the publisher is a fairly essential piece of information about the source.
Relevant diffs/links
Recent examples of this include the bot's activity
here and
here. It has done it before, but I cannot find the other examples right away and would have to go looking for them.
We can't proceed until
Feedback from maintainers
All style guides reject including that information for journals. Also, it is often incorrect. The bot has been doing this for over a decade, so I am sure there are other examples.
AManWithNoPlan (
talk) 15:34, 27 November 2018 (UTC)
Flagging as {{notabug}} until debate is over and this is finalized once and for all.
AManWithNoPlan (
talk) 17:27, 27 November 2018 (UTC)
Is there an API or a way to "find this"? Or is it too much work? (
t)
Josve05a (
c) 13:10, 22 November 2018 (UTC)
Many users prefer direct links to PDF files rather than records (although librarians and website owners prefer links to HTML pages so that they can track the users more easily). That said, this repository attempts to provide the handle in its HTML metadata, but is misconfigured: <meta name="DC.identifier" content="http://hdl.handle.net11245/1.345005"> (slash missing). I suggest to warn the repository administrators.
Their records on BASE are also all broken, some OAI-PMH fixes are in order.
Nemo 14:47, 22 November 2018 (UTC)
the bot is uploading new data slowly. I can not get it to work at the moment.
We can't proceed until
Feedback from maintainers
Also not working for me. I asked it to check
The Bill, so far 25 minutes and it's done nothing.--
5 albert square (
talk) 13:43, December 2018 (UTC)
{{wontfix}} shared server and sadly when it gets slow people often just start trying again and again thus making it worse (similar to shooting someone because they are bleeding and hoping it will help)
AManWithNoPlan (
talk) 15:32, 7 December 2018 (UTC)
Would displaying an error message of some kind be possible here? Something like "<server> is at capacity, try again in <amount of time depending on server load>"? Headbomb {
t ·
c ·
p ·
b} 23:38, 7 December 2018 (UTC)
This is not a regression. The URL is added before the PMC is present. Will have to think about this. Perhaps moving the Open URL addition to the end would be best.
AManWithNoPlan (
talk) 16:46, 23 November 2018 (UTC)
Bot added "journal" parameter when "magazine" parameter was already present, creating a duplicate parameter error (since both are aliases of "work"). This is similar to the error which
renames parameters to create aliases of "work", but in this case new parameters are being created.
On November 19th, you removed two wikilinks from
Sackur–Tetrode equation. Both wikilinks seem to be useful; so I restored them. To me, the removal of the wikilinks indicates a bug.
81.153.242.15 (
talk) 15:41, 30 November 2018 (UTC)
removal of partial wikilinks is not a bug. you need to wikilink the entire journal name or it will be removed by the bot.
AManWithNoPlan (
talk) 16:27, 30 November 2018 (UTC)
it is interesting that Wiley cannot handle the doi either. plus signs are a horrible choice.
AManWithNoPlan (
talk) 19:06, 12 November 2018 (UTC)
Any way to get the cite template to encode the url better so Wiley can resolve it, or is this up to crossref/Wiley to fix? (
t)
Josve05a (
c) 22:56, 15 November 2018 (UTC)
waiting for bot to come alive to debug
AManWithNoPlan (
talk) 03:21, 13 November 2018 (UTC)
Not only converting existing doi's, but also
adding bad doi's :/ (
t)
Josve05a (
c) 22:53, 15 November 2018 (UTC)
That is no surprise.
AManWithNoPlan (
talk) 23:30, 15 November 2018 (UTC)
No, but still sad. A bit surprised though that it didn't add |doi-broken-date=, but I guess it tests if broken before parsing what to write. (
t)
Josve05a (
c) 23:47, 15 November 2018 (UTC)
when it gets url encoded, the space becomes a plus sign. When people start using doi with spaces and emojis it is going to suck
AManWithNoPlan (
talk) 00:02, 16 November 2018 (UTC)
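The space/plus confusion can be reproduced with any URL library. The snippet below uses Python's standard `urllib.parse` purely as an illustration (the bot itself is PHP); the DOI strings are made up, and the helper names are hypothetical.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus


def encode_for_path(doi: str) -> str:
    """Percent-encode for a URL path: '+' becomes %2B, space becomes %20."""
    return quote(doi, safe="")


def encode_for_form(doi: str) -> str:
    """Form-style encoding: a space becomes '+'."""
    return quote_plus(doi)


def decode_form(value: str) -> str:
    """Form-style decoding: every bare '+' is read back as a space."""
    return unquote_plus(value)


def decode_path(value: str) -> str:
    """Path-style decoding: a bare '+' stays a plus sign."""
    return unquote(value)
```

The two styles only round-trip if the same one is used on both ends; a literal plus that is sent unescaped and then decoded form-style comes back as a space, which is the failure described above.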
Ugggh! Horrible thoughts! Burn them before they end up in doi's! (
t)
Josve05a (
c) 11:38, 16 November 2018 (UTC)
When the actual website is Reuters.com, it should be the work (such as |newspaper=), but when Reuters is the author of an article on another website (such as theguardian/nytimes) it should be |agency=. In this case |agency=Reuters should be removed. Both |agency=Reuters and |newspaper=Reuters should not be present. (
t)
Josve05a (
c) 14:59, 24 November 2018 (UTC)
Same problem as with Associated Press
AManWithNoPlan (
talk) 17:54, 24 November 2018 (UTC)
In
the same edit, it did not add an extra parameter for the
Associated Press of Pakistan and for Agence France-Presse. All these agencies can often be called a couple of different names (e.g. AP, the Associated Press, or Associated Press), so that might be an issue. wumbolo^^^ 19:44, 24 November 2018 (UTC)
I have added to pull 1102 some code to make it less exact.
AManWithNoPlan (
talk) 23:16, 24 November 2018 (UTC)
{{notabug}} the accessdate is formatted wrong, not what we did. The page says {{tl:Use mdy dates}}
AManWithNoPlan (
talk) 16:47, 10 December 2018 (UTC)
Bot renames "publisher" parameter to "newspaper". However, "website" parameter is already present. This creates a duplicate parameter error since both "website" and "newspaper" are aliases of "work".
What should happen
Don't convert any parameter to any alias of "work" if any alias of "work" (e.g., journal, newspaper, magazine, periodical, website) is already present.
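The rule requested above can be stated as a small guard. This is a Python sketch under the assumption that a citation's parameters are available as a name-to-value mapping; `WORK_ALIASES` lists only the aliases named in this report plus "work" itself, and `can_rename_to` is a hypothetical helper, not part of the bot.

```python
# Aliases of "work" named in this report (the real template may have more).
WORK_ALIASES = {"work", "journal", "newspaper", "magazine", "periodical", "website"}


def can_rename_to(params: dict, old_name: str, new_name: str) -> bool:
    """Refuse a rename that would create a second alias of 'work'."""
    if new_name not in WORK_ALIASES:
        return True
    # Any *other* parameter that is already a work alias blocks the rename.
    return not any(
        name in WORK_ALIASES for name in params if name != old_name
    )
```

With `{"publisher": "Reuters", "website": "reuters.com"}`, renaming "publisher" to "newspaper" would be refused, which is the case reported here.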
Not sure how to tell if this is fixed, but if it was, it didn't work:
edit at 19:57, 29 November 2018, see citation with title beginning "USA cyclist Tejay van Garderen".
DferDaisy (
talk) 01:31, 30 November 2018 (UTC)
Go to its main page tools.wmflabs.org/citations/, Thorough mode = yes, Commit edits = yes, and insert "Nuuk" into the input box next to "Process page". Then hit "Process page" and the error will occur almost immediately.
Thank you for the report. This comes from arXiv data. We support about a dozen formats that they use. This helps us decode new ones (or in some cases detect and not decode).
AManWithNoPlan (
talk) 15:33, 21 November 2018 (UTC)
chapter= was added to Cite encyclopedia without removing title=, causing there to be one quoted version of the chapter name and one italicized version.
What should happen
Bot should not operate on a citation formatted in this way
|chapter= is not a documented parameter in {{cite encyclopedia}}. |title= is supposed to be used for the encyclopedia entry. The bot should probably not add chapter at all when title is present, and it definitely should not add chapter and leave title in place. –
Jonesey95 (
talk) 18:53, 8 December 2018 (UTC)
The bot's edit summary was also partially incorrect in this edit, in that it claimed to have "Removed parameters", but it did not do so. –
Jonesey95 (
talk) 18:54, 8 December 2018 (UTC)
user enters a worldcat page for the url parameter and Citationbot ignores it
What should happen
worldcat urls should be removed and replaced with the oclc parameter, the same as with pmids and dois that are in the equivalent urls entered and swapped by the bot. In the case below, it should replace
The doi link points to the exact same page and is not prone to breaking as publisher links are. Also, in this case the pdf file is actually free, which is very unusual for a publisher website.
AManWithNoPlan (
talk) 21:04, 15 December 2018 (UTC)
Should the bot remove the non-functional doi when it is the same as the jstor link with 10.2307 added in front of it?
AManWithNoPlan (
talk) 16:30, 18 November 2018 (UTC)
I prefer the second version only, or at least not displaying inactive doi's if other IDs exists. (
t)
Josve05a (
c) 21:16, 19 November 2018 (UTC)
Non-functional DOI links of the form 10.2307/<JSTORID> can be removed if they are broken. Working JSTOR dois, or JSTOR dois of a different form should be left alone. I believe JSTOR used to have internal redirects, but no longer do, so that's why we've got a bunch of crap 10.2307/<JSTORID> DOIs laying around. Headbomb {
t ·
c ·
p ·
b} 21:49, 19 November 2018 (UTC)
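The rule described here (drop only broken DOIs of the exact form 10.2307/<JSTORID>) might be sketched like this in Python. The function name is hypothetical, and whether the DOI resolves is passed in as a flag, since how the bot checks that is not shown in this thread.

```python
def should_drop_jstor_doi(doi: str, jstor_id: str, doi_resolves: bool) -> bool:
    """Drop a DOI only if it is broken and is just '10.2307/' + the JSTOR id."""
    if doi_resolves:
        return False  # working DOIs are left alone
    jstor_id = jstor_id.strip()
    if not jstor_id:
        return False  # nothing to compare against
    return doi.strip() == "10.2307/" + jstor_id
```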
Anecdotally, sometimes the works where the JSTOR ID doesn't correspond to a working DOI actually have another DOI from a publisher. I'm not sure if these DOIs were never issued or what.
Nemo 23:04, 20 November 2018 (UTC)
That is correct, some do not actually have the doi issued. Some have one from the publisher and one from jstor (and maybe one from researchgate and who knows who else).
AManWithNoPlan (
talk) 01:22, 21 November 2018 (UTC)
I will think about the solution since bbc (not bbc sports) is the publisher. Newspaper is one of the many work aliases.
AManWithNoPlan (
talk) 21:16, 8 December 2018 (UTC)
See also
this discussion. The use of |publisher=BBC Sport is a well-established norm and there is consensus for it. Nzd(talk) 08:46, 13 December 2018 (UTC)
Add a 'silent' mode. This would simplify the output to simply
--------------------------------------------------------------------------
[12:13:02] Processing page '[[2018 FFA Cup preliminary rounds]]' – [[edit]] – [[history]]
# No changes required.
when there are no changes made and
--------------------------------------------------------------------------
[12:13:02] Processing page '[[2018 FFA Cup preliminary rounds]]' – [[edit]] – [[history]]
# Updating the page ([[diff]]).
when there is a change made. This could probably be made 'default' for categories, with &silent=0 to disable it. Or alternatively, &verbose=1 to enable verbose logs. Headbomb {
t ·
c ·
p ·
b} 12:25, 21 August 2018 (UTC)
difficult to fix: pages that take a while to process will cause an HTTP disconnect.
AManWithNoPlan (
talk) 13:13, 31 October 2018 (UTC)
@
AManWithNoPlan: not sure what's that got to do with a simplified output in general? Headbomb {
t ·
c ·
p ·
b} 13:36, 31 October 2018 (UTC)
perhaps output dots as the bot runs. let me think about it.
AManWithNoPlan (
talk) 13:50, 31 October 2018 (UTC)
Way too many places in the code would need to be changed. Also likely to drop connection while running. {{wontfix}} AManWithNoPlan (
talk) 17:03, 22 December 2018 (UTC)
Removed/touched a parameter with a comment <!-- some readers have trouble with the link generated by the doi= field? -->, which should "block out" the bot from touching it. (
t)
Josve05a (
c) 09:30, 12 November 2018 (UTC)
Inappropriate capitalisation of foreign language titles - Spanish does not use title case, it uses first letter only capitalisation of titles.
What should happen
For Spanish titles, use first letter only capitalisation, i.e. where language=es (and potentially other languages), don't apply English-language capitalisation rules.
Headbomb {
t ·
c ·
p ·
b} 14:09, 21 December 2018 (UTC)
Just because the original source has one style does not mean we follow it. Thoughts?
AManWithNoPlan (
talk) 17:14, 21 December 2018 (UTC)
{{notabug}} Journal titles in many styles are capitalized. Not the bot's fault that the template was used wrong.
AManWithNoPlan (
talk) 16:56, 22 December 2018 (UTC)
API: add &via= option (also what does &edit= do?)
In a call like https://tools.wmflabs.org/citations/process_page.php?edit=toolbar&user=Headbomb&page=Steve_Bieda, does edit=toolbar do anything? Because I'd like to have some ways to tell the bot that it was triggered via {{Draft article}} or
citation expander, or similar.
We might want to rename the parameter to allow something like
Headbomb {
t ·
c ·
p ·
b} 03:43, 26 August 2018 (UTC)
I wonder what the audience of this additional message would be? To most users, what is important is the content and motivation of an edit, rather than the circumstances in which an editor came to make it. If I have a clear understanding of the motivation for this change, I'll be able to consider the best way to implement it.
Martin(
Smith609 –
Talk) 08:56, 27 August 2018 (UTC)
The goal is mostly to have a way to see where Citation bot is used from. How many of those edits were triggered by the web interface? How many were from user scripts and from which userscript, or how many from templates and which templates (and do any need updating)? How many were done via the Citation Expander gadget? It's not necessarily to have 'official' stats (it would be nice though), but knowing where the bot is used from is nice, and could let us give help to newbies that run into issues with the bot. Headbomb {
t ·
c ·
p ·
b} 10:32, 27 August 2018 (UTC)
For example,
[23] was most likely triggered from {{Draft article}}, present on
Draft:Lil ginger ale (we sadly can't feed who used the Template from the template because we don't have a {{CURRENTUSERNAME}} magicword/variable), but knowing it was triggered from the template means it has a fairly high chance of being used by a newbie, and was probably triggered by
one of these people. So that lets us (or at least me) customize feedback to people. If I see someone doing something weird/unusual with the bot from {{Draft article}} vs Web Interface vs Gadget vs User Scripts, well you more or less have a continuum of likely noob vs likely noob/intermediate vs likely intermediate vs likely advanced user dealing with the bot. And you'd have an idea of who could have triggered the bot in that scenario. Headbomb {
t ·
c ·
p ·
b} 10:45, 27 August 2018 (UTC)
If the bot tries to "reformat" a blacklisted link (e.g. https://zenodo.org/record/1223952/files/article.pdf to https://zenodo.org/record/1223952), the bot will not be able to save the edit. We should stop reformatting these URLs, in order to be able to edit such pages. Editing pages with existing links isn't stopped, but formatting them turns them into new links, which are blacklisted. (
t)
Josve05a (
c) 07:06, 28 December 2018 (UTC)
A better approach would be to find what causes this blacklisting, and see if edit filters can't be tweaked to let Citation Bot work around them. Headbomb {
t ·
c ·
p ·
b} 14:44, 28 December 2018 (UTC)
Given that there are multiple ever changing black lists that would be hard. Awesome, but hard.
AManWithNoPlan (
talk) 14:54, 28 December 2018 (UTC)
Make a
WP:BOTREQ and someone can take care of this with AWB. Headbomb {
t ·
c ·
p ·
b} 16:02, 29 December 2018 (UTC)
Yes please use BOTREQ for URL updates, but be careful using AWB it typically breaks archive URLs and/or doesn't undo previous archivals of the broken URL. --
GreenC 16:08, 29 December 2018 (UTC)
Example what is required. Job done. --
GreenC 18:00, 29 December 2018 (UTC)
one time focused tasks like this are not optimal for the citation bot.
AManWithNoPlan (
talk) 23:54, 29 December 2018 (UTC)
I meant, the job is done. It has been completed.
Special:Search/insource:"naldc.nal.usda.gov/naldc/download.xhtml" shows zero hits. I posted the diff to illustrate for
Headbomb what is required when modifying URLs - it's not a search-replace with AWB because that causes problems with archives. --
GreenC 00:16, 30 December 2018 (UTC)
Thank you for the fix and for the wayback "medication".
Nemo 00:24, 30 December 2018 (UTC)
Request: Shove "additional information" stuff after the pipe in edit summaries
It would probably make more sense to shove "additional information" stuff after the pipe
Via= can't really be implemented in any useful manner, since edit= is currently unused and all usages use edit=toolbar and the draft article uses edit=toolbar and draft does not directly include it (it is two templates deeper, so we can't even tag it as from draft). We have done what we can for now.
AManWithNoPlan (
talk) 23:03, 23 December 2018 (UTC)
How about the way I suggest above? Headbomb {
t ·
c ·
p ·
b} 23:34, 23 December 2018 (UTC)
There’s no way to easily detect how it was run. We already specify category as different. We have improved user detection though.
AManWithNoPlan (
talk) 23:57, 23 December 2018 (UTC)
Yes, but there is a way to recognize what is fed in &via=, or if a &via= is declared. Also, since it got
archived, what's the syntax for via? Headbomb {
t ·
c ·
p ·
b} 00:15, 24 December 2018 (UTC)
it does not exist. there’s no reliable way to do it. We can detect category vs toolbar, but nothing else. That is why edit= is not used.
AManWithNoPlan (
talk) 00:28, 24 December 2018 (UTC)
Well that's what the request in
via was about. To add support for &via=. Headbomb {
t ·
c ·
p ·
b} 03:54, 24 December 2018 (UTC)
I know and we’ve done all we really can. Unless we have some way of actually getting reliable information (which we do not) there’s really no point to adding it.
AManWithNoPlan (
talk) 04:08, 24 December 2018 (UTC)
What do you mean 'reliable information'? what's wrong with just displaying the information that's passed in &via=! That'd be the whole point of via. Headbomb {
t ·
c ·
p ·
b} 04:54, 24 December 2018 (UTC)
we would need an approved list of options to choose from and not just accept random strings.
AManWithNoPlan (
talk) 04:59, 24 December 2018 (UTC)
I honestly doubt anyone would set it, since the toolbar and the citation toolset core that draft pulls information from both set toolbar.
AManWithNoPlan (
talk) 05:01, 24 December 2018 (UTC)
Why would we need a list of options / pre-approved strings? 99%+ of usages would be from templates and scripts. Headbomb {
t ·
c ·
p ·
b} 06:19, 24 December 2018 (UTC)
I think the pre-approved strings would serve as a kind of input sanitisation. Otherwise at some point you may need to check that you're not inserting junk or spam in edit summaries (where it's hard to remove). I don't know how important a concern this is, but it's not unreasonable to keep it in mind.
Nemo 10:08, 27 December 2018 (UTC)
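The allow-list idea could be as simple as the Python sketch below. Note that &via= does not exist in the bot; the parameter, the `ALLOWED_VIA` values and the function name are all hypothetical, chosen only to illustrate sanitising the value before it reaches an edit summary.

```python
from typing import Optional

# Hypothetical allow-list; the real set of callers would need to be agreed on.
ALLOWED_VIA = {"toolbar", "category", "gadget", "draft-article", "userscript"}


def sanitise_via(raw: Optional[str]) -> Optional[str]:
    """Return the via value if it is on the allow-list, otherwise None."""
    if raw is None:
        return None
    value = raw.strip().lower()
    return value if value in ALLOWED_VIA else None
```

Anything not on the list is silently discarded, so arbitrary strings never end up in edit summaries.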
If it is only long dashes and numbers and spaces then remove spaces. Correct?
AManWithNoPlan (
talk) 21:54, 23 December 2018 (UTC)
Could be letters too, like A23 - A48. Convert/fix that to A23–A48.Headbomb {
t ·
c ·
p ·
b} 22:02, 23 December 2018 (UTC)
That gets dangerous: it could be junk like ii - iii, 5-7 or the evil "look at pages 5 to seven and browse around pages in the early teens"... I will think about how many letters to allow.
AManWithNoPlan (
talk) 22:06, 23 December 2018 (UTC)
I can start simple and move on from there.
AManWithNoPlan (
talk) 22:53, 23 December 2018 (UTC)
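A "start simple" version of the page-range cleanup might look like the following Python sketch. It is an assumption, not the bot's rule: it allows at most one leading letter on each side (so A23 - A48 is handled) and leaves everything else, including roman numerals and free text, untouched. `tidy_page_range` is a hypothetical name.

```python
import re

# Optional single letter plus digits on each side of a hyphen or dash.
# \u2013 is an en dash, \u2014 an em dash.
_SIMPLE_RANGE = re.compile(r"([A-Za-z]?\d+)\s*[-\u2013\u2014]\s*([A-Za-z]?\d+)")


def tidy_page_range(pages: str) -> str:
    """Normalise simple ranges to 'start<en dash>end'; leave others alone."""
    match = _SIMPLE_RANGE.fullmatch(pages.strip())
    if match is None:
        return pages
    return f"{match.group(1)}\u2013{match.group(2)}"
```

Allowing more than one letter would start matching the junk cases mentioned above, so the pattern is deliberately narrow.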
Many style guides actually specify capitalization of foreign journals independent of what the journal itself is called. It's an odd thing. Specific journals can be submitted for capitalization as needed.
AManWithNoPlan (
talk) 19:16, 24 December 2018 (UTC)
pages vs. pages is odd. Jstor gives us a range and then we fix that and so it is temporarily a range of pages
AManWithNoPlan (
talk) 19:16, 24 December 2018 (UTC)
Websites are not case-sensitive, but I can add a capitalization exception. The initial reference being a mix of a journal and a website confused the bot.
AManWithNoPlan (
talk) 19:16, 24 December 2018 (UTC)
That is, it is dumping all the page meta tags, then cite journal parameters, then a PubMed query. I'm not a PHP programmer, but
this StackOverflow answer may be useful, if you're not already retrieving the meta tag data. I think PRISM may include the Dublin Core dc. tags as a subset, but the BMJ & maybe the Oxford journals also add useful citation_ tags.
dc.contributor Gordon C S Smith
dc.contributor Jill P Pell
dc.identifier 10.1136/bmj.327.7429.1459
citation_title Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials
citation_public_url https://www.bmj.com/content/327/7429/1459
citation_mjid bmj;327/7429/1459
citation_lastpage 1461
citation_doi 10.1136/bmj.327.7429.1459
citation_section Hazardous journeys
citation_article_type Other
citation_pmid 14684649
Hope this is useful
RDBrown (
talk) 12:34, 1 January 2019 (UTC)
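Collecting the dc. and citation_ meta tags from a page can be sketched with Python's standard `html.parser`. This is illustrative only and is not the bot's PHP code; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta name=... content=...> tags whose name looks citation-related."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: dict = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content")
        if content is not None and name.startswith(("citation_", "dc.")):
            # A name can repeat (e.g. several dc.contributor tags).
            self.tags.setdefault(name, []).append(content)


def collect_citation_meta(html: str) -> dict:
    parser = MetaTagCollector()
    parser.feed(html)
    return parser.tags
```

Names are lower-cased, so `DC.identifier` and `dc.identifier` end up under the same key.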
Headbomb {
t ·
c ·
p ·
b} 01:12, 3 January 2019 (UTC)
What happens
Whenever there is a |foobar=<SOMETHING>|barfoo=...|foobar=<NOTHING>, the bot changes that to |DUPLICATE_foobar=<SOMETHING>|barfoo=...|foobar=<NOTHING>
What should happen
Whenever there is a |foobar=<SOMETHING>|barfoo=...|foobar=<NOTHING>, get rid of the empty parameter and keep the full one. E.g.
[33] Exception: Keep the handling of author/editor parameters the same (last/first, editor-last/editor-first, etc...) since people often mangle the order by accident.
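The requested behaviour can be sketched as below, in Python and assuming the template's parameters are held as an ordered list of (name, value) pairs. The helper name and the pattern used to recognise author/editor parameters are assumptions for illustration.

```python
import re
from collections import Counter

# Author/editor-style names are exempt, per the exception requested above.
_NAME_PARAM = re.compile(r"(author|editor|last|first|editor-last|editor-first)\d*")


def drop_empty_duplicates(params: list) -> list:
    """Remove empty copies of a parameter when a filled copy also exists."""
    counts = Counter(name for name, _ in params)
    filled = {name for name, value in params if value.strip()}
    result = []
    for name, value in params:
        empty_duplicate = counts[name] > 1 and not value.strip() and name in filled
        if empty_duplicate and not _NAME_PARAM.fullmatch(name):
            continue  # keep the filled copy, drop this empty one
        result.append((name, value))
    return result
```

If every copy of a parameter is empty, nothing is removed, since there is no "full one" to prefer.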
That is actually the correct information. Hard to deal with people who do not know how to spell when inputting data!
AManWithNoPlan (
talk) 19:01, 5 January 2019 (UTC)
Correct information? There is no journal named 'peprint' out there, and that doesn't seem to be anywhere on the RG page either. Is this GIGO? Headbomb {
t ·
c ·
p ·
b} 19:05, 5 January 2019 (UTC)
yes indeed it is correct. That’s the journal the author entered. Obviously GIGO.
AManWithNoPlan (
talk) 19:14, 5 January 2019 (UTC)
TY - BOOK
AU - Petit, Jean-Pierre
PY - 2016/07/04
SP -
T1 - Schwarzschild 1916 seminal paper revisited : A virtual singularity
JO - peprint
ER -
Anyway, close this one then. No need to code an exception for such uncommon GIGO. Headbomb {
t ·
c ·
p ·
b} 19:33, 5 January 2019 (UTC)
{{wontfix}} as you said. Would have been a lot funnier if they had spelled it with one more e.
AManWithNoPlan (
talk) 19:34, 5 January 2019 (UTC)
PEE PINTS FOAR EVRYONE!! Headbomb {
t ·
c ·
p ·
b} 19:38, 5 January 2019 (UTC)
The bot replaced some valid references by a reference to a completely different article (a book review published 10 years before this journal was established...) Worse, it inserted this faulty reference multiple times but as different references (probably because they were named and got different names). The apparent reason for this is that the URLs of the references had changed although (with 1 exception) they still redirected to the correct page. I have corrected this manually (see article history). I do find it weird that "cite web" references were replaced by "cite journal" ones that were completely inappropriate. Although the bot indicated that it was "user activated", there was no indication about who this user was, who clearly failed to check the edits made by the bot.
This mostly happens with Wiley's "fake DOI" ISSN links (which are often rather spammy by the way, as in this article) and can be conclusively solved only by actually resolving DOI links.
Nemo 10:32, 6 January 2019 (UTC)
Look where the incorrect reference goes. Even though the bot put "journal=Genes, Brain and Behavior", it was to an article in a completely different journal that had "Genes, Brain and Behavior" as title. It didn't go to one of Wiley's URLs at all. Wiley doesn't use these fake DOI URLs any more, although these generally are still functional but redirect to the new (non-DOI) URLs. All that the bot should have done was replace the "fake DOI URL" with the new URL. --
Randykitty (
talk) 10:45, 6 January 2019 (UTC)
That's a problem with fake DOIs that resolve.
AManWithNoPlan (
talk) 15:14, 6 January 2019 (UTC)
I could probably add code to detect DOIs in the form of 10.xxxxx/(ISSN)xxxx-xxxx which are obviously just an ISSN.
AManWithNoPlan (
talk) 15:21, 6 January 2019 (UTC)
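Detecting the 10.xxxxx/(ISSN)xxxx-xxxx shape is a straightforward pattern match. The Python sketch below is one possible reading of that shape (a numeric registrant prefix, the literal "(ISSN)", then eight ISSN characters with an optional X check digit); it is not the code that was eventually added, and the function name is hypothetical.

```python
import re

# "10.", a numeric registrant code, then "(ISSN)" followed by an ISSN.
_ISSN_DOI = re.compile(r"10\.\d{4,9}/\(ISSN\)\d{4}-\d{3}[\dXx]")


def looks_like_issn_doi(doi: str) -> bool:
    """True for 'fake' DOIs that are just a journal's ISSN in DOI clothing."""
    return _ISSN_DOI.fullmatch(doi.strip()) is not None
```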
Thanks for maintaining this invaluable tool. BTW, I'm still curious why the bot took those fake DOI links and arrived at an old book review, mixing up the review title and the journal name... --
Randykitty (
talk) 15:35, 6 January 2019 (UTC)
The Bot took the journal title which was in the title parameter and did a PMC search and found an exact match and went with it. We do have rare false positives like this.
AManWithNoPlan (
talk) 15:55, 6 January 2019 (UTC)
I see. Yes, that must be rare :-) Thanks again. --
Randykitty (
talk) 16:06, 6 January 2019 (UTC)