Wikipedia talk:WikiProject Articles for creation/AfC Process Improvement May 2018

Looking forward to seeing discussion here! MMiller (WMF) ( talk) 19:01, 3 April 2018 (UTC) reply

Concrete easy improvements (suggestions from Legacypac)

Direct all comments, template responses and discussion to the Draft talk page.

Currently we put the comments on the face of the draft and strip them off when accepting. We also have a link in the decline templates leading the new user to discuss on the reviewer's talk page.

It is annoying to both the reviewer and the creator and anyone else to scroll past the long templates.
Seeing the rejections before the content prejudices the next reviewer even when the reasons for the earlier rejection may have been fixed.
Nowhere else on Wikipedia do we erase discussion on an article (we do archive, which may be appropriate for some decline notices).
Training the new editors to discuss on talk is good.
Fragmenting discussion across user talk is pointless. Half the time I have to go searching to figure out what Draft the new user - who often doesn't even sign their posts - is talking about. Reviewers already watch the draft automatically.
We should be dialoguing with the user, not just templating them.
The current system is a holdover from before Draft space existed, when AfC submissions were subpages of the AfC Project space page without talkpages. It is inappropriate now.

Fixing this involves recoding where comments get posted and changing the standard decline template link from the reviewer's talk to the Draft talk. Legacypac ( talk) 20:23, 3 April 2018 (UTC) reply

Legacypac -- would you say that the biggest benefit with this idea would be "improve communication between reviewers and authors to decrease iterations", "increase the speed/ease that reviewers can do their workflow", or both? -- MMiller (WMF) ( talk) 22:50, 4 April 2018 (UTC) reply

Definitely improve communications between reviewers and authors and between different reviewers. Better communication improves workflow in all contexts. Legacypac ( talk) 22:56, 4 April 2018 (UTC) reply

Decline notice

Stop adding the entire long pink template to the Draft page at all. It's fine to send it to the user's talk page but a simple short message that says "Declined by AfC Reviewer because xyz. If this issue is resolved click to resubmit" posted to the draft talkpage would be more friendly. Legacypac ( talk) 20:23, 3 April 2018 (UTC)

When a draft has been declined multiple times, |small=yes is added to the previous decline notices, which removes everything except what you just asked for. Primefac ( talk) 20:40, 3 April 2018 (UTC) reply

Make Notability part of the review process

This is not really that big a change in some ways because most declines are for notability concerns. Most declined pages don't require even a Google search to assess notability as lacking or met, as some pages are obviously notable topics (inhabited places, meets PROF, etc.). We just need to assess notability on the edge cases. Legacypac ( talk) 20:23, 3 April 2018 (UTC)

~~This isn't something that AFC can just decide. This would require an overturning of this RFC. Primefac ( talk) 20:50, 3 April 2018 (UTC)~~ reply

You may want to reread my point and your response which is off point Legacypac ( talk) 20:55, 3 April 2018 (UTC) reply

Fair point. In re-reading your statement, it actually sounds like you want to remove "notability" as a reason for drafts, because it will either be a notable subject (and therefore be accepted if it's not copyvios/G11) or it's not notable and should be deleted. Primefac ( talk) 20:59, 3 April 2018 (UTC) reply

As raised by the page this is attached to, and by User:Insertcleverphrasehere and others we need to do a quick notability check on edge cases. I regularly find pages rejected for notability where the subject is in fact notable but the new editor did not do a great job of showing it. Legacypac ( talk) 21:07, 3 April 2018 (UTC) reply

I cannot imagine why it is not already so. I see a lot of submissions declined on the basis of failure to establish notability - clearly some people interpret the process as requiring an article to meet minimum standards. Guy ( Help!) 21:49, 3 April 2018 (UTC) reply

Legacypac -- does this seem like the sort of idea that WMF could help with via some sort of software change? Or is it more of a policy issue than a technology issue? One idea was to add that "one click" search for notability like AfD has, but there are arguments against that. -- MMiller (WMF) ( talk) 22:55, 4 April 2018 (UTC) reply

I wondered if the Article Wizard could be refactored to follow the subject-specific notability requirements, but I guess WP:PROF is the one that's best organised point by point, & some of the others are a bit nebulous. Espresso Addict ( talk) 23:13, 4 April 2018 (UTC)

I favor adding "one click" search to MfD like AfD has. The best reason not to delete at MfD is the topic is notable. Maybe incorporating search± links right in the AfC template would help the author and the reviewer too. Never thought of that before. The Draft does have a title we can base the search off of. That would vastly improve sourcing if available and used. Legacypac ( talk) 23:32, 4 April 2018 (UTC) reply

Maybe incorporating search± links right in the AfC template would help the author and the reviewer too - They're already there, in the "How to improve your article" section of the "pending" template, and right out in the open on the decline template. Primefac ( talk) 14:53, 6 April 2018 (UTC) reply

Notability should be as much a part of the process as it is at NPP. However, this tends to be more of an issue of subjective interpretation of notability by the reviewers. Unless they know them off by heart through years of creating articles or patrolling them, no one knows the mass of notability guidelines properly or has even read them until push comes to shove. Just not having sources in the article is not a reason for a lack of notability if credible claims of notability are expressed in the article. That said, a raft of sources needs to be carefully examined; chances are that, the more references that come with a new article, the more likely it is that the majority of them are just Internet barrel-scraping, and what's left is barely reliable. It shouldn't necessarily be the reviewer's job to go searching for sources. Creators need to be pointed to WP:RS and WP:V and told to go back and do it themselves. In this respect the Article Wizard could be improved. Certainly add the AfD links to the template: Find sources: Google ( books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL. Kudpung กุดผึ้ง ( talk) 22:58, 5 April 2018 (UTC)

Concur about adding the AfD links to the template. -- joe decker ^talk 18:13, 25 April 2018 (UTC) reply
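As a rough illustration of the "one click" search idea discussed above, here is a minimal sketch that builds search links from a draft's title. The function name `find_source_links`, the choice of search services, and the URL patterns are assumptions made for illustration; this is not how the AfD or AfC templates actually generate their links.

```python
from urllib.parse import quote_plus


def find_source_links(draft_title: str) -> dict[str, str]:
    """Build search URLs for a draft title (hypothetical helper).

    The "Draft:" namespace prefix is dropped and the remaining title is
    searched as an exact phrase. The URL patterns are assumptions.
    """
    title = draft_title.removeprefix("Draft:").strip()
    query = quote_plus(f'"{title}"')  # exact-phrase query, URL-encoded
    return {
        "web": f"https://www.google.com/search?q={query}",
        "news": f"https://www.google.com/search?tbm=nws&q={query}",
        "books": f"https://www.google.com/search?tbm=bks&q={query}",
        "scholar": f"https://scholar.google.com/scholar?q={query}",
    }


if __name__ == "__main__":
    for name, url in find_source_links("Draft:Charles Meneveau").items():
        print(f"{name}: {url}")
```

Because the draft already has a title, links like these could be offered to both the author and the reviewer without either of them typing a query.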

Mainspace Compliant Drafts

Articles are best developed by collaboration - not by brand new editors in isolation. If the topic meets N and V AND would not be subject to immediate CSD in mainspace, it should be passed to mainspace and tagged appropriately for cleanup.

We would not pass copyvios, duplicates of existing topics, unremarkable people/organizations etc, G11 spam, hoaxes, and so on. Basically pass anything that a NPR would pass or tag for clean up instead of seeking deletion or sending to Draft space. Call it "Mainspace Compliant Drafts". By passing the page we are giving the new editor the ability to start a valid topic much like they could pre-ACTRIAL. Legacypac ( talk) 20:23, 3 April 2018 (UTC)

Technically speaking this is how the project already operates. This is less about "overhauling AFC" as it is "reminding reviewers that imperfect drafts are okay". Primefac ( talk) 20:48, 3 April 2018 (UTC) reply

One can review drafts to this standard but most reviewers do not. We need to be more explicit about the standard and remove decline templates that suggest another standard - like the inline citation one which should be reserved for bios. Legacypac ( talk) 21:10, 3 April 2018 (UTC) reply

We've got the attention of the WMF (probably for the next couple of hours) so why not see what they can do to make inline citations easier to use and handle for new editors. You're currently rehashing the old arguments you (and sometimes, I) have had elsewhere, ultimately we know why some acceptable drafts are being declined. What we should be doing here is asking what WMF can do to help eliminate those relatively minor layout quality issues which sometimes result in declines. Nick ( talk) 21:22, 3 April 2018 (UTC) reply

Legacypac -- I have sort of the same question as above -- do you see ways that WMF and software work can help on this front, or is this more on the policy side?

Nick -- I would love to hear more about the inline citations idea. It sounds like that's already been discussed somewhere. Is there a link I can check out?

Thank you. -- MMiller (WMF) ( talk) 23:35, 4 April 2018 (UTC) reply

We have an "inline citations" decline. We could limit it to bios, or raise the standard and require inline for all AfC pages. If User:Primefac or User:MMiller writes a page no one needs to check it for copyvio but almost all the editors we deal with are green and many have no idea they can't copy-paste in a Prof bio or add unsourced things. Inline cites really help us review and improve the quality of the page but it is a struggle for some newbies. Legacypac ( talk) 23:42, 4 April 2018 (UTC)

The problem is that inline citations are explicitly not required for any content short of featured/good/DYK. Even BLPs only need ONE reference, which can be to facebook, and needn't be inline, to survive BLP-PROD. Espresso Addict ( talk) 23:53, 4 April 2018 (UTC) reply

There are a number of ideas for citations - but my preference is m:WikiCite or a similar system whereby we generate fully formed references either in advance (do-able with major journals and newspaper archives) or on the fly with current press publications and then save them with a unique accession number, so all an editor needs to do is paste the URL (or the accession number, if they know it) in line and it's automagically replaced by a full, detailed, template type citation which they don't need to code or otherwise handle if they don't want to. Citation generation is a bugbear of mine (which I'm sure James Forrester will testify to). I even registered User:CiteBot in 2007 for a bot which would (try to) automatically convert bare URLs for some common press sites (BBC, Guardian, New York Times) which had consistent meta data into fully formed references - that role was eventually taken up by the WP:REFILL tool, which is a semi-automated tool. Citation handling is becoming a more urgent issue as people drift away from editing with desktop/laptop fully featured (Windows/Mac OS/Linux) type systems and move to tablets and smartphones, where switching between multiple tabs and having pop-up dialog boxes is less than ideal. Drag and drop references would be good for VisualEditor too (though replicating that level of functionality on a smartphone/tablet could be a challenge). That could work by allowing editors to generate citations (pushing them to WikiCite, or pulling them in from WikiCite, as appropriate) without having to actually use them in line as they write. Editors would instead save them into a little area at the bottom of the VisualEditor screen, from where they could drag the citations to where they want them. If they were to use citations inline as they write, those citations would also then appear in the same little area at the bottom of the editor, enabling them to be dragged and dropped for successive uses.
This approach would also see citation numbering automated, as that's another minor layout issue that we could do without having to fix. Nick ( talk) 08:08, 5 April 2018 (UTC)

When I patrol new pages and come across poorly formatted references, I use the message feature of the Curation tool to direct the creator to WP:CITE. However, that is a mind bending page with a steep learning curve for a newbie. A simpler version needs to be written. Nick makes some valid suggestions. AfC is not the Article Rescue Squadron - many draft creators (and confirmed editors too), especially SPAs, dump their creation in Wikipedia expecting other editors to complete it or clean it up. They need to be told up front that sources are required - just as we do for BLP. In the case of paid editors, see WP:BOGOF. Messaging is part of the AfC template system, so in that respect it's more effective than Page Curation where some reviewers appear to do the absolute bare minimum. It all boils down again to being able to sensibly recognise whether or not a new article has true validity and potential for the encyclopedia. Kudpung กุดผึ้ง ( talk) 23:42, 5 April 2018 (UTC)
Citation generating for new users is a massive problem. I was discussing something similar with Insertcleverphrasehere, and the discussion turned to the need for people nominating content for deletion (and administrators running through speedy/PROD candidates) to ensure they're doing the Checks and alternatives before deletion by looking for sources to confirm/refute notability or fix sourcing issues. It's easy to suggest we should be spending time doing this, and in many cases, it's actually not too difficult/quite easy to find sources and save some good but unsourced content, but it's shit having to do this. It's time consuming and wearisome (which inevitably causes burnout with some reviewers) but something which we can cope with. It would make life easier for reviewers if they didn't have to spend time source searching, but that's not the biggest issue. If the original author doesn't source their work or uses poor quality citations nobody can follow, then we will never know what sources the original author was looking at, we will never know if the sources we find are better or worse than what they had, we will never know if we're missing out on some sources that could be used by other editors to expand their article or indeed other articles. That's all assuming we can find sources; some of us are lucky to have good university libraries with millions of volumes, journals, press archives and the like, and The Wikipedia Library offers a similar service, but if an author has dumped something on Wikipedia with no sources, and we can't find anything online, deletion is the highly likely option (and the only option if it's a BLP). We need to make citation generation easier and ensure we're doing everything we can to make new editors supply all the citation information they're looking at/using when writing their first articles.
I also think we need to be explicit - no sources is an instant rejection no matter how notable the content may be, an extreme nuclear approach, but one which would be balanced by simpler, easier to use citation generation features. Nick ( talk) 09:32, 6 April 2018 (UTC) reply

This really should be at the reviewer's discretion. It is perfectly legit to reject a submission if it does not contain referencing required to demonstrate notability; our notability decline messages don't say that the subject is not notable, they say that insufficient evidence of notability has been provided. It is not legit to reject because you don't like the format those refs are provided in or because you, the reviewer, don't have access to them. It is also fine to accept a poorly referenced article on a notable topic if you believe it is unlikely to be deleted. When I feel like doing this I typically add {{Friendly search suggestions}} and bare links to a few of the strongest sources to the draft's talk page. If it then ends up in AfD I have specific resources readily available for my keep argument. Unless it's not something they want to be doing, WP:BEFORE work doesn't burn reviewers out. I think the key to preventing burnout is to give reviewers choices and variety of work. ~ Kvng ( talk) 14:50, 6 April 2018 (UTC)

Remove the threat of sanctions

The more pages a reviewer touches the greater chance they pass something that proves to be something that should have been deleted instead. Currently the Reviewer's name goes on the talkpage (not the creator's) and the Reviewer may get ripped apart and sanctioned for passing a "bad" page. We need to assume a lot more faith in our reviewers and remove the idea they take full responsibility for all the contents of all the pages they handle. The account that created the page is responsible for the content, not the user who managed the page by moving it or tried to improve it. Now if a reviewer is making lots of errors percentage wise, we should deal with that, and discuss any specific types of errors they should avoid. Legacypac ( talk) 20:23, 3 April 2018 (UTC)

the Reviewer may get ripped apart and sanctioned Diffs please. Yes, reviewers get called out for making bad accepts, but other than two very bad reviewers who were removed from the project I don't recall any "sanctions". Primefac ( talk) 20:38, 3 April 2018 (UTC) reply

You forget about the topic ban on moves imposed on me at User:Nick's suggestion - something he presented no evidence to support and then he accused me of breaching it for doing exactly what he insisted I do. You commented multiple times on the topic. Diffs in my talkpage archives. This was 100% based on AfC moves. Legacypac ( talk) 20:50, 3 April 2018 (UTC) reply

Hello. Nick ( talk) 20:55, 3 April 2018 (UTC) reply

@Legacypac: Is there a reason the reviewer's name goes on the talk page? It isn't the case for the many other review systems. Espresso Addict ( talk) 23:17, 4 April 2018 (UTC)

The reviewer's name is coded into the AfC Project template posted to talk on an accepted draft. But a reviewer might have spent 10 minutes checking and the creator hours drafting. I'd favor no name and it would be simple to remove it. Not our biggest problem though. Legacypac ( talk) 23:36, 4 April 2018 (UTC)

It's in the edit history. If we get logs sorted, it will be there too. Espresso Addict ( talk) 23:55, 4 April 2018 (UTC) reply

Clearly some amount of bravery and humility is required of reviewers as we, in some sense, take responsibility or vouch for the work of others. For me that makes reviewing more worth doing. The community is forgiving of mistakes and lapses of judgement. Failure to admit to or learn from mistakes (often rooted in a lack of humility) is what leads you down the road to sanctions. ~ Kvng ( talk) 14:50, 6 April 2018 (UTC)

Question from Nick

Why has the Wikimedia Foundation sent someone to run this who has made so few edits, their own account (not their WMF account) is yet to make it to 'Extended Confirmed' status. I've blocked more experienced socks this week. This is just a piss take on the part of our beloved Foundation, isn't it. Lip service being paid, but in reality, an excellent example of the contempt we're actually held in by the San Fran Mafia. Nick ( talk) 21:01, 3 April 2018 (UTC) reply

Hi Nick, I can answer your question on this. Ryan Kaldari and I have been working with the New Page Patrol folks for a while, setting up and running the ACTRIAL research. By this point, we understand NPP pretty well, but we haven't had the opportunity to work with and learn about the people who work on AfC. ACTRIAL recently ended, and one of the things that we saw was the increase in workload for AfC. The RfC about ACTRIAL is still running, but if/when that change becomes permanent, it's going to put more stress on AfC, and we consider it our responsibility (Ryan's and mine) to learn more and help out here.

Marshall is a product manager who joined my team a month ago -- he's an experienced product manager, but new to the Foundation. Ryan and I want his first project to be something meaningful, and that's why he's working with all of you on improvements to AfC. I hope that helps to explain why he's here talking to you; let me know if you have any questions or thoughts about it. -- DannyH (WMF) ( talk) 22:14, 3 April 2018 (UTC) reply

I think we should give Marshall a chance, Nick. It's a standard practice in any enterprise to send a relatively new but intellectually competent staff member out as the first envoy on a mission to a client. We should be grateful for these first steps - in their wrap-up report on ACTRIAL, the Foundation did say they were going to take a look at AfC. Considering that AfC is only a WikiProject and does not have the official status of NPP that functions on software purpose built by the WMF, this is a major breakthrough in Community-Foundation relations. Having worked very closely together with Danny and Ryan over NPP and ACTRIAL - and not without some measure of friction at times - I appreciate what they are trying to do. Nothing will happen overnight (it took nearly 7 years to get ACTRIAL off the ground) but if we begin by displaying criticism for their involvement, they'll simply back off and we'll be back to square one. After all, they're not obliged to help at all. Kudpung กุดผึ้ง ( talk) 23:56, 4 April 2018 (UTC)

Due to the English Wikipedia's independence from the WMF on editorial decisions, it's not at all surprising that the WMF liaison would not be an ECP editor. I'm happy to sit-in on weekly video-conference meetings if it would be helpful. power~enwiki ( π, ν) 03:19, 5 April 2018 (UTC) reply

I'm also happy to video conference for idea brainstorming. Insertcleverphrasehere (or here) 10:18, 6 April 2018 (UTC)

No moves to mainspace by the creator of a draft

There are several drafts I am watching which have been created by SPAs, where they have repeatedly moved them to mainspace despite rather serious faults with the article. A large number of spammers also create the draft, edit it sufficiently to achieve autoconfirmed, and then move in. And that often evades NPP as well. I'd like to see a filter that flags articles moved into mainspace by the article creator. Guy ( Help!) 21:52, 3 April 2018 (UTC) reply

I actually think we do have a filter for that, ~~but I don't know the number~~ but it looks like it was disabled quite a ways back. Primefac ( talk) 21:55, 3 April 2018 (UTC) reply

Would such a filter not catch all Drafts moved to mainspace by all editors? We encourage drafting in Draft or userspace. I'm anti-spammer as much as the next person but as people keep saying AfC is optional and anyone can create mainspace pages once auto-confirmed even under WP:ACREQ? Legacypac ( talk) 22:01, 3 April 2018 (UTC) reply

An admin reviewer (as it seems essentially identical to create protection) could be allowed to set a disallow-move-into-mainspace-by-creator flag for material that should be deleted in mainspace but might be missed, though I think it would be easy to overuse this in cases where it isn't really needed. Espresso Addict ( talk) 04:41, 4 April 2018 (UTC) reply

Anything moved to mainspace, including drafts or user sub-page drafts, should be listed in the New Pages Feed. This is what was always supposed to happen, so if it's not, the bug needs to be filed at Phab. It's not only a question of article quality, it's also one of the ways we catch COPYVIO, socks, and paid editors. There has never been any question of preventing autoconfirmed editors from moving pages, but I certainly believe flagging such moves from draft (or user space) to mainspace would not impinge upon the 'rights' of those users who maintain that 'Wikipedia is the business where anyone can walk in off the street and tinker with the back office'. Perhaps Cenarium who disabled Filter on 28 April 2015 can shed some light on it - I don't want to re-enable it without discussion. Kudpung กุดผึ้ง ( talk) 00:32, 5 April 2018 (UTC)

Did we get a follow up to this? Pinging Cenarium and MER-C. Kudpung กุดผึ้ง ( talk) 23:19, 19 April 2018 (UTC) reply

I've seen a few examples from the feed that ended up there after being moved by the creator from draft, so they do get flagged. We typically don't get to them right away because they often land somewhere in the middle of the feed. But we get to them eventually. Insertcleverphrasehere (or here) 23:24, 19 April 2018 (UTC)
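The condition being asked for in this thread (flag a page moved into mainspace by its own creator) can be sketched in a few lines. This is only an illustration of the logic: the function and parameter names are hypothetical, it is not the disabled edit filter mentioned above, and it treats any title without a colon as mainspace, which is a deliberate simplification since real mainspace titles can contain colons.

```python
def is_creator_move_to_mainspace(creator: str, mover: str,
                                 from_title: str, to_title: str) -> bool:
    """Return True when the page creator moves a draft into mainspace.

    Assumptions: drafts live under "Draft:" or "User:", and a target
    title with no colon is treated as mainspace (crude approximation).
    """
    from_draft_like = from_title.startswith(("Draft:", "User:"))
    to_mainspace = ":" not in to_title
    return from_draft_like and to_mainspace and creator == mover


if __name__ == "__main__":
    # Hypothetical move log entries: (creator, mover, from, to)
    moves = [
        ("NewEditor", "NewEditor", "Draft:Acme Widgets", "Acme Widgets"),
        ("NewEditor", "Reviewer", "Draft:Acme Widgets", "Acme Widgets"),
    ]
    for creator, mover, src, dst in moves:
        print(src, "->", dst, is_creator_move_to_mainspace(creator, mover, src, dst))
```

Note that this only flags the move for attention; it does not prevent it, which matches the point made above that autoconfirmed editors remain free to move pages.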

Copyvios

Thanks MMiller (WMF) for preparing this. It looks like an accurate summary of recent reform discussion.

One thing that can help reviewers to be more effective is some automation of copyright violation detection. This was recently discussed at Wikipedia talk:WikiProject Articles for creation#Automated copyvio checks on new pages?. ~ Kvng ( talk) 23:53, 3 April 2018 (UTC) reply

Seconding this. It would be useful if Earwig could be run and vios above x% threshold (to be discussed) be flagged. Espresso Addict ( talk) 04:35, 4 April 2018 (UTC) reply

I'll 3rd that - my #1 request. It cuts both ways. If we can speedy delete the G12s it cuts the backlog. The copyvio check takes more time than scanning for other obvious problems so I usually do it last. It is so annoying to find everything is well written and cited for a good topic and then realize they stole the whole thing. Seeing a high CV Score would cause me to check CV first. Legacypac ( talk) 04:41, 4 April 2018 (UTC) reply

I'm sometimes willing to rewrite for a promising topic that interests me, but it's a pain to have to do all the review work and then realise that one needs to rewrite it or delete it as a copyvio. Espresso Addict ( talk) 04:43, 4 April 2018 (UTC) reply

Me too. Plus technically you have to RevDel all the copyvio. There just are not enough AfCers though. If we stop and rewrite everything promising we never make progress on the backlog. There are thousands of willing editors operating in main-space who can fix stuff up on new pages (outside copyvio) but only a handful of active AfC reviewers. Legacypac ( talk) 04:52, 4 April 2018 (UTC) reply

PITA. I have found some shortcuts through the process of dealing with copyvios but me doing less of it just means that others have to do more. ~ Kvng ( talk) 20:36, 4 April 2018 (UTC) reply

@ Espresso Addict: - My bot runs earwig over the category daily at present - and there's been some discussion at WT:AFC regarding tagging vios - if we could get a consensus there for the task (and what threshold / template) it would be trivial for me to add to the bot. SQL ^{Query me!} 22:45, 4 April 2018 (UTC) reply

Thanks @ SQL:. It could just report the level it finds, which I think is what the DYK implementation used to do. I tend to look at the closest match even for very low %s. It would need some education though, both of reviewers and creators; Earwig tags titles, quotations, and material like "Professor of Engineering, University of Cambridge" which doesn't need rephrasing. Espresso Addict ( talk) 23:05, 4 April 2018 (UTC) reply
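To make the "flag above x% threshold" idea concrete, here is a minimal sketch of how reported similarity scores could be used to decide which drafts to check for copyvio first. The `DraftScore` type, the 50% default threshold, and the sample titles and scores are all hypothetical; the scores are assumed to come from some detector run elsewhere, and nothing here calls Earwig or any bot.

```python
from dataclasses import dataclass


@dataclass
class DraftScore:
    title: str
    copyvio_pct: float  # highest similarity reported by a detector, 0-100


def flag_for_copyvio_check(scores: list[DraftScore],
                           threshold: float = 50.0) -> list[DraftScore]:
    """Return drafts at or above the threshold, highest score first.

    A high score is only a prompt for human review: matches on titles,
    quotations, or job titles can inflate the figure.
    """
    flagged = [s for s in scores if s.copyvio_pct >= threshold]
    return sorted(flagged, key=lambda s: s.copyvio_pct, reverse=True)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only
    sample = [
        DraftScore("Draft:Example One", 12.5),
        DraftScore("Draft:Example Two", 87.0),
        DraftScore("Draft:Example Three", 55.3),
    ]
    for draft in flag_for_copyvio_check(sample):
        print(f"{draft.copyvio_pct:5.1f}%  {draft.title}")
```

The threshold is left as a parameter because, as noted above, the right value (and whether to simply report the level instead) would need consensus.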

SQL's application of Earwig also applies ORES scores (predicted class, quality, vandalism, etc.). I have two related questions:

Would those ORES scores be useful to the workflow? More or less than having copyvio scores?
Where would having scores (copyvio or ORES) be most useful in the workflow? I could imagine them being part of the Helper script, surfacing scores for each article as they are being reviewed.

-- MMiller (WMF) ( talk) 23:50, 4 April 2018 (UTC) reply

You'd want the scores upfront to enable selection of drafts to review; the ORES score is not hugely useful when you've got the draft open. Both copyvio & ORES would be very useful. Espresso Addict ( talk) 00:00, 5 April 2018 (UTC) reply

~~I'm picturing something... like this? SQL ^{Query me!} 00:06, 5 April 2018 (UTC)~~ Misunderstood, NVM! SQL ^{Query me!} 00:08, 5 April 2018 (UTC) reply

In my opinion, Marshall, ORES would have no immediate impact on the way NPP and AfC reviewers work. What we need are more immediate solutions. Many reviewers seem unable to even correctly interpret the wealth of information that is presented to them in the New Pages Feed - or perhaps they just don't bother reading it.

I have absolutely no idea why automatic COPYVIO control keeps being brought up as if it were something new. It used to be fully automatic on all new pages, done by a script developed by Coren. I've no idea why it stopped. Perhaps because Coren retired and the bot was no longer maintained and/or is no longer compatible with later iterations of MediaWiki. But it's something that really needs to be done. I find it most annoying having to manually run every article I patrol through Earwig's otherwise excellent gadget. No wonder there are severe backlogs at both NPP and AfC.

As copyright violations carry legal implications, I believe further development of such a feature is well within the mandate of the Foundation who can either develop it themselves or release a grant for a competent volunteer to do it. After all, it's a cross-wiki issue. Kudpung กุดผึ้ง ( talk) 00:51, 5 April 2018 (UTC)

See Coren's bot which more than halved the reviewing and/or patrolling time. It worked extremely well for many years until Coren retired. This is definitely a high priority, Marshall, and one which would certainly benefit from some WMF engineering. This is not to belittle the otherwise excellent Earwig's Copyvio Detector, or Copy Patrol, but they're not automatic. Kudpung กุดผึ้ง ( talk) 23:54, 5 April 2018 (UTC) reply

A case study, in passing

I suggest anyone who doubts the problems at AfC with some articles, at least, takes a look at the history of articles on two people at Johns Hopkins, elected members of the National Academy of Engineering in Feb 2018 (which I believe is alone enough to pass WP:PROF). Jennifer Elisseeff was submitted to mainspace on 9 Feb by an experienced but red-linked user; Charles Meneveau via AfC on 7 Feb, immediately rejected, resubmitted on 19 Feb and has only just exited after I saw it languishing. That's not an unusual case, just the article I happened to be working on when I noticed the announcement for this subpage. Espresso Addict ( talk) 23:58, 3 April 2018 (UTC) reply

Meneveau was Rejected at [1] even though the page says he holds a named professorship and has been named a Fellow to two different societies - that is three clear cut PROF passes. I don't find that page to be written like an Advertisement either. Legacypac ( talk) 00:41, 4 April 2018 (UTC) reply

There are a number of decline reasons that are routinely abused; adv is one of them. I have proposed removing this and a few others but those proposals are unlikely to gain consensus because many reviewers believe they're here to fight spam (and anything that resembles it). ~ Kvng ( talk) 20:28, 4 April 2018 (UTC)

Indeed. I do wholeheartedly believe that attitude must change if (when) ACTRIAL is switched back on. There's no evidence this was spam at all, just a genuine person seeing a genuine cast-iron notability reason flash past in the news and posting an article; the sort of new editor we want to clone. Espresso Addict ( talk) 20:43, 4 April 2018 (UTC) reply

Well don't hold your breath and don't make a change here a precondition for permanent ACTRIAL. The attitude is not confined to AfC. It exists also at AfD and with PROD, probably also NPR but I don't have enough NPR experience to know for sure. Anyone who wants to see their efforts build the encyclopedia, as opposed to deleting it or spending time bickering with other wikipedians, should be working on existing underdeveloped articles. Plenty of work to be done at WP:TAFI and WP:DEORPHAN for beginners, at WikiProjects for those further along and WP:WPMERGE for the advanced. ~ Kvng ( talk) 21:27, 4 April 2018 (UTC) reply

I did a test CSDing G11 all pages tagged as adverts still in the decline list from a couple random letters of the alphabet - turns out Admins agreed that quite a few were not blatant advertising. I propose we keep Adv decline but include a CSD option we are expected to use. Don't decline something that fits a CSD reason and not seek deletion. Legacypac ( talk) 21:42, 4 April 2018 (UTC) reply

Well, the decline reason is adv not blatant_adv so that is not a big surprise to me. But, as you may know, I'm not in favor of actively deleting things in Draft space. I don't see how forced deletion of behind-the-scenes stuff is worth burning time and goodwill on. ~ Kvng ( talk) 22:24, 4 April 2018 (UTC) reply

@ Kvng: Most of the other situations you note required admin intervention for deletion. I frequently decline A7 & G11 speedies. And indeed, there's plenty of work needed on older articles. My main focus these days has been on trying to rescue attacked articles and make them deletion resistant, especially where they represent underpopulated areas of the encyclopedia.

@ Legacypac: Admin perceptions of G11-worthy articles differ. As I recall, the rubric states the article must be unsalvageable for this to apply. There's a definite area where promotion is noticeable but not delete-worthy. There's a template for promotion ({{Advert}}) that could be applied in cases where work is badly needed but deletion isn't appropriate. Espresso Addict ( talk) 22:39, 4 April 2018 (UTC) reply

I think it's time for the G11 criterion to be revisited. Spam is spam and even the one-line directory-style entry that might not be saturated with adspeak is going to reward its creator with SEO for his company (or his client if he was paid to do it). There seems to be an attitude that Wikipedia needs every article that is started, but just because something might eventually pass our notability criteria doesn't automatically mean it is a suitable subject for an encyclopedia. Again, these issues are not the exclusive property of AfC - it's even more important for New Page Reviewers to understand them. 01:04, 5 April 2018 (UTC) Kudpung กุดผึ้ง ( talk)

That goes well beyond what AfC/NPP can decide. I'd be happy to support a draconian entry threshold for companies but most of the community wouldn't. Espresso Addict ( talk) 23:39, 5 April 2018 (UTC) reply

This all goes back to when Wikipedia was founded. The rule book in those days was thin because no one ever dreamed what impact Wikipedia would have 10 or 15 years later, or that it would become the magnet for spammers (from multi-nationals to market stalls), vanity bio writers, footy fans, and Bollywood. As Wikipedia grows organically, it takes the community too long to catch up - many RfCs for improvement tend to fail due to hanging on to what are now archaic ideologies. That said, WP:ORG has recently been somewhat rewritten. Kudpung กุดผึ้ง ( talk) 00:06, 6 April 2018 (UTC) reply

Some quick comments...

...on potential improvements

Help drafts to be submitted in better condition in the first place.

This would be really helpful. Also get the submitting editors to do as much of the indexing as possible. Espresso Addict ( talk) 04:00, 4 April 2018 (UTC) reply

@ Espresso Addict: -- by "indexing", do you mean assigning categories? -- MMiller (WMF) ( talk) 00:51, 5 April 2018 (UTC) reply

We could do with indexing the heap of drafts so that reviewers can find suitable review fodder easily; for example I specialise in academics, writers & classical musicians and know nothing about sportspeople, bands & Bollywood. It took me several hours of offline work to extract a long list of bios needing review [2] so that I could comb through it for those few where my expertise might be useful. But the submitters know whether their subject is a footballer, pop musician, or professor in oriental art -- there could be some form of auto-tagging. Espresso Addict ( talk) 01:10, 5 April 2018 (UTC) reply

This is already easier through [3] and some external tools allow overlapping wikiprojects, e.g. [4]. Would that be useful to you? Gryllida ( talk) 03:58, 5 April 2018 (UTC) reply

Improve communication between reviewers and authors to decrease iterations.

Yes. Also try to help reviewers to be more hands on, and just improve the article themselves where they can. Or point out problems in detail, by wading into the text and leaving messages. Communication between reviewers would also be useful eg an easy way of asking for a second opinion. And communication with outside projects, eg Teahouse, wikiprojects.

Standardize reviewing criteria across reviewers.

They are already standardised; the question is interpretation of numerous complex & to some extent conflicting policies, plus understanding the degree to which few mainspace articles in practice meet all the supposed minimum standards. One of my major objections to AfC as currently set up is that one reviewer, not necessarily an admin, not even necessarily all that experienced, gets to second-guess an AfD discussion across any topic, which just starts giving me divide-by-zero errors. Espresso Addict ( talk) 04:00, 4 April 2018 (UTC) reply

Increase the speed/ease that reviewers can do their workflow.

Yes. Espresso Addict ( talk) 04:00, 4 April 2018 (UTC) reply

Help reviewers find more promising drafts sooner.

Yes. Espresso Addict ( talk) 04:00, 4 April 2018 (UTC) reply

...or an unlisted one if I missed something with a lot of potential...

What is missed? That's probably better answered by more-experienced AfCers, but one thing as an outsider admin I'd very much like is improved data on how drafts flow around the system. A log of all AfC submissions & reviews (accepts & declines); a log of individual reviewers' records (similar to the CSD log of NPPers); more clarity on the project's stats. ETA: I've just found Template:AFC statistics but it needs a proper historical log. Espresso Addict ( talk) 04:08, 4 April 2018 (UTC) reply

...on proposed solutions

Bring notability to the forefront of the guidance in the Article Creation Wizard -- right now, the wizard does not mention notability.

This might be helpful. We want this to be as much like an online-always-open editathon as possible, and good-faith newbies should be directed gently towards subjects that are useful. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

Route drafts to AfC reviewers who are experts in the draft’s topic, and more likely to be able to detect notability.

This is something I've been planning to propose. We could also report relevant drafts to active Wikiprojects; even if the Wikiproject person wasn't a formal reviewer, a comment from an expert on how the article fits with the subject-specific notability guidelines is always of assistance. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

Rotate reviewers on repeat submissions instead of routing repeat submissions to the same reviewer.

This already happens? I don't necessarily think it's a good thing either, as the editor jumps through one set of hoops only to be set another by a completely different reviewer. It's very demoralising. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

Alter template language to clarify the AfC process for new editors and to encourage them to improve their drafts.

What problems have you in mind here? Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

@ Espresso Addict: -- this comes from a discussion around the idea that the wording in existing templates for when a draft is declined may be harsh or too bureaucratic for newbies; even just the phrase "Submission Declined". So this idea is about rethinking the wording on the templates with an eye toward clarity for newbies. -- MMiller (WMF) ( talk) 00:51, 5 April 2018 (UTC) reply

Sounds good, thanks. Espresso Addict ( talk) 01:00, 5 April 2018 (UTC) reply

Make it easy for reviewers to search for notability, potentially by implementing something similar to the “Find sources” links in Articles for Deletion.

That's next to useless. If a reviewer can't dump a name in google and look at the results critically s/he shouldn't be reviewing. Half the AfD links go to paywalled sources and if the article name is not the usual one (eg the very common situation where the middle name is included) they're worse than useless. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

Take steps toward a shared interface between AfC and NPP, since those processes are similar in many ways.

Similar yet opposite. NPP exists to remove unsuitable articles; AfC to improve & accept suitable ones. The skills & processes are not necessarily the same. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

After three declines, move draft to Miscellany for Deletion.

Strongly oppose this. MfD isn't patrolled like content deletions. I've seen articles with clear notability and no insuperable problems declined 5 or more times. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

Apply machine learning models via ORES to predict the likely quality of submitted drafts, surfacing the best or worst to reviewers to streamline their workflows.

May be worth a try. Just length & number of references forms a good indicator of quality, especially if it ignores facebook et al. Doesn't something similar exist in mainspace already? I forget what the report's called. Don't waste resources reinventing the wheel. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

Make it clear to autoconfirmed users that they can move their drafts to main namespace themselves without waiting for review.

This would need thought. Some users should be encouraged to use the back door, others coerced into going through the front. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

Develop a way for drafts to be discoverable by editors who wish to collaborate and improve them.

We already have one, it's called mainspace. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

Any of the many improvements to the Helper script listed in github: https://github.com/WPAFC/afch-rewrite/issues

No opinion. I haven't used the script long enough to comment. Espresso Addict ( talk) 01:35, 4 April 2018 (UTC) reply

...on challenges

Good articles can sit and wait for review in AfC instead of being in the main namespace.

It would be useful to have some data on how often this happens; I've seen numerous examples but I'm not looking at a random sample. Easy cases (accept/decline) probably get processed quickly then harder cases languish. This is true in all volunteer-processed heaps, here on Wikipedia & elsewhere. Espresso Addict ( talk) 02:46, 4 April 2018 (UTC) reply

If the AfC process is challenging for a new editor to navigate, drafts that have the potential to become high quality articles may not progress. That could happen if:

The new editor does not understand the review process.
The new editor does not understand the review criteria.

Taking these together: It's difficult to avoid the barrier for new editors; it exists, and grows with every complexification of policy. The question is, does the newbie deal better with AfC or NPP/speedy/prod/AfD? I've seen both ways argued strongly by people whose opinion I respect. Espresso Addict ( talk) 02:46, 4 April 2018 (UTC) reply

The review process takes long enough that the new editor becomes disengaged.

Research plus anecdotal report passim suggests newbie editors are strongly motivated by getting their articles in mainspace, and googleable, within minutes of writing them. I don't know whether AfC is ever going to be powered to achieve this. Espresso Addict ( talk) 02:46, 4 April 2018 (UTC) reply

New editors who have a confusing or frustrating first experience may be turned off from continuing to contribute and becoming active editors.

I think the only argument here is with the 'may'! Espresso Addict ( talk) 02:46, 4 April 2018 (UTC) reply

AfC reviewers may have a large backlog, which may cause them to burn out. This is especially true if the backlog of submitted drafts grows faster than reviewers can handle while maintaining high quality standards.

I don't think backlogs is a good way of looking at these articles-in-waiting. We need to avoid big or growing backlogs altogether, somehow. Speedies manage by having a fairly large pool of people (all active admins) to draw on when they pile up. How can that be duplicated? Espresso Addict ( talk) 02:46, 4 April 2018 (UTC) reply

...on goals and metrics

Help reviewers get through the backlog faster

As I wrote earlier I think backlogs is a bad way of looking at this process. We need a way of motivating reviewers & recruiting new ones but I don't see the Foundation as able to help here. Perhaps rewrite as 'Direct reviewers to find draft articles that they can process more easily'? Espresso Addict ( talk) 03:49, 4 April 2018 (UTC) reply
@ Espresso Addict: On motivating reviewers - Do you think a stats page or leaderboard would help at all? Also, we'd discussed at VP:T a set of 'top/bottom 25 OK (ORES)/spam (ORES)/copyvio' tables maybe being at WP:DRAFT. SQL ^{Query me!} 23:20, 4 April 2018 (UTC) reply

We don't want to encourage speed over accuracy/helpfulness -- that's certainly been a factor in drives in other areas. Nor necessarily cherrypicking easy accept/declines over ones that need some research (though cherrypicking the best ones is a good notion). A weekly/monthly reviewer of the week/month award, where people can be nominated based on quality of review, helpfulness to newbies, as well as number of articles processed? A set of barnstars which include helpfulness, hands-on-ness, &c? Just encouraging using the thanks button if you see a particularly good review. I like the idea of tables of high/low ORES scores -- more than 25 would be useful to avoid edit conflicts (50? 100?). Espresso Addict ( talk) 23:49, 4 April 2018 (UTC) reply

Fair enough on the leaderboard - that makes a lot of sense. I'm finishing up some bug testing / fixes on the top tables and they should be in time for the next SQLBot run. ( User:SQL/AFC-Ores/Top25Spam, User:SQL/AFC-Ores/Bottom25OK, User:SQL/AFC-Ores/Top25OK, User:SQL/AFC-Ores/Top50CopyVio). SQL ^{Query me!} 01:08, 5 April 2018 (UTC) reply

@ Espresso Addict: with respect to your proposed rephrasing of this goal, I think I would say that we are definitely including the idea of finding draft articles faster as one of the "methods" in the "Potential Improvements" section of the page. I hope that sufficiently covers your thinking here. -- MMiller (WMF) ( talk) 01:01, 7 April 2018 (UTC) reply

Get more quality articles in main namespace faster

I think this is of major importance, especially on the assumption that ACTRIAL will soon be turned on permanently. As I've written above, I think this is the most important thing that will motivate the better class of newbie editors. Espresso Addict ( talk) 03:49, 4 April 2018 (UTC) reply

Make AfC a process that grows active editors

I'd love for this to be the case but it mostly depends on a change of mindset as to what AfC is primarily for. Elsewhere people have talked about amalgamating AfC with NPP but I think it might be more productive to look at how the Teahouse folk might be brought in to help. Also the few Wikiprojects that remain active. As I wrote earlier, we need to look at AfC as an online-always-open editathon, so the folk who run these could be another resource. Espresso Addict ( talk) 03:49, 4 April 2018 (UTC) reply

... on next steps

I understand where the deadline's coming from, but why wasn't this all started a bit earlier? Six days is a rather brisk timeframe to get consensus on something so complex. It's important that the community here doesn't feel railroaded by the Foundation into unhelpful changes. Should this discussion be more widely advertised? There have been discussions about AfC all over the place of late. Espresso Addict ( talk) 04:13, 4 April 2018 (UTC) reply

Espresso Addict -- thank you (and everyone else) for your detailed reactions and comments so far. I think this conversation is going to be really helpful. With respect to your question about the timeframe, I set next week as a goal for when we might be able to consolidate around a couple of leading ideas, which I was hoping would be doable because of all the related conversation that has already been happening over the past few weeks after ACTRIAL. We are trying to start this project on an ambitious pace so that we can take advantage of some available engineering/design/product resources that we'll have here during the next couple months, but if next week comes around and we are still having important discussions that new voices are joining, we do have the flexibility to continue the discussion. I definitely do not want to rush into unhelpful changes. And thank you for the idea about posting in other places -- I will post in some of the other AfC-related Talk pages to let people know. Does this all make sense? I'll also be back in a few more hours with follow-up questions to the ideas posted so far. MMiller (WMF) ( talk) 16:57, 4 April 2018 (UTC) reply

Thanks, MMiller (and thanks for your work on this, it is appreciated). I'm not sure where to suggest posting; I don't think advertising it very widely is going to be productive. We should certainly involve the group convened by @ Kudpung: who are looking into these issues. Possible other suggestions are the discussion on AfC under the ACTRIAL RfC, Teahouse, NPP. Any other suggestions? Espresso Addict ( talk) 19:49, 4 April 2018 (UTC) reply

Definitely post to Wikipedia talk:The future of NPP and AfC. ~ Kvng ( talk) 20:32, 4 April 2018 (UTC) reply

Thanks, Kvng and Espresso Addict. I posted at Wikipedia talk:The future of NPP and AfC, the ACTRIAL RfC, and in a conversation at Village Pump about using ORES scores in AfC. Hopefully that brings some new voices in. MMiller (WMF) ( talk) 22:16, 4 April 2018 (UTC) reply

Paid editing

I was surprised not to see any mention of WP:COIEDIT in MMiller (WMF)'s analysis of AFC. There are two things that editors with a COI are told:

you should put new articles through the Articles for Creation (AfC) process instead of creating them directly;
you should not act as a reviewer of affected article(s) at AfC, new pages patrol or elsewhere;

Vexations ( talk) 00:42, 5 April 2018 (UTC) reply

Suggestions from Gryllida ( talk) on 03:54, 5 April 2018 (UTC)

Hi MMiller (WMF)! It is my impression that there are two sides to the problem of article creation.

Reviewers:
- often do not know content (wikiproject integration is needed and is on your todo)
- often do not edit the article, I think inline comments are a nice way to illustrate the problems with an article if they are delivered to newcomers in an adequate manner

@ Gryllida: -- thank you for the detailed thoughts. Could you say more about how you think the "inline comments" could work? Or is there an existing page I could read about the idea? -- MMiller (WMF) ( talk) 20:32, 5 April 2018 (UTC) reply

This or that. I am yet to figure out what is the best place to put them, for the article authors to respond. Inside of the draft itself they may have difficulty removing these comments themselves. Gryllida ( talk) 21:00, 5 April 2018 (UTC) reply

@ Gryllida: I think that's a really interesting idea; thanks for the examples. How have the draft authors tended to react to the inline comments? Do they tend to address them all? -- MMiller (WMF) ( talk) 22:34, 6 April 2018 (UTC) reply

I had one successful case of addressing the comments in Draft:Zoom+Care, seemingly made by a COI editor whose interest was to keep the draft as pretty and article-like as possible, as opposed to a draft with underlined, struck-out, highlighted and commented content etc.

I suppose it depends on the ability of the contributor to: (a) distinguish between draft and draft talk if comments are on draft talk (should be easy); (b) remove my comments from draft (medium difficulty, gives the advantage of seeing them in-place); (c) motivation to finish the draft.

Another thing I have been doing is writing the comments on draft talk and leaving the author a message on their personal talk page, for example [5].

I intend to make some kind of tracking, maybe a table with notes, to see which of the various ways and places of leaving comments and communicating work better. Once I do that I will let you know.

-- Gryllida ( talk) 22:24, 7 April 2018 (UTC) reply

Authors:
- Often do not understand what they need to do
  - write bias
  - write sentences which are not confirmed by sources in any manner whatsoever
- After a review, are confused about how to seek help
  - The templates are long and confusing (this is on your todo)
- After a review, do not put effort into making the necessary modifications
- After a review or before a review, face article cluttered with templates, leaving little space for understanding how to use the draft talk page

There are a few points that I would suggest to improve in what is already written on the main page.

I like the points "Help reviewers get through the backlog faster - Get more quality articles in main namespace faster - Make AfC a process that grows active editors". I think one more missing point is "get newcomers to understand the work that they are expected to do".
"Bring notability to the forefront of the guidance in the Article Creation Wizard -- right now, the wizard does not mention notability." - I think this is a good goal, but 'notability' is not an excellent way to put it. A lot of people just start thinking in response, "MY subject is notable!" and the notability idea becomes a confrontation instead of guideline.
"Route drafts to AfC reviewers who are experts in the draft’s topic, and more likely to be able to detect notability." - agreed, I think this may be done via wikiproject tags and Special:PageAssessments already. However
- addition of a new draft is not used to notify people, for instance via Echo;
- the wiki has no bug tracker, meaning people do not have a chance to assign improvement of a draft to themselves.
In my opinion lack of these features results in decreased participation of WikiProject members.
"Rotate reviewers on repeat submissions instead of routing repeat submissions to the same reviewer." I am not sure this is needed, but perhaps.
"Alter template language to clarify the AfC process for new editors and to encourage them to improve their drafts." I do not like templates in the first place. I'd suggest using messages on the draft talk page. Yes, they need to be more friendly to submitters.
"Make it easy for reviewers to search for notability, potentially by implementing something similar to the “Find sources” links in Articles for Deletion." This is already implemented via an edit notice. Click 'edit' on any draft and the 'find sources' template is shown above the edit box. (With Visual Editor it is collapsed in 200px width and uncomfortable, perhaps you can fix that?)

@ Gryllida: -- thanks for pointing this out. I had not noticed this. -- MMiller (WMF) ( talk) 20:32, 5 April 2018 (UTC) reply

"Take steps toward a shared interface between AfC and NPP, since those processes are similar in many ways." Agreed.
"After three declines, move draft to Miscellany for Deletion." I'm not sure it would help. Maybe, maybe not. Someone more experienced needs to comment.
"Apply machine learning models via ORES to predict the likely quality of submitted drafts, surfacing the best or worst to reviewers to streamline their workflows." I would be ok with this, but I think editors need to have access to the learning set -- release it to the public -- and contributors may benefit from being able to modify it to suit their needs (topic areas, preferences, their own training set of articles).
"Make it clear to autoconfirmed users that they can move their drafts to main namespace themselves without waiting for review." I am not sure whether this would be useful; people would gladly use this feature, which would result in their draft being deleted, which may defeat the whole point.
"Develop a way for drafts to be discoverable by editors who wish to collaborate and improve them." This was already discussed above when speaking of experts in a topic; agreed it is a good idea, already partly implemented via page assessments and wikiproject tags.
"Any of the many improvements to the Helper script listed in github: https://github.com/WPAFC/afch-rewrite/issues" sure
Additional comments:
- The Wikipedia Article Wizard integration with sister projects is very poor resulting in Wikipedia growing very quickly while the sister projects (other languages or other themes such as wikibooks) are underappreciated. This has improved when sister search results were introduced, but some of the sister projects were disabled there without an opportunity to view their results, which I personally find concerning. I would suggest that if Wikipedia administrators dislike a sister project they still need to leave it openly visible, as a family responsibility, since the Wikimedia projects are a family.
- Another issue is that people start wanting to write complete articles. Instead I would propose to have several phases to new article creation:
  1. "Title + URLs to sources" i.e. 'Microsoft, http://foo, http://bar'
  2. "Title + URLs to sources + summary of info about Title from each Source", i.e. "Microsoft, http://foo says it is in America and was founded in 1998, http://bar says it is the largest company on the planet with N employees and Y revenue"
  3. "2-3 paragraph bias-free material based on the above, formatted properly"

@ Gryllida: -- I think this is an interesting way to think about teaching newbies how to iterate their way to strong articles. -- MMiller (WMF) ( talk) 20:32, 5 April 2018 (UTC) reply

- I would like to emphasize that the current manner of handling newly created drafts is a big improvement over their deletion straight away, however it is still a psychologically challenging process and if we want draft to be improved then we need to make it interesting for the authors to make the necessary improvements instead of imposing restrictions which they resent and find unnecessary and unhelpful, and break the process of article creation into small steps so that the article being created is not terribly massive (also eases reviewing).

Hope some of this is helpful. (I was criticized before for not reading past discussions on the talk page of wikiproject afc, and I didn't do it this time either. Some of the suggestions above may be against the afc reviewers consensus or experience.) -- Gryllida ( talk) 03:54, 5 April 2018 (UTC) reply

Additional comment, it may be useful to make the requested articles process more newbie friendly, more efficient (through marking relevance of submitted entries and allowing people to assign some requests to themselves), and engage wikiprojects in content writing from here. Gryllida ( talk) 04:00, 5 April 2018 (UTC) reply

In-context help and onboarding

Marshall, in respect of your Potential Improvement approach A: Help drafts to be submitted in better condition in the first place, are you aware of and interfacing with the In-context help and onboarding initiative to guide new users in their editing? If this bears fruit then the whole education process for those creating new articles would not need to be incorporated in the Wizard : Noyster (talk), 10:03, 5 April 2018 (UTC) reply

@ Noyster: -- thank you for bringing this up. Yes, I work closely with JMatazzoni_(WMF), who is part of that project. That initiative is definitely relevant, and it is currently unclear what it will mean for the future of the Wizard. This is a good reminder that I'll make sure to check with that team as the conversation on this page progresses, to make sure we don't spend time on anything duplicative or that will be superseded quickly. -- MMiller (WMF) ( talk) 20:27, 5 April 2018 (UTC) reply

@ MMiller (WMF): while community volunteers have bravely attempted to smarten up the simple Article Wizard, the wizard is something that could easily be scrapped for something better. The In-context help and onboarding project is broad in scope and while very important and the best solution I have come across, it would be a mammoth engineering task and take years to realise - let Ryan tell you how long. That said, I was in awe how quickly they built Page Curation/New Pages Feed. Kudpung กุดผึ้ง ( talk) 00:18, 6 April 2018 (UTC) reply

The in-context help project sounds wonderful. Can we upvote it? Espresso Addict ( talk) 01:43, 6 April 2018 (UTC) reply

@ Espresso Addict: I'm glad you're enthusiastic. Since, as Kudpung says, that's a much broader and separate project, I think it would be helpful if you added your thoughts on its talk page.

Find a way to prevent submission of very short drafts

When a draft is very short it is practically guaranteed to be declined. It's practically impossible to properly identify a subject and state a reasonable claim of significance (A7) in less than twenty or so words.

It would help reduce the GFOO if "drafts" such as "I am Ashish from Mumbai, I'm in tenth grade and like football" or "The Noisemakers are a cool new hard metal band from Liverpool, we have 20,000 likes on Youtube <Link to Youtube>" could be prevented from even entering the workstream. Roger (Dodger67) ( talk) 11:21, 5 April 2018 (UTC) reply

Start with the hundreds of blank submissions. Make the submit button not even work until there are X bytes on the page. In G13 work I've never found anything useful under 450 bytes of text. If they are going to submit nonsense force them to make a paragraph worth. "When a draft is very short it is practically guaranteed to be declined. It's practically impossible to properly identify a subject and state a reasonable claim of significance (A7) in less than twenty or so words." is 35 words and not enough text to ever pass as an article. Legacypac ( talk) 12:19, 5 April 2018 (UTC) reply
@ Legacypac: - Holy COW, there are a lot of drafts < 450 chars. I was going to make a table of them, but there were north of 40,000. Even limiting it to 45 chars resulted in 13,102 drafts. SQL ^{Query me!} 13:37, 5 April 2018 (UTC) reply
Nevermind, it's early, forgot about redirects. The above link lists ~3000 pages > 0 but < 450 bytes, and are not redirects now. Still, that's a lot. SQL ^{Query me!} 13:43, 5 April 2018 (UTC) reply
It's worse than 3000 - that's after I've processed over 1,000 AfC Declined Drafts by CSD G2 and blanked many more Declined blank sandboxes in the last few months. Thank you for User:SQL/Short_Drafts. There is pretty much nothing on it that should have been created as a new page in the first place. The effort to delete far outweighs the effort to create. When we CSD we create almost as many talk pages with a nasty deletion notice. Way better if the new user got a message like "You have not added enough content to create a new page" when they try to save Draft:4266 "4266 (MMMMCCLXVI) Starting a Leap Year" or Draft:Ajay Gautam Associates with "Ajay Gautam Associates is group of Indian Lawyers practising (sic) independently in various courts of India" or Draft:AllianceGamerz =AG= with "to be continued" Legacypac ( talk) 14:15, 5 April 2018 (UTC) reply

There's a little easier list as well at User:SQL/Zero-Length_Drafts#The_List that I work periodically FWIW. SQL ^{Query me!} 14:37, 5 April 2018 (UTC) reply
Anyhow - back on topic - I think this could probably be done via an edit filter? I'm not 100% sure if that's the right approach, however. SQL ^{Query me!} 17:32, 5 April 2018 (UTC) reply

I don't know about preventing submissions of very short material. Editors should be encouraged to state the main source of notability as succinctly as possible. (Margaret Thatcher was the Prime Minister of the UK.[ref] Marie Curie was a Nobel-prize-winning physicist.[ref]) Often the longer they go on, the more puffery/irrelevance gets inserted. ETA: And long newbie subs are often copyvio. Is there a way of rejecting/deprecating offerings entirely without references (even embedded external links)? Espresso Addict ( talk) 22:01, 5 April 2018 (UTC) reply

Encouraged to state the main source of notability for "I am Ashish from Mumbai, I'm in tenth grade and like football" ? Junk drafts should be nuked. Get a bot to delete that 40,000 or whatever it is. Even if they go G13 it would take years to get rid of them manually, and these clean up operations use up valuable resources that could be doing something else. An edit filter (or something) is the solution, such as Legacypac's suggestion. Kudpung กุดผึ้ง ( talk) 00:41, 6 April 2018 (UTC) reply

@ Kudpung: There's stuff here to figure out, but I think it's doable. Should account age or # of edits matter? What about user groups? How many chars should be the cutoff? SQL ^{Query me!} 03:08, 6 April 2018 (UTC) reply

My work with G13 drafts, where I could sort thousands of pages by size, proved 450 characters was a good minimum limit. I found I could blindly CSD anything below that limit and lose nothing of any value. A couple of sentences and a ref get you there, even less if the Default Article Wizard text or an afc template is present. We could go higher, but 450 is a good start for now, and limit the restriction to Draft space for now - maybe expand later after we prove the point. In Draft space no one needs to start with a REDIRECT so that does not need to be an exception. Maybe someone can think of some exception but I'd restrict it to Admins. Legacypac ( talk) 03:25, 6 April 2018 (UTC) reply

( ←) - in its most basic form - this should work (I think, I'm not an expert on edit filters): User:SQL/AFC-Ores/editfilter. SQL ^{Query me!} 03:33, 6 April 2018 (UTC) reply

As it stands I can't even CSD a lot of them. G2 is very narrowly interpreted, there is too little info to meet G11 or G12 and G3 does not often apply. User:SoWhy led the rejection of expanding U5 to nuke this kind of junk. Anyway why should anyone need to think about which CSD to use? No one wants 1000 pages or whatever of junk run through MfD. Stop creation in the first place. Legacypac ( talk) 00:55, 6 April 2018 (UTC) reply

We need two changes: A) Prevent creation of any Draft page without at least 450 bytes B) Prevent AfC submission of any userspace page below 450 bytes. Legacypac ( talk) 03:29, 6 April 2018 (UTC) reply

I'd support that as well. One of the things that really concerns me is those people who translate the first line of a German or French Wikipedia article and then expect us to do the rest. IMO such stubs should be sent to draft. Or nuked. Kudpung กุดผึ้ง ( talk) 04:53, 6 April 2018 (UTC) reply

If 450 is a hard and guaranteed number, I can have it coded today. Primefac ( talk) 15:28, 6 April 2018 (UTC) reply
I would strongly object to anything of the sort. A couple of examples: I encountered this early version of the article 500 Miles High on speedy delete patrol, when it was 182 bytes. I had no trouble sourcing it and getting it through DYK. This early version of the article Great American Lesbian Art Show was 224 bytes, 207 without the speedy tag. I had no problem sourcing it and getting it through DYK. There have been quite a few articles and drafts I have encountered below 450 bytes which went on to be valid articles. Moreover this would be a violation of current policy, and would require a site-wide RfC to be valid. I urge you not to go there. DES ^(talk) _{DESiegel Contribs} 01:48, 7 April 2018 (UTC) reply

@ DESiegel: - what if the filter action was to warn, instead of prevent? SQL ^{Query me!} 01:31, 8 April 2018 (UTC) reply

SQL, that might be more acceptable, depending on how the warning is written. It would have to be carefully crafted not to WP:BITE those attempting in good faith to start articles on legit topics, or topics they believe are legit. I will admit that most very short drafts are also of poor quality. The problem is that a poor quality draft does not mean an unsuitable topic, as I trust my examples above show. Also, in my view, we should rather welcome a thousand poor-quality drafts that wind up never making it to article status than scare off one that would have become a valid article. We have got to err on the side of acceptance, I feel, or have our source of new content and new editors dry up. Several of the suggestions above seem so intent on keeping out "bad" submissions that they overlook this point, and overlook that bad submissions may come from good (or potentially good) editors. DES ^(talk) _{DESiegel Contribs} 03:49, 8 April 2018 (UTC) reply

I would probably argue that an acceptable draft would need to be at least 250 bytes before it even should be considered for submission - a one sentence garbage entry I made is 282 bytes. I know we'll probably just get into "what is the minimum acceptable value?" pissing contest, but if either of your examples were submitted as a draft they would be immediately declined for not having references (but a reference would bump them up over 250?). Primefac ( talk) 14:40, 9 April 2018 (UTC) reply

The content generated by the article wizard alone is 337 bytes. MER-C 14:51, 9 April 2018 (UTC) reply

Right, but the content would be about 250. This, then, would lend credence towards "450" as being the best "minimum" for a submitted draft. Primefac ( talk) 15:28, 9 April 2018 (UTC) reply

In fairness, my earlier comment was procedurally improper anyway, and the implementation of a "minimum submit size" would obviously need more support than just those who happen to be interested in an obscure sub-page of a WikiProject. At the very least it would require a discussion at WT:AFC itself, but probably wouldn't require a full-on RFC. Primefac ( talk) 15:30, 9 April 2018 (UTC) reply

I see where you are coming from, Primefac, and I tend to agree that it would be hard to fashion an acceptable article with less than say 400 bytes. (Although if the article wizard is not used, and a single short-form source is provided, it might happen.) But there is a difference between declining a submitted draft, and not allowing it to be submitted at all. If a draft is submitted and declined for not having sources (or not enough sources) the decline message should let the user know what is needed. A "not long enough" auto-message would simply encourage the addition of padding and not pinpoint that sources are needed, rather than simply more bytes. Also, remember that the policy is still verifiability, not verification. A totally unsourced but verifiable article that is not a BLP is valid in mainspace, after all. Maybe we should change that, but we haven't yet. Also recall that there is no requirement to use the article wizard. Without its framework, a minimal valid article is significantly shorter. But my main concern is for a not-yet-valid but promising draft. I think a filter that does not even permit submission is more WP:BITEy than a decline would be, particularly with a helpful decline comment. DES ^(talk) _{DESiegel Contribs} 22:19, 9 April 2018 (UTC) reply

Anyone submitting a one-sentence unreferenced draft is essentially jumping the queue at Requested articles, and that's where they should be directed to : Noyster (talk), 15:46, 9 April 2018 (UTC) reply

Oppose per WP:CREEP, WP:DISRUPT and WP:IMPERFECT. I commonly advise newbies at editathons that all they need to get a new article started is a solid first sentence and a good source. It seems important to emphasize this so that they don't feel intimidated and frightened off by the feeling that they have to start with a page of GA quality. For example, see Lucy Finch, which I started for a BBC 100 Women event. That was just 370 bytes and much of that was minor boilerplate like a stub template. Andrew D. ( talk) 16:00, 9 April 2018 (UTC) reply
And, your example would be allowed / not warned - as it contains a link (and, not being in the Draft: namespace, nor being submitted via the wizard, or being in CAT:AFC - you know, almost every point being discussed here). SQL ^{Query me!} 05:21, 11 April 2018 (UTC) reply
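The size-threshold idea argued over in this thread can be sketched roughly as below. This is only an illustration in Python, not the actual edit filter at User:SQL/AFC-Ores/editfilter (real edit filters use the AbuseFilter rule language). The 450-byte cutoff, the Draft-space and userspace-submission conditions, the link exemption and the warn-versus-prevent choice are taken from the comments above; the names `Edit`, `has_link` and `filter_action` are hypothetical, and the link test is a deliberately crude assumption.

```python
from dataclasses import dataclass

# Threshold floated in the discussion; it was contested, not agreed.
MIN_BYTES = 450


@dataclass
class Edit:
    """Hypothetical stand-in for the data a filter would see on save."""
    namespace: str                 # "Draft", "User", or "" for mainspace
    text: str                      # wikitext being saved
    is_new_page: bool = False      # True when the save creates the page
    is_afc_submission: bool = False  # True when the save submits to AfC


def size_in_bytes(text: str) -> int:
    # Page size is measured in bytes, not characters.
    return len(text.encode("utf-8"))


def has_link(text: str) -> bool:
    # Crude assumption: any wikilink or external URL counts as a link.
    return "[[" in text or "http://" in text or "https://" in text


def filter_action(edit: Edit, action: str = "warn") -> str:
    """Return "allow", or the configured action ("warn" or "disallow")."""
    if size_in_bytes(edit.text) >= MIN_BYTES:
        return "allow"
    if has_link(edit.text):
        return "allow"
    # Proposal A: short new pages in Draft space.
    if edit.namespace == "Draft" and edit.is_new_page:
        return action
    # Proposal B: short userspace pages being submitted to AfC.
    if edit.namespace == "User" and edit.is_afc_submission:
        return action
    return "allow"
```

With `action="warn"` the save would still go through after a notice, which is the softer variant SQL raised; `action="disallow"` corresponds to the hard block that DESiegel objected to.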

Draft sorting

Some Wikiprojects have enabled a "class=Draft" parameter in their project banners. This "in theory" is supposed to attract the attention of project members to the existence of a new draft in their subject area. However, projects either don't notice or do not know that they are thereby invited to help review or even develop the draft.

I think if we could clone the WP:Stub sorting system it might better facilitate recruitment and participation from subject specialist editors. Drafts that wait the longest for review tend to be on topics that really do require an expert to review. For most untrained people advanced mathematics/physics/linguistics/engineering/chemistry/medicine/etc is indistinguishable from nonsense. Roger (Dodger67) ( talk) 11:36, 5 April 2018 (UTC) reply

See this tool. jcc ( tea and biscuits) 21:47, 5 April 2018 (UTC) reply

I agree that Wikiprojects should be recruited. The problem with class=draft is that Wikiproject members only usually come in to look at class=[null]. Class=draft is interpreted as doesn't need review at this time. Espresso Addict ( talk) 23:25, 5 April 2018 (UTC) reply

Reduce the adversarial "trial by fire" nature of AFC

We need to find a way to help newbies better understand that AFC is not an adversarial "reviewers want to destroy your hard work" process, that it is in fact a "reviewers are here to help you get the draft into an acceptable state" process. Roger (Dodger67) ( talk) 11:43, 5 April 2018 (UTC) reply

Agreed and I'll add we treat all new users alike regardless of intent - we template them with light red Decline notices. I'd like to find a friendlier way for the GF newbies and a quick and decisive delete process via CSD or MfD for the spammers and self promoters. Welcome the good and bar the bad. Legacypac ( talk) 12:12, 5 April 2018 (UTC) reply

Moreover, bad faith submissions get a nicely worded decline template and an invitation to the Teahouse on the user talk page -- example: Special:Undelete/Draft:Buddha Teas and User talk:Ena.zazula. This is not the message we want to send -- the correct response to a first person spam page is the banhammer. MER-C 16:47, 5 April 2018 (UTC) reply

I can't even get self promotion quickly deleted at MFD without drama Wikipedia:Miscellany_for_deletion#Draft:Pankaj_Kumar and Draft:Robert_Wynne-Simmons for recent examples. Why does it take six declines and a fight at MfD that spills over to discussion pages to delete spammy autobiographies? Legacypac ( talk) 17:29, 5 April 2018 (UTC) reply

User:Legacypac - I agree on those two drafts, but I think that the reason for the "drama" is that one particular reviewer is having a fit about the encouraging wording of the decline template. To give him the benefit of the doubt, I think that he thinks that crud like those two should not be given the discussion at MFD that MFD takes, and would prefer to use blocks rather than deletion as the ultimate weapon for cruddy submits. By the way, I am reasonably sure that Draft:Pankaj Kumar does involve a blockable offense, which is sockpuppetry. Robert McClenon ( talk) 21:41, 5 April 2018 (UTC) reply

I don't like the decline templates either. If I had edit access to them I'd have changed some things months ago like I've proposed. I'd have changed the wording beside the "submit your draft" button too as I've proposed on template Userpage. What I've not done is turned the normal use of MfD into a battleground to prove my point.

I've had poor luck with sockpuppet investigations. Maybe I don't understand it but when I've reported what I thought was 100% sockpuppetry it takes a week to find out I was wrong. It should be simpler to use MfD. We could run a lot more junk through MfD if it functioned like an advertised Draft PROD where it's quietly gone in a week unless someone disagrees for a legitimate reason. Legacypac ( talk) 22:11, 5 April 2018 (UTC) reply

That's probably the reason why a DfD (Drafts for deletion) should be created. Kudpung กุดผึ้ง ( talk) 00:47, 6 April 2018 (UTC) reply

Draft:Robert Wynne-Simmons is almost surely about a notable person, why anyone wants to delete it I do not understand. Personally I think there is much too much use of MfD to delete drafts now, and it is too easy, not too hard. But if DfD is created as a prod for drafts, expect me to monitor it and object whenever I think something is a plausible draft, unless you also change the deletion policy. DES ^(talk) _{DESiegel Contribs} 01:54, 7 April 2018 (UTC) reply

There needs to be a painless way of getting rid of utterly useless tosh that happens not to fall under any speedy criterion that applies in draftspace. But we shouldn't be using such a route for drafts such as Draft:Robert Wynne-Simmons where there's a potentially viable topic under the autobiographical cloaking. And robust patrolling of such a route/forum would be essential. Espresso Addict ( talk) 02:44, 7 April 2018 (UTC) reply

I don't really accept that there does need to be such a way. I would be happy to limit deletion in draftspace to what falls within the strict limits of the WP:CSD. But even if I did accept that premise, for argument's sake, given what has happened to many drafts at MfD, and by speedy deletion, I don't really trust that such a new "easy" forum would be limited to "trash". Note the comments above saying that Draft:Robert Wynne-Simmons is an example of what should be deleted, with no one objecting before my comment? How will "viable" drafts be screened out of such a process? Who will decide? Who would "patrol" the process? Those details matter. DES ^(talk) _{DESiegel Contribs} 02:58, 7 April 2018 (UTC) reply

Well, I objected by commenting in the MfD, which seemed like the most-appropriate venue. There is a problem with autobiography here & elsewhere -- the author is often incapable of generating an unbiased article about him/herself, and there's a feeling that why should we the community clean up a vanity page. I suspect there'd have been more discussion if a similarly promotional-but-notable article had been written by someone other than the subject. The kind of thing I'd like to be able to delete in draftspace would be "[ABC] attends [XYZ] High School and likes history, LOL." The deleting admin is always responsible for the deletion. Espresso Addict ( talk) 04:06, 7 April 2018 (UTC) reply

Yes, commenting in the MFD itself is the best way to deal with that particular draft. But here it was being put forward as an example of the kind of thing that should be deleted, and in the absence of objections, it might be deemed that there was consensus to do that in future. How could we be sure that MfD, or any new process, would be limited to handling "[ABC] attends [XYZ] High School and likes history, LOL." and the like? That isn't all we are seeing nominated for deletion at MfD now. As I said in another section, I think we should rather keep 1000 bad drafts than delete one that will eventually become a valid article. And at the new draft level often no one can tell which is which. DES ^(talk) _{DESiegel Contribs} 03:59, 8 April 2018 (UTC) reply

But in spite of my views above, I very much do think we need to reduce the adversarial nature of AfC. At present the AfC directions say that the standard is that drafts unlikely to be deleted at AfD once approved should pass, others should fail. Perhaps we need a 2-level process, as suggested above: level one for the minimal basics to be a valid article, level two for incubation to become a solid article. I'm not sure just how that would work, or where the line should be. I will say that when I do AfC reviewing, I always make some attempt to help fix issues that can be fixed fairly easily, not just look for a reason to decline or approve. OTOH, that takes longer per draft. DES ^(talk) _{DESiegel Contribs} 04:06, 8 April 2018 (UTC) reply

I believe it would be possible to draft a variant of A7 which (if deleting admins read the rubric) would work at distinguishing high-school students from potential article subjects. I think it was me that suggested the idea of a two-stage process, so I can't second this, but more and more I think that's the way to go. We need to identify good drafts (and their creators) and nurture them, whilst continuing to reject the useless ones. Espresso Addict ( talk) 04:35, 8 April 2018 (UTC) reply

note - I fixed the header of this section. It is clear from the more developed first sentence that AfC does have an adversarial nature, but some new editors make it that way; arguing instead of listening and learning what they should do. So much of an editor's success depends on them being willing to learn what they should do instead of doing what they can do or what they think they should do... Jytdog ( talk) 04:24, 8 April 2018 (UTC) reply

It goes both ways. Unhelpful, incorrect declines result in AfC feeling like an adversarial place. I don't recall what the original section title was but changing it mid-discussion seems confusing & potentially misrepresenting responders' views. Espresso Addict ( talk) 04:29, 8 April 2018 (UTC) reply

I have reverted the section header change. In my view the adversarial aspect comes at least as much from the reviewers as from the new editors, and the header should not imply otherwise. DES ^(talk) _{DESiegel Contribs} 04:44, 8 April 2018 (UTC) reply
Besides, changing a discussion header after many comments, thereby breaking links in the page history, and changing the context in which previous posts were made, is in my view poor practice, and I object to it. DES ^(talk) _{DESiegel Contribs} 04:46, 8 April 2018 (UTC) reply

The solution for reviewers who approach the task as an adversarial one is quite obvious - just take away their reviewing pencil. Roger (Dodger67) ( talk) 15:10, 9 April 2018 (UTC) reply

As someone who cleans Draft space - DES's comments appear far outside the norms by advocating keeping 1000 junk drafts in case 1 is a good one. That is a very bad idea that only benefits the spammers and vanity editors. Coupled with this Admin's narrow ideas of CSD applicability, and voting to keep junk at MfD it suggests a disturbing pattern. Admins are supposed to carry out the will of the community not push for the inclusion of garbage. Legacypac ( talk) 05:04, 8 April 2018 (UTC) reply

The last time there was a community discussion on this issue (or something close to it) that I recall was this RfC where there was a strong consensus not to apply notability standards to pages in draftspace or userspace. Yet I see Legacypac attempting to apply such standards frequently at MfD and elsewhere. The speedy deletion criteria start off with "The criteria for speedy deletion (CSD) specify the only cases in which administrators have broad consensus to bypass deletion discussion, at their discretion, and immediately delete Wikipedia pages or media. They cover only the cases specified in the rules here." and go on to say that "Administrators should take care not to speedy delete pages or media except in the most obvious cases." I think this makes it clear that my concept of the CSD as narrow, bright-line standards, to be adhered to strictly, has general consensus. If that consensus has changed, then we need an RfC or other community-wide discussion to make this clear. If it does change, I will adhere to whatever new consensus has formed, whether I agree or not. DES ^(talk) _{DESiegel Contribs} 13:09, 8 April 2018 (UTC) reply

Decline templates and the comment system (comments by Insertcleverphrasehere)

We are not going to get consensus to sweeping changes to the fundamental nature of AfC, not in this short period, perhaps not ever. This also applies to trying to define the answer to the big question of what AfC is for (pre-triage or for helping new users get their articles ready for main-space). Given this, I think that work on the decline templates is probably the easiest avenue forward given the short time period of this proposal. In particular:

Clearing out some of the worst decline rationales, such as the oft-abused inline rationale, to help reduce spurious declining.
Giving reviewers more options for decline templates when declining a draft. In particular, a decline template for hopeless drafts that contains stronger wording (i.e. "not suitable for Wikipedia"), and without a resubmit button.
Before declining with this 'hopeless' template on notability grounds, the reviewer should include a search for sources (i.e. WP:BEFORE). This would reduce the current situation where the submitter wastes everyone's time spamming the article with unsuitable references after receiving a message that it doesn't have enough references (i.e. a non-notable topic can't be made notable by including a bunch of non-reliable sources or a bunch of trivial mentions; there is no point telling the submitter that it needs more references if it is clearly non-notable).
Creating another decline template option for reviewers for drafts that look like very promising topics but have other serious issues (changing it from 'declined' to "Not quite Ready" or something).
Adding a message on the decline template (but not on the 'hopeless' version) telling the submitter that if they don't agree with the review that AfC is optional and that they are free to move the article to mainspace themselves, and also a warning that it might be deleted if the topic is unsuitable. This message would be hidden to non-autoconfirmed users per the very-likely permanent implementation of ACTRIAL (we have code that makes this possible).

In addition, our comment system at AfC is garbage, as it works completely differently to all other discussion systems on wikipedia, and makes it difficult for new users to discuss things with the reviewer (thus resulting in the submitter posting to the reviewer's talk page and fragmenting all discussion about the draft across a bunch of pages)

A better way could be changing the 'comment' system so that it posts a section to the talk page under the reviewers name (i.e. ==Comments by Insertcleverphrasehere==), and posts a link to the submitter's talk page inviting the user to comment there (and probably also puts a template on the draft page indicating that there are comments on the talk page). Hidden in this section with  tags should be instructions how to sign messages, how to ping the reviewer, etc (perhaps a template with a button that automatically adds a pre-generated message similar to how the CSD 'contest' message system works, and also contains a ping to the reviewer). This would help the submitter learn how to use talk pages; skills that they will need as an ongoing editor. — Insertcleverphrasehere ^{( or here)} 21:23, 5 April 2018 (UTC) reply

YES YES YES exactly what we need to do. One more key thing - Add a few words to the Userpage template (and other templates with afc button) to inform the new user that Submit means sending it for an experienced editor to review for publishing as a new article. Either the world is full of truly clueless people or we are not telling them early enough what Submit means. Done for Userspace, will do for Draftspace. Legacypac ( talk) 22:21, 5 April 2018 (UTC) reply

Thirded. All of this sounds great. Espresso Addict ( talk) 23:09, 5 April 2018 (UTC) reply

I would certainly support this. All of it. Kudpung กุดผึ้ง ( talk) 00:52, 6 April 2018 (UTC) reply

I agree with all of this. The only thing that I want to add is that speedy deletion should be used more liberally to rid shit drafts. MER-C 11:22, 6 April 2018 (UTC) reply

Unfortunately speedy deletion is of necessity not amenable to "liberal use". Speedy criteria are narrowly defined and "I know crap when I see it" will never be an acceptable rationale. Roger (Dodger67) ( talk) 12:42, 6 April 2018 (UTC) reply

I'm saying that anything that can be deleted under the existing criteria should be deleted (test edits, blank submissions and spam in particular), instead of being kept for some unknown reason. MER-C 13:29, 6 April 2018 (UTC) reply

Support them all; it'd be interesting actually if we do the change of adding that move button to log the moves done by users clicking on the link - whether articles survive or not and what not, and how often it is used (probably also put a notice there for "don't move it to mainspace directly if you have a COI").

I'd also think that notice should be added to every submission template at the bottom of each draft - I don't think it is bad to let people know they don't have to go through the system; it is likely to cause some bad submissions but nothing I think that NPP can't handle, but it could help (we can do it as a TRIAL perhaps). I'd expect only a small fraction of submissions to be moved this way even if the notification is there everywhere as most drafts are hit and run affairs from non-autoconfirmed users. Galobtter ( pingó mió) 20:00, 6 April 2018 (UTC) reply

100% agree we should CSD everything that is CSDable - I've been saying the same thing for a long time. It should be easy to include the CSD tag right in the decline post to the page. Send a message to the user it was a Test Edit and a G2 to the page. But even if that is not possible Reviewers should twinkle the CSD on. When I see really inappropriate pages in userspace pop up I just CSD without even declining it. Think like a NPR while doing AfC. Legacypac ( talk) 21:55, 6 April 2018 (UTC) reply

Backlog graph

Is there a graph of the backlog anywhere- like this but working? jcc ( tea and biscuits) 21:48, 5 April 2018 (UTC) reply

@ Primefac and There'sNoTime: who might be able to help. Nick ( talk) 09:34, 6 April 2018 (UTC) reply

There's this, plus I keep track of other stats. Primefac ( talk) 15:49, 6 April 2018 (UTC) reply

@ Jcc: there's also https://tools.wmflabs.org/aivanalysis/afc.php, but I only started at the beginning of the year. SQL ^{Query me!} 22:57, 12 April 2018 (UTC) reply

Some thoughts/suggestions, not necessarily needing Foundation assistance

Numbered for convenience. (1) One fundamental problem, especially if/when ACTRIAL is permanently instated, seems to be that AfC needs to both (a) reject the rubbish; & (b) help the good-faith, slightly competent newbies to create usable articles. It seems that a two-stage process might be warranted, where an initial triage hard-declines the non-notable drivel/copyvio/blank/test and tags the remnant (20%?) as worth devoting some effort to. Heap (b) gets an immediate talk-page notice with a "your draft has entered review, bear with us" message.

(2) A corollary is that we need to extend A7 to drafts that would be A7'd in a flash in mainspace. Or, if this is impossible, at least puts them in a box from which an admin is needed to retrieve them. This would need to be actioned by an admin.

(3) Another problem is that reviewers take on drafts where they don't understand the topic, its notability criteria & AfD atmosphere well enough. There are several ways of handling this but disabling the get-random-draft function would seem a possible first move.

(4) I'd like to see much more collaboration between reviewers (and others) on all but the most obvious declines. DYK has an index page for submissions that transcludes all the hook discussions allowing everyone to review other reviewers' efforts. Bad reviews get challenged and corrected; questions get answered. I don't know if that would work here; the multi-transclusion does tend to break the index page when subs get high.

(5) Is there a role for one or more co-ordinators? This has worked well in some other venues such as FAC/FLC.

(6) When I worked in academic journals, there were multiple levels of response:

accept as it stands (~5%);
accept with minor revision (~20%);
major revision needed but potentially acceptable (~45%);
reject, but very major revision might render acceptable (~30%);
reject & don't resubmit (and by the way we've informed all the other journals in our area of your nefarious activities) (<<1%)

and I wondered whether this could be adopted here, with some levels not necessarily visible to the author. Obviously the proportions would be different.

(7) It would feel more friendly if reviewers had "faces" as they do at the Teahouse, ie reviewers provided a brief, newbie-oriented bio. I wouldn't consider it negative if newbie creators were allowed to ping a reviewer with suitable interests listed to suggest reviewing their material. Obviously open to abuse but might help in some cases. Espresso Addict ( talk) 22:56, 5 April 2018 (UTC) reply

With regards to your (1): The original idea, still present in our reviewer instructions, is if the submission passes basic checks, we accept it. AfC was not originally intended to be an article incubator. The help we want to provide to good-faith contributors is to prevent them from making the mistake of putting something into mainspace that gets immediately deleted. Crappy articles on notable subjects are still unlikely to be deleted. I know a lot of current AfC reviewers don't want to be responsible for putting half-baked material into mainspace. Reviewers like this should concentrate on accepting stuff that does meet their standards, declining stuff that fails the basic checks and leaving the rest of it to others. ~ Kvng ( talk) 23:16, 5 April 2018 (UTC) reply

I think the DYK approach of a big list (we'd probably need multiple big lists by basic topic) and some degree of communal oversight would help reviewers to select articles they feel comfortable with assessing/assisting.

I do still think funnelling non-(auto)confirmed new articles through AfC is a big mistake. We need to invent (a) a functional incubator, and (b) a time machine to go back to before ACTRIAL. Espresso Addict ( talk) 01:27, 6 April 2018 (UTC) reply

(1) a) Yes. b) This is what creates the backlog. NPP is a triage, AFC is a field ambulance, but the WP:Article Rescue Squadron is the field hospital.
(2) Getting new CSD criteria created is one of the hardest things to do on Wikipedia. Worth a try.
(3) This is why a better system is required for selecting reviewers. It underlines my argument for elevating AfC from an informal WikiProject to an essential core function like NPP. Perhaps when people see there is an official hat to collect it might attract more operatives of the right calibre, although those admins who work at WP:PERM are only too aware of the downsides; of ~530 new page reviewers, only about 50 are truly active. Some have never used the right they were granted. Kudpung กุดผึ้ง ( talk) 01:35, 6 April 2018 (UTC) reply
(4) Don't know about this. I've never had anything to do with DYK.
(5) I've never understood why this has never been done. That said again, when we ran an election to create a coord team for NPP there were no takers (at that time). A future election is scheduled when we know the official outcome of the current discussion on ACTRIAL.
(6) Worth consideration.
(7) No - IMHO. That would make AfC too much of another one of the many 'help' venues. The Tea House is enough.

Kudpung กุดผึ้ง ( talk) 01:35, 6 April 2018 (UTC) reply

@ Kudpung: In case it isn't obvious, I came to this whole discussion from the ACTRIAL RfC, after having divide-by-zero errors at the idea that non-(auto)-confirmed editors should be directed to AfC to create new articles. I did look in at your project but got put off by the wall of text that I'm meant to have digested before my opinions are considered worthy of hearing. Nothing has yet convinced me that this is better than the landing page saying, politely, "sorry, no, you don't have enough experience yet to create articles -- gain experience by [A/B/C...]?" Espresso Addict ( talk) 02:05, 6 April 2018 (UTC) reply

My project? Kudpung กุดผึ้ง ( talk) 02:58, 6 April 2018 (UTC) reply

Wikipedia:The future of NPP and AfC. Espresso Addict ( talk) 03:10, 6 April 2018 (UTC) reply

( edit conflict) @ Espresso Addict: Once ACPERM passes we can discuss this possibility at the landing page talk. I suspect that something along the lines of "The Wikipedia community has decided to restrict article creation to editors with a little bit of experience under their belt, so please do X/Y/Z instead. If you insist, you can start a draft or work in your sandbox." would work. In that case, after ACPERM I'd like to see a welcome message automatically informing users that they have been marked as autoconfirmed and given the rights to move/create new articles (along with a welcome message). However, all of this is beyond the scope of this page, and auto-welcoming users has been rejected a number of times by the community in the past. — Insertcleverphrasehere ^{( or here)} 03:01, 6 April 2018 (UTC) reply

I'm firmly against auto welcoming new users. To do so would be WP:Beans. In any case it's far too wordy (one of our major problems) - no new users are interested in any situation previous to their involvement. I did however suggest a splash page when someone had completed a registration when we were discussing the revamp of the Wizard. It was on the lines of: Thank you for registering an account. You can now edit any article [button - sends user to Main Page], or create a new article [button - sends user to the wizard], but somewhere my idea got buried. I believe Kaldari made a graphic mock up of it. Kudpung กุดผึ้ง ( talk) 05:05, 6 April 2018 (UTC) reply

After all the discussion

The biggest thing we can do to improve - Have the info from here User:SQL/AFC-Ores show for each draft when you click Review (much like the deletion history of the title does). Especially the copyvio score and earwig link. When you click Accept the info on Class should prefill (but be overridable). I've turned from reviewing drafts off the age lists to using this sortable report. Much better as I can target the likely good or the likely copyvio/spam etc. I have no idea how to program this but having this info right when and where you need it would be awesome. Legacypac ( talk) 02:02, 7 April 2018 (UTC) reply

Change the Submit Buttons - Request Prompt implementation

One good way to discourage blank and other inappropriate submissions is to tell the user up front what they are doing. I've proposed new wording on two userpage templates that are linked to AfC. Template_talk:User_sandbox. Can we vote the change here? The templates are rightly protected as they are very widely used. Ping User:DGG Legacypac ( talk) 23:55, 5 April 2018 (UTC) reply

I commented on the template's talk page. It seems that everyone is in agreement on this one, so someone with the rights should have no problem making it happen. A real no-brainer. — Insertcleverphrasehere (or here) 02:50, 6 April 2018 (UTC) reply

Done Kudpung กุดผึ้ง ( talk) 02:56, 6 April 2018 (UTC) reply

Same change needed on Template:Userspace_draft which currently just says Ready? Submit for Review! Legacypac ( talk) 03:03, 6 April 2018 (UTC) reply

Seems to have been done by Kudpung. — Insertcleverphrasehere (or here) 04:48, 6 April 2018 (UTC) reply

yes excellent and here is another one:

Template:AFC submission/draftnew is what the Article Wizard generates. Similar wording needed there. Currently there is just a grey "Submit Your Draft for Review!" button Draft:Legacypac_Article_Wizard_Draft_Test Legacypac ( talk) 05:17, 6 April 2018 (UTC) reply

Looking at the rest of the text "It is not currently pending review." could be deleted as the same idea is in the heading just above. Legacypac ( talk) 05:23, 6 April 2018 (UTC) reply

Kudpung It is this one that also needs updating with the new wording: Template:AFC_submission/draft. — Insertcleverphrasehere (or here) 20:25, 10 April 2018 (UTC) reply

The agreed wording is "Finished writing a draft article and ready to request an experienced editor review it for possible inclusion in Wikipedia?" It needs to go right above the submit button but I can't figure out how using the sandbox. There seems to be templates buried in templates there. Legacypac ( talk) 06:15, 11 April 2018 (UTC) reply

Yeah that template is like Inception. — Insertcleverphrasehere (or here) 06:58, 11 April 2018 (UTC) reply

This is far too long for a verbless sentence. "Tired? Run down? Take Zingo pills!" is OK in an advert, but this one needs to be grammaticalized as "If you have finished writing a draft article and are ready to request an experienced editor to review it for possible inclusion in Wikipedia, then please do so-&-so" : Noyster (talk), 10:16, 11 April 2018 (UTC) reply

I tweaked it slightly to "Finished writing a draft article? Are you ready to..." MER-C 10:47, 11 April 2018 (UTC) reply

Update 2018-04-06

Thank you all for the detailed conversation this week. It sounds like a lot of important ideas are being raised. I wanted to try to summarize some of the conversation after the first few days:

I made some updates to the project page to include seven additional potential improvements that were raised in the discussion here. I also added a couple points to other sections brought up by this group.
There has been a lot of discussion (and changes already!) around the language in templates, around putting templates and comments on talk pages instead of draft pages, and on the comment workflow in general. Those things all seem to me to be related, and perhaps we might want to think about them holistically.
One new idea that was brought up was "inline comments" -- the idea of having a feature that allows reviewers to make specific comments to different parts of a draft. The team here at WMF thought that was an interesting idea, and I'm interested in everyone's thoughts on that idea.
Being able to detect and surface copyvio sounds like it would be a particularly valuable improvement -- more valuable than having ORES scores.
The idea of routing or tagging drafts so that topic experts (perhaps from WikiProjects) can review them got some good discussion.
Many people are positive on the idea of unifying the interfaces of NPP and AfC, but this is likely out of scope for this project, given its size and importance.

Please jump in here if there are other important summaries from the discussion this week. In the coming week, I'm hopeful that we can consolidate around some leading potential improvements -- but please do let me know if this seems rushed.

-- MMiller (WMF) ( talk) 01:20, 7 April 2018 (UTC) reply

Thanks for this, MMiller. Inline comments is an interesting topic to pursue -- I'd love to see reviewers use these more -- but we'd have to address how they're an improvement on, say, [adding comments in bold/square brackets]? Autoremoval/hiding on promotion? More visibility for newbie editors? Ability to highlight text that needs work easily? (Like the citation needed span template,[citation needed] but easier to use & more visible.) Espresso Addict ( talk) 01:56, 7 April 2018 (UTC) reply

Inline comments would make the page harder to read. Just edit in [citation needed] or [dubious – discuss] etc. like normal. These tags can be left in even when mainspacing the page. Train the newbie how real wikipedia works. Legacypac ( talk) 02:04, 7 April 2018 (UTC) reply

That's the problem - real Wikipedia doesn't work now. It's shocking how unfriendly and downright unusable it can be on the wide array of mobile devices (tablets, phablets and smartphones) people use. Getting Wikipedia into shape so we can re-attract the thousands of editors who have abandoned the project when moving away from fully featured computers provides the ideal opportunity to add useful new features which work particularly well across platforms. The use of the traditional tags, often added in drive-by tag bombing, is particularly unfriendly: it provides basically no information at the point of use and, when clicking through to the linked page, too much information. You're left trying to mind-read what the tagger was thinking when they added the tag, which for a new user is often impossible (it's often hard enough for us decrepit old timers to work out why a tag was added). The suggestion to use inline comments is a great one, provided it works like the comments feature in Microsoft Word. That would allow a detailed sentence or paragraph to be left (even if it's a semi-automated comment left at the click of a button) so the author sees something like 'this newspaper source lacks the date and page number, please try to add this information' at the point the citation is used, or 'this wiki-link is a disambiguation, which page did you intend to link to, if it was the DJ, use Foo (DJ) instead of Foo' which will enable the precise change to be identified to the author, with advice or bits of markup that they need available to them. Nick ( talk) 08:38, 7 April 2018 (UTC) reply

Hover text comments like MS Word's would indeed be lovely. Espresso Addict ( talk) 01:40, 8 April 2018 (UTC) reply

We already support inline comments using html

<!-- This is a comment -->

which renders nothing in the output but is visible in both the Visual Editor and the WikiText editor. We also have the {{comment}} template and {{Void}}. One could add a comment and link to the relevant section of the talk page by using something like

{{comment|A suggested improvement for this section is [[Talk:{{PAGENAME}}#suggestions for improvement]] here}}

for example. I'm not sure that we need the WMF for that. Generally, I oppose any proposal that shifts the responsibility for the content of editor feedback to the WMF. For example, it is absurd that editors do not have direct, unmediated control over the wording of the tags in the page curation tool used in NPP. Vexations ( talk) 12:27, 8 April 2018 (UTC) reply

MMiller (WMF): Many people are positive on the idea of unifying the interfaces of NPP and AfC, but this is likely out of scope for this project, given its size and importance. You seem to have grasped this. The point is, how do we get to lobby the highest level of WMF management? Is the CEO like the Queen whose own children and grandchildren have to make an appointment through her various levels of ministers and secretaries before they are granted an audience? Kudpung กุดผึ้ง ( talk) 07:17, 7 April 2018 (UTC) reply

The silly irony here is that one major function of the whole QC process is to protect the Foundation against legal liability for incorrect material that is libellous. Espresso Addict ( talk) 01:38, 8 April 2018 (UTC) reply

No, the main protection for the WMF is Section 230 of the Communications Decency Act. The CEO, Katherine Maher, is quite accessible and, when I spoke to her recently, confirmed that editorial matters are left to us editors who are responsible for what we write. Andrew D. ( talk) 16:43, 9 April 2018 (UTC) reply

@ Kudpung: I'm not sure if this is helpful, but I know that one place that is meant for communicating with Foundation leadership is comments on the draft annual plan. -- MMiller (WMF) ( talk) 21:36, 9 April 2018 (UTC) reply

This was very helpful MMiller (WMF) and many thanks. I have commented there. The problem at the WMF as I'm sure you are already aware, is that there are no clear lines of responsibility. It tries to be run on the 70s and 80s German concept of a 'flat hierarchy' with a CEO who is really a socio-political figurehead, but where it fails is that it ends up with too many chiefs and not enough indians, and the actual stake holders are not the WMF but the readers, writers, and maintenance volunteers. So you have business managers trying to decide what software solutions are required, and developers laying down operational policy. And the Wikipedia communities having to mop up the WMF's mistakes in the best way they can - ACTRIAL (2011) and the IEP and Orangemoody are prime examples.

We see ORES and the latest WMF engagement for ACTRIAL and AfC as very positive steps, but it needs to move beyond their political assuaging propaganda to concrete action in the form of a solid commitment to major development resources. Otherwise the annual plans and the goals for 2030 are as empty as a state-of-the-nation speech, or the Queen's annual opening of parliament which is written for her by the incumbent Prime Minister. Kudpung กุดผึ้ง ( talk) 12:03, 10 April 2018 (UTC) reply

Interaction between AFC, Drafts: as a collaborative space for drafting articles and AFD options

Hi

I think this is mainly for information to inform the discussion.

There seem to be several spaces which currently are part of the article creation and review life cycle where the individual parts interact but are loosely connected and difficult to navigate for new (and established) users, these include AFC, Drafts and AFD.

I've been helping User:SQL a little tiny bit with their amazing work on creating an ORES rating system for Wikipedia:Drafts at Wikipedia:Village_pump_(technical)#How_hard_would_it_be_to_set_up_ORES_for_the_Draft:_space?. The main objective of this is to make Drafts less of an inescapable pit of oblivion and more of a viable way to allow collaborative work on articles before they are published.

Having a more functional Drafts space will I think help both AFC and AFD to have better outcomes in terms of article quality and editor retention, AFD does not have a requirement to show viable alternatives to deletion, which is a related but different issue to our discussion here. Specifically for AFC I think this improved ORES curated Draft space overview will allow people to improve drafts before review and encourage more people to review articles because the potentially better stuff is at the top of the list.

John Cummings ( talk) 16:31, 9 April 2018 (UTC) reply

@ John Cummings: thanks for chiming in. I definitely see how ORES scores will help reviewers navigate and prioritize the queue more efficiently. But can you say some more about how the scores could help newcomers navigate the process? -- MMiller (WMF) ( talk) 21:28, 9 April 2018 (UTC) reply

@ MMiller (WMF):, the basic idea would be allowing new users to collaborate with existing contributors and other new users on an article before it is published, making it easier for experienced users to help new users. The ORES scoring would make looking for and helping on drafts easier and a more pleasant experience and so hopefully it would become more popular. This kind of mentoring is very helpful in helping new users to not fall into one of the many traps which leads to poor editor retention. Also if there is a more functional draft space it would I hope be a more used option for AFD, so rather than the content getting deleted it is unpublished to Drafts. This use of a 'not yet' process would I hope encourage a growth based mindset which could help with editor retention.

John Cummings ( talk) 14:53, 10 April 2018 (UTC) reply

Development of tools

See m:Talk:Wikimedia_Foundation_Annual_Plan/2018-2019/Draft#Tools. Kudpung กุดผึ้ง ( talk) 00:58, 10 April 2018 (UTC) reply

Update 2018-04-10

There is a tremendous amount of valuable information on this page, so thanks to all of you. Now that the conversation has died down a bit, I looked back over everything, and it seems to me that three main potential improvements have been discussed most positively. They are listed below in no particular order.

Here is how I think we can proceed from here:

The engineers and other product managers on the Community Tech team here at WMF will discuss these three potential improvements, in order to focus a bit more on where the Foundation could be most useful, and get some very rough scoping on the resources it would take to implement them.
I'll post back on this page with where that conversation ends up, both in terms of specific work that we think could be impactful and that fit into the resources the team has available.
After discussion on this page, we'll narrow in on a specific improvement together.
Then we'll need help from this group to clearly define how exactly the improvement will be implemented.

Please let me know if that process doesn't sound right, or if the three potential improvements below don't accurately capture the discussion.

A: Copyvio and ORES scores

Improvement: Reviewers can use copyvio and ORES scores to prioritize drafts for review.
Benefit: Increase the speed/ease that reviewers can do their workflow AND Help reviewers find more promising drafts sooner.
Notes: Copyvio seems to be a more urgent need, but it may be easy to add ORES if copyvio is being addressed. The review process could benefit most from this information as reviewers are prioritizing drafts, but copyvio/ORES could also potentially be added to the AFCH script to be present during review. This improvement would involve taking a look at tools that have accomplished this in the past, and re-implementing or improving on them. This combines two of the potential improvements listed here.

Yes, copyvio within the script, or better still, automated bot based copyvio searches. jcc ( tea and biscuits) 23:30, 11 April 2018 (UTC) reply

@ Jcc: thanks. Could you say some more about what you mean by automated bot based copyvio searches? -- MMiller (WMF) ( talk) 00:33, 12 April 2018 (UTC) reply

Despite the development of CopyPatrol, which is a fantastic manual interface, it does not match the abilities of CorenBot, which stopped working for a variety of reasons. Best, ~ Winged Blades^Godric 05:40, 12 April 2018 (UTC) reply

I envisage a tick box within the draft template to confirm if the latest revision has been checked for copyright or not- pretty much CopyPatrol but with something within the draft template to let us know if it has been checked yet or not. jcc ( tea and biscuits) 12:53, 12 April 2018 (UTC) reply

I think individually tagging articles with ORES wp10 prediction / draftquality prediction and % score / copyvio% score would go a long way to help here. SQL (Query me!) 23:45, 12 April 2018 (UTC) reply

I think that's too much BEANS. Like TonyBallioni has said elsewhere, during ACTRIAL we had a ton of paid/COI editors come into IRC and some of the less savvy helpers would essentially tell them how to get around our standards. If we say "this article is predicted to be 90% spam" all they'll do is grab a thesaurus and try to cheat the system. Yes, I know a small percentage will actually take the advice to heart, but the majority won't (yes, that's purely anecdotal).

Also, as I've found a few times, the CV results are usually right, but in the few instances where it's not, saying "80% copyvio" when it's just one list that's pinging sows more confusion than it sorts (an annoying fact I had to explain to a fecking 11-year veteran three fecking times).

Basically, I like the stats/tools you've developed, but I don't think they should be posted on the draft itself. Maybe have a link in the "draft pending" template along with the other Reviewer tools we offer. Primefac ( talk) 01:08, 13 April 2018 (UTC) reply

I think there should be a way to see the ORES score for spam, but I also agree with Primefac's concerns re: BEANS. I'd be fine with it for copyvio. Yes, sometimes it is wrong, but that's easy to deal with, and we have a much larger problem with people not checking for copyright than with people overchecking. TonyBallioni ( talk) 01:11, 13 April 2018 (UTC) reply

I had a quick glance at the ORES results for spam and found them to be wanting. Drafts that have a 10% spam score are still likely to be spam. MER-C 11:31, 13 April 2018 (UTC) reply

Perhaps I'm misunderstanding, but the way that I was thinking about it, the copyvio and ORES scores would not be visible to the draft creator -- rather, they would be visible to the reviewer in the AFCH gadget, as a quick guide as they are doing their review. Hopefully, this would avoid the BEANS issue. I would also say that it seems from the conversation on this page that having those scores present during review is actually of secondary usefulness, compared to having a place to use them to prioritize drafts for review (like the page that SQL has created, and that the WMF could potentially collaborate to improve). Does this sound right? -- MMiller (WMF) ( talk) 16:56, 13 April 2018 (UTC) reply

The latter is better - other than a copyvio score (which I'd be running the copyvio check in a separate tab anyway) I don't think I'd go "oh, this has an ORES score of 30% spam, so I'll just decline it as |adv|" - I would give a full review. However, if I wanted to see which drafts were flagging as high spam (i.e. wanting to find potential quick-fail or G11 drafts) then SQL's table would be great. So yes, having the scores in an easy-to-find place would be great, but I don't see myself using that information once I'm actually on the page reviewing. Primefac ( talk) 17:48, 13 April 2018 (UTC) reply

B: Topic routing

Improvement: Route drafts to reviewers who have expertise in the draft's topic.
Benefit: Improve communication between reviewers and authors to decrease iterations AND Standardize reviewing criteria across reviewers AND Help reviewers find more promising drafts sooner.
Notes: Since notability issues are one of the biggest reasons that drafts struggle to get through the review process, the idea is that reviewers with expertise in the topic of the draft can more easily and accurately assess notability. Several people brought up ways this has been attempted in the past or existing data elements that could be leveraged here. This improvement would mean taking a fresh look at how to do this, keeping existing wisdom in mind.

Topic routing sounds good on paper, but I have some doubts about how well it will work in practice. AfC is one area of the project which seems to attract relatively new editors with limited experience, and in many cases, these new editors are still in secondary school or early in their college/university educations. They are beginning to develop a talent for reviewing articles and they're beginning to specialise in a specific subject area, but they're not yet experts and may not have a great deal of experience. There are a number who, sadly, vastly overestimate their own abilities and talents.

I would also add, whilst some of us do have professional qualifications and experience in our chosen fields, Wikipedia is, at least for some of us, an escape, and not somewhere we would necessarily want to visit just to continue our daytime work.

I would think it's an idea that's worth further consideration, but I would caution that it might not quite have the expected impact on AfC. Nick ( talk) 10:18, 11 April 2018 (UTC) reply

Despite agreeing with Nick, I think improvements in this aspect will bring a lot of improvement in the process. ~ Winged Blades^Godric 05:44, 12 April 2018 (UTC) reply

C: Communication with authors

Improvement: Rethink the content, location, and workflow around templates and comments on drafts.
Benefit: Improve communication between reviewers and authors to decrease iterations
Notes: A lot of the conversation has centered around improvements in this area, and some of the content of templates has already been changed over the last week. Other salient parts of this might involve altering the AFCH script to put templates on Talk pages, and potentially improve where and how comments from reviewers to authors are surfaced. This improvement would require us to think about which parts are best suited left to AfC reviewers, and which parts WMF can productively help with. This combines two of the potential improvements listed here.

Query: would C include work on the decline templates? (which also was supported by most users above) or is this one of the bits that might be "best suited left to AfC reviewers"? — Insertcleverphrasehere (or here) 00:37, 11 April 2018 (UTC) reply

@ Insertcleverphrasehere: I was thinking that decline templates were the primary templates that this would affect, since those are the ones that newcomers see and interact with most often. But maybe I am misunderstanding the different types of templates that are part of the AfC process -- please let me know. With respect to which work would be better suited for reviewers (as opposed to WMF), I was thinking it would be the specific wording of the content of the templates. Whereas work to change things like where and when the templates are placed (e.g. Draft page vs. Talk page), are software changes that WMF is better positioned to help with. That said, WMF resources, such as designers and researchers, could also potentially help with the wording of templates. Does this distinction make sense? -- MMiller (WMF) ( talk) 22:01, 11 April 2018 (UTC) reply

Yep, that sounds about right to me. — Insertcleverphrasehere (or here) 22:06, 11 April 2018 (UTC) reply

This has a great deal of potential, but it might require something of the unthinkable - moving away from communicating with authors in English and instead communicating some of the issues to them in their native language. The last couple of years have seen a massive increase in the numbers of editors who have English as a second (or third/further) language, and increasingly, where their English ability is more restricted. They struggle, unfortunately, to understand more complex concepts like our notability and sourcing policies but communicating some information to them in their native language may well aid their understanding. Internationalisation and translation of the AFC notices etc might well produce some significant benefits. Nick ( talk) 09:29, 12 April 2018 (UTC) reply

I think it would be best (for everyone concerned) that these editors get pointed to their native language Wikipedia instead of flailing around here -- there are more opportunities to contribute over there (by virtue of being less complete), and policies (although less developed) are written in their native language. These types of editors often compensate for their lack of English skills through copyvios and are much more likely to be unresponsive on their talk pages, which is a terminal combination. MER-C 13:32, 12 April 2018 (UTC) reply

I agree entirely and I see some reviewers and helpers trying to point editors in the direction of their native language project, but unfortunately the English Wikipedia is seen by many new editors as the project that they simply must have their content published on. I don't know how we get around that outlook. Nick ( talk) 14:27, 12 April 2018 (UTC) reply

Nothing must be published anywhere, and editors with this kind of promotional attitude are simply not the editors that we want to be helping anyway. I have no idea how you expect that an editor who can't read our templates and policy pages can be expected to write articles in English. — Insertcleverphrasehere (or here) 15:09, 12 April 2018 (UTC) reply

A good reason to rewrite declines in simple English. "No" is an easy word to understand. Legacypac ( talk) 15:23, 12 April 2018 (UTC) reply

The irony of Insertcleverphrasehere failing to actually read what I wrote isn't lost on me, so for their benefit, I said we should be helping editors who have difficulty in UNDERSTANDING concepts like notability and sourcing, editors who have English as a second/additional language are frequently capable of writing basic articles to an acceptable standard (i.e to a level which would be accepted through AfC) but may have difficulty understanding our longer policy and guideline pages, which are somewhat complex. That's why they may be submitting content which lacks sources or has sourcing issues, that's why they could be submitting content about a subject which doesn't meet our notability criteria. If we help them along by providing some additional information in their native language, it has the potential to improve the quality of the material they submit, however, it may well be the case that providing a series of policies, guidelines and templates in simple English can eliminate the need to do that. Nick ( talk) 17:31, 12 April 2018 (UTC) reply

The easiest way to divert ESL editors towards their native languages is via Wikipedia:New user landing page and Wikipedia:Article wizard -- "Is English not your native language? Wikipedia is available in all major languages of the world, see HERE for the list". MER-C 20:00, 12 April 2018 (UTC) reply

Selecting drafts for review

I have a general question that would help fill in a gap in my knowledge as I talk to the Community Tech team on our end at WMF: what are the different ways that AfC reviewers are currently selecting articles for review?

I know that some reviewers choose by age category -- does that happen from this page? And do reviewers tend to select newer drafts for review, or older drafts? I also know that Legacypac is finding SQL's page useful. What other methods or tools are used for selecting drafts for review?

The reason I ask is because it looks like a lot of the potential improvements being discussed have to do with prioritizing drafts (copyvio, ORES, topic routing), and I want us to discuss what the right type of page or tool would be to enable the prioritization and selection of drafts.

-- MMiller (WMF) ( talk) 00:03, 14 April 2018 (UTC) reply

Probably just as arbitrarily as New Page Reviewers work through the New Pages Feed. There may be a way of providing the stats you need, I wouldn't know, but Insertcleverphrasehere is quite good at this sort of thing. Kudpung กุดผึ้ง ( talk) 02:14, 14 April 2018 (UTC) reply

@ MMiller (WMF): Well, at NPP it is all user preference, some users review pages as they come in (at the front), some work on the oldest submissions (the back), and some target articles of certain types (i.e. sportspeople etc). I am not sure how AfC reviewers sort by topic but at NPP we have the NPP Browser, a fantastic tool which allows users to sort through the new pages feed for keywords. I wonder if it would be simple to create a parallel tool that does the exact same keyword search process, but for AfC Drafts (or add this as an option to the NPP browser). The tool is maintained by Rentier, who may be able to provide more information. — Insertcleverphrasehere (or here) 02:24, 14 April 2018 (UTC) reply

@ Insertcleverphrasehere: I had not seen The NPP Browser -- that looks like a really interesting tool. Do you and other NPP reviewers use it regularly? And do you know how it sorts articles into topics (e.g. Military, Software, Sport)? Or perhaps Rentier can let us know. -- MMiller (WMF) ( talk) 17:43, 17 April 2018 (UTC) reply

Rentier's NPP browser is fantastic. I prefer it to Special:Newpagesfeed. The keyword search is the most useful. I'm not so interested in its browse function. If it would let me also sort by number of citations, it would be even better. Vexations ( talk) 19:00, 17 April 2018 (UTC) reply

@ MMiller (WMF) and Insertcleverphrasehere: As Vexations says, the keyword tool is probably the most useful feature, along with the sorting. I take the data from https://en.wikipedia.org/w/api.php?action=pagetriagelist If there was a similar API for AfC drafts, it would be very easy to include them in the tool. Similarly, I can easily add a new column to the NPP view if the field is returned by the API.

I don't gather detailed statistics about who uses the tool, but each day there are at least a few dozen queries spread throughout the day, sometimes more.

The sorting into topics is done using a trivial keyword search, where the list of topics and keywords is hardcoded.

Rentier ( talk) 14:33, 21 April 2018 (UTC) reply
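
For illustration only, the "trivial keyword search" described above could look something like the following. This is a minimal sketch, not the tool's actual code: the topic names are taken from the examples mentioned in this thread, while the keyword lists and the `topics_for` helper are hypothetical.

```python
import re
from typing import Dict, List, Set

# Hypothetical hardcoded table of topics and keywords (assumed for illustration;
# the real tool's list is not shown in this discussion).
TOPIC_KEYWORDS: Dict[str, Set[str]] = {
    "Military": {"army", "battalion", "navy"},
    "Software": {"software", "app", "programming"},
    "Sport": {"footballer", "cricketer", "olympic"},
}


def topics_for(text: str) -> List[str]:
    """Return every topic that has at least one keyword appearing as a whole word in text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return [topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords]


if __name__ == "__main__":
    print(topics_for("A Swedish footballer who played in the Olympic Games"))
```

Matching on whole words rather than substrings avoids a short keyword such as "app" firing on unrelated words; a page with no matching keyword simply gets no topic.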

Having watched the stats - some target the front, some the oldest, and some randomly or by subject. Some of us mix it up. Lately I've seen efforts to pick a day and clear it to zero. Some of us look at the title list and cherry pick. If a University comes up for example I'll usually accept it as they are auto notable. I skip football/soccer players. Generally if someone does not review it within 24-72 hours of submission it's likely to sit until it's the oldest, but we do have a Random button and a GFOO button that takes us to new submissions. Legacypac ( talk) 03:33, 14 April 2018 (UTC) reply

The short answer, across the aggregate of users, is "all of them". Some reviewers like myself review from the back of the queue, some review almost exclusively from the front, some use the "review random submission" button, and others as Legacy says look for pages with specific themes (universities, footballers, pages tagged though our new "draft sorting" tool). I will say the former two (front/back of the queue) are by far the largest volume of reviewers, as the 3- to 20-day categories rarely go down significantly before they hit the 3-week cat (with recent exception). Primefac ( talk) 15:34, 14 April 2018 (UTC) reply

There are Category and List links in the Submissions tab for the Wikiproject. These would be the most obvious places reviewers would get to the pending submissions. From there, as others have said, we select what to review in various ways. ~ Kvng ( talk) 02:56, 17 April 2018 (UTC) reply

Replying to this a bit late, but you can filter drafts by subject at toollabs:apersonbot/pending-subs, as well as apply other filters to the drafts. Enterprisey ( talk!) 21:25, 9 May 2018 (UTC) reply

Metrics

From phab:T192515: "Here are the main metrics that we'll be measuring (potentially broken out by some other dimensions):

60-day rolling mainspace rate: for a given cohort of drafts submitted to AfC, what percent of them are in the main namespace 60 days after their submission?
90-day rolling survival rate: for a given cohort of articles that came from AfC, what percent of them are nominated for deletion 90 days after being moved to main namespace?
Quality article waiting period: for those drafts that were accepted to main namespace on their first review, how long did they have to wait for that first review after being submitted to AfC?"

Other metrics could also be proposed, such as

quality of the articles initially submitted for review
- references provided inline
- references filled in via templates such as 'cite web'
availability of feedback from non-reviewers during the waiting period
responsiveness of the draft author to feedback provided at the draft talk page
impact of changing the article submission flow on the likelihood of success (example:
- A) current process vs
- B) split the article creation process into four stages:
  - 1) provide urls of sources;
  - 2) write information about sources such as what is the date and author and information provided therein,
  - 3) write article stub
  - 4) write a longer article - each step involves receiving feedback vs
- C) create article one paragraph long, receive feedback, iterate until the one paragraph is meaningful, proceed to writing the second paragraph,
- D) something else)
length of the submission

What other metrics could be used? -- Gryllida ( talk) 03:51, 19 April 2018 (UTC) reply
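
As a rough sketch of how the three metrics quoted from phab:T192515 could be computed, assuming each draft is a record with hypothetical fields `submitted`, `first_review`, `accepted` and `afd_nominated` (datetimes, or None where the event never happened). The field names are invented for the example, and "survival" is read here as not being nominated for deletion within 90 days of acceptance, which is an interpretation rather than the task's wording.

```python
from datetime import datetime, timedelta
from statistics import median


def mainspace_rate(drafts, days=60):
    """Percent of submitted drafts accepted into mainspace within `days`."""
    if not drafts:
        return 0.0
    window = timedelta(days=days)
    hits = sum(
        1 for d in drafts
        if d["accepted"] is not None and d["accepted"] - d["submitted"] <= window
    )
    return 100.0 * hits / len(drafts)


def survival_rate(drafts, days=90):
    """Percent of accepted drafts NOT nominated for deletion within `days` of acceptance."""
    accepted = [d for d in drafts if d["accepted"] is not None]
    if not accepted:
        return 0.0
    window = timedelta(days=days)
    survived = sum(
        1 for d in accepted
        if d["afd_nominated"] is None or d["afd_nominated"] - d["accepted"] > window
    )
    return 100.0 * survived / len(accepted)


def quality_wait_days(drafts):
    """Median wait in days before first review, for drafts accepted on that first review."""
    waits = [
        (d["first_review"] - d["submitted"]).total_seconds() / 86400
        for d in drafts
        if d["accepted"] is not None and d["accepted"] == d["first_review"]
    ]
    return median(waits) if waits else None
```

The median is just one choice for summarising the waiting period; a mean or a percentile would slot in the same way.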

Thanks, Gryllida, you beat me to it! Those three metrics that I listed on the Phabricator task today are based on two that I identified on the project page, and that have been discussed a bit on this talk page. The idea is that we can tell that we've improved the AfC process if all three of them go up. The first one should go up if it's easier for reviewers to find the drafts that belong in mainspace. The second one can help make sure that the first one isn't going up at the expense of quality. And the third one is another way to tell whether we've made it easier to prioritize those drafts that start out at a high level of quality. Given that two of the three proposed improvements (Copyvio/ORES and Topic Routing) are about helping reviewers prioritize the right drafts sooner, these seemed to be the metrics most relevant to the likely upcoming work.

I hope everyone can weigh in on whether these seem like the right metrics to focus on, and bring up other important metrics (as Gryllida did). One thing that we'll need to keep in mind is that just as we'll have limited engineering bandwidth for improvements, we'll also have limited bandwidth for measurement. -- MMiller (WMF) ( talk) 04:55, 19 April 2018 (UTC) reply

I think waiting time and percentage of submissions eventually accepted are good high-level things to track. I think that both wait time for each review and total wait time for eventual publication are important metrics and shorter is better in both cases and especially the latter. I don't think we should be concerned about percentage of accepted submissions deleted as a higher number here could be considered a good thing or a bad thing depending on which side of the polarized world you're standing on. ~ Kvng ( talk) 15:18, 19 April 2018 (UTC) reply

I track a rather silly number of metrics, if you're looking for somewhere to start. Primefac ( talk) 15:21, 19 April 2018 (UTC) reply

A tad late, but just wanted to drop my thanks to Primefac - answers a lot of questions on AfC metrics I've been interested in. Nosebagbear ( talk) 14:49, 22 July 2018 (UTC) reply

CV bots

{{ Csb-pageincludes}} Why is it that everywhere I mention this bot it stops the discussion dead in its tracks? Seems almost like a conspiracy. Neither Earwig nor Copypatrol do this. Am I missing something? Ping: Diannaa, Doc James, Sphilbrick, TonyBallioni. Kudpung กุดผึ้ง ( talk) 00:18, 20 April 2018 (UTC) reply

Hi Kudpung. If I'm understanding you correctly, you are suggesting that a bot add a template similar to the one CorenBot used to use to each new draft or new article listed at CopyPatrol. I'm not sure what the point of that would be, since all reports at CopyPatrol are cleared within 24 to 48 hours. Please let me know if I'm missing something here. — Diannaa 🍁 ( talk) 13:20, 20 April 2018 (UTC) reply

Just to clarify, CorenSearchBot checked only newly created articles in mainspace; Copypatrol covers that and adds a lot more. It checks all additions over a certain size to newly created articles and newly created drafts, as well as all additions over a certain size to articles and drafts that already exist. This is why Coren was able to retire his bot. — Diannaa 🍁 ( talk) 13:38, 20 April 2018 (UTC) reply

Apologies if I'm putting words-in-mouth, but I think Kudpung's concern is that there's no "obvious" tell if there's copyrighted material added onto a new page such as the notice added by the old bot. Primefac ( talk) 15:03, 20 April 2018 (UTC) reply

Yes I do think that's what he means, but I don't think it is necessary, as all reports are cleared within 24 to 48 hours (for example as of right now the oldest open report is only 16 hours old). I don't think we need to add templates to the listed items; in fact I would prefer if you didn't. To do so would only generate more work for the handful of people (literally a handful; most of the work is being done by three people) that assess the reports. There's about 25% false positives on circa 100-125 daily reports filed.— Diannaa 🍁 ( talk) 17:05, 20 April 2018 (UTC) reply

Thank you, Diannaa. Other than templating the article, which appears unneeded / unwanted, is any other integration with NPP or AfC wanted? Doc James ( talk · contribs · email) 18:49, 20 April 2018 (UTC) reply

Can we just expand whatever copyright process happens in mainspace to cover Drafts? There are only around 6000 ?? active drafts at any given time, and 1700 of those in AfC pending status. Keep it simple. Legacypac ( talk) 18:54, 20 April 2018 (UTC) reply

Copypatrol already covers drafts. MER-C 09:08, 21 April 2018 (UTC) reply

Doc James, by the time Copypatrol gets round to its tasks, the pages at the top of the feed have long since been patrolled and are gaily on their way to Google. I know what I'm talking about - I patrol from the front of the queue and have to check every article laboriously with Earwig. Obviously anyone who doesn't understand has never patrolled new pages while Coren was working. There is no logical alternative to CorenSearchBot and I do not understand why every mention of it gets swept under the carpet. Kudpung กุดผึ้ง ( talk) 11:28, 21 April 2018 (UTC) reply

All new articles should be manually checked for copyvio regardless, should they not? I wouldn't recommend skipping this step. Or, you could try visiting https://tools.wmflabs.org/copypatrol/en and see if the article is listed there, as the bot will immediately post anything it finds. — Diannaa 🍁 ( talk) 11:40, 21 April 2018 (UTC) reply

From what I understand Copypatrol does run rapidly after the edit is made. I am not sure if it would be possible to add tags to the feed such as "pending review", "passed review" or "concerning review" or something similar?

Agree User:Kudpung that you have a much better understanding of the process. What would you like to see? Doc James ( talk · contribs · email) 12:23, 21 April 2018 (UTC) reply

Copypatrol is fine as long as it's used for what it's best at: locating snippets of COPYVIO 24 or 48 hours later. But the good, clean, perfectly formatted and sourced article that some PR merchant has dropped into the mainspace feed after a carefully calculated 4 days and 10 edits to avoid the scrutiny of ACPERM is nothing more than a verbatim copy of his client's web site. It looks good to the over-eager patrollers who are hovering over the front of the feed with their mouses ready to pounce and check it as 'patrolled'. Within 30 seconds it's referenced and indexed by Google's resident Wikipedia bot and the SEO and PR agents are laughing all the way to the bank.

I simply want to know why we can't have Coren back - or something like it. Or is it that it was so long ago that no one remembers what I'm talking about? I pasted a copy of its template above as an example. What Earwig can do at the touch of a button and a 20 second wait could be done automatically and save a lot of time. It's still quicker than using Copypatrol - NPPers won't go there - who wants to cross the street and wait 24 hours for a coffee when there's a faster coffee machine in one's own lobby, but the only thing wrong is that it's broken and nobody can be bothered to fix it?

For Copypatrol to have any meaning at NPP it needs to scan and tag a new page within about 25 seconds, because that's how long the low hanging fruit stays in the feed before it gets patrolled. Diannaa, if you have teams of users that can do that manually then there is no problem, but you'll need at least 600 because that's how many reviewers we now have and even they can't cope doing it manually. What I'm saying is, in the time you take to check one article with Copypatrol, you might as well patrol the page per NPP anyway. Kudpung กุดผึ้ง ( talk) 13:11, 21 April 2018 (UTC) reply

CorenBot died when the Yahoo API was shut down. We might be able to use another API. Will ask.

User:Kudpung My proposal was NOT to send NPPers to CopyPatrol but to bring some CopyPatrol details within the NPP feed. Doc James ( talk · contribs · email) 15:30, 21 April 2018 (UTC) reply

Yes, James, I fully understood that and it is also what I would like, of course. What I would ultimately like to see is a bot automatically tagging the new pages as Coren did, and the tag should have the __NOINDEX__ tag in it. Kudpung กุดผึ้ง ( talk) 02:53, 26 April 2018 (UTC) reply

@ Doc James:-- As you say, once Yahoo's search API went down, Coren-Bot encountered its first problems, for it was entirely dependent on it to search web pages. But, AFAIS, there's no need to look for another API :)

After the BOSS shutdown, the legal and ComTech folks spent a lot of time trying to get into a partnership with major search engines (see T125459) and ultimately settled on Google, to use their search API. Consequently, CBot was upgraded to the new API (see T131169) and it was up and running. But the change of API seems not to have gone smoothly, as the bot crashed for unknown reasons soon afterwards.

In the meantime, Coren becoming increasingly inactive on en-wiki compounded the problems. The bot soon crashed for a second time, after a series of wrong edits (or he might have terminated it). Since that crash, he has gone radio-silent on the entire issue and, AFAIR, Crow and some WMF folks, despite having repeatedly asked him to spare some time for restarting the bot ~~and/or to facilitate a retrieval of his bot code, so that some other volunteer developer could put that to work (without re-writing it entirely),~~ have not heard anything back.

If the WMF has some insider contact with Coren (he worked for them at one time), you may choose to leverage that, and maybe the code could be put to work directly, with some minor modifications but without the expenditure of an entire set of new heads....... ~ Winged Blades^Godric 15:56, 26 April 2018 (UTC) reply

I am just at a total loss to understand why we can't have something like CorenBot back. It's a tool that strictly speaking the WMF should be responsible for developing and maintaining, especially as COPYVIO carries legal implications. But they always pretend to be understaffed and have too little financial resources. I think that is a myth and moreover, there should be a permanent WMF team looking after these issues instead of allowing clueless volunteer MediaWiki devs from other projects to class our urgent requirements at Phab as low priority. Kudpung กุดผึ้ง ( talk) 16:19, 26 April 2018 (UTC) reply

User:Winged Blades of Godric thanks for the additional details. Did not realize this part of the story. Does this mean that his bot code is not available? Doc James ( talk · contribs · email) 21:28, 26 April 2018 (UTC) reply

@ Doc James:-- Scratched a part. See this link. Best, ~ Winged Blades^Godric 08:03, 27 April 2018 (UTC) reply

Ah great. So does this mean someone could reactivate if they wish? Doc James ( talk · contribs · email) 16:58, 27 April 2018 (UTC) reply

Not so easily :) I've been thinking about this stuff and the chances of a re-run, and it'll take a lot of time (if feasible). But very experienced botops like @ SQL: would be of immense help as to the prospects et al :) ~ Winged Blades^Godric 17:14, 28 April 2018 (UTC) reply

Update 2018-04-25

Thank you for continuing the discussion as the Community Tech team took some time over the last couple weeks to think through different approaches. We have an approach that we think could be really good (heavily influenced by community conversation on this page and elsewhere), and I’d like to get everyone’s thoughts on it.

We narrowed down to a top three potential improvements in the April 10 update above, and it currently looks like the Community Tech team could be most impactful with the first one: Copyvio and ORES scores. The goal would be to help AfC reviewers use those scores to prioritize drafts for review, by doing things like finding the most likely copyright violations to delete first, or finding the likely B-class drafts to review (and hopefully accept) those first.

To accomplish this, AfC reviewers would need a place where submitted drafts are sortable and filterable by those classifications, like the proof-of-concept built by SQL and John Cummings. The Community Tech team thinks we could do this in the following way. This idea will not be new to Legacypac, Kudpung, and others, who have been talking about it for the last couple weeks:

Use the existing New Pages Feed interface currently employed by NPP.
Add elements that show the copyvio and ORES scores for each submitted draft awaiting review.
Make the feed sortable and filterable by those data elements.

Once AfC reviewers select a draft for review, they would use the existing AFCH script as usual.
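
To make the three steps above concrete, here is a minimal sketch of sorting and filtering submitted drafts by those two data elements. The `Draft` record, its field names, the 0-to-1 copyvio score, and the thresholds are assumptions made for the example; they are not the actual feed fields or the real ORES/copyvio output formats.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    title: str
    copyvio_score: float  # hypothetical: 0.0 (no overlap) to 1.0 (verbatim copy)
    ores_class: str       # hypothetical predicted class, e.g. "Stub", "Start", "C", "B"


def likely_copyvios(drafts, threshold=0.5):
    """Drafts at or above the threshold, most suspicious first."""
    flagged = [d for d in drafts if d.copyvio_score >= threshold]
    return sorted(flagged, key=lambda d: d.copyvio_score, reverse=True)


def likely_acceptable(drafts, classes=("B", "GA", "FA"), max_copyvio=0.5):
    """Drafts predicted to be in a high quality class and below the copyvio threshold."""
    return [
        d for d in drafts
        if d.ores_class in classes and d.copyvio_score < max_copyvio
    ]
```

A reviewer hunting copyright violations would work down the first list, while one looking for quick accepts would start from the second.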

The most straightforward way to implement this would be as an expansion of the current Special:NewPagesFeed page, basically making it possible to filter both new articles and submitted drafts from the same place. That said, it is also possible to implement this separately just for AfC, so as not to alter the current NPP process. I would like to get an initial read from New Page Reviewers who have been a part of this discussion and then hopefully get thoughts from the broader NPP project.

We see several advantages to the approach of expanding the New Pages Feed to include AfC and these new data elements:

Possibility of extending the improvements to NPP, as well as AfC.
Allows AfC (and NPP) to address copyvio issues as they are doing review.
Allows AfC (and NPP) to prioritize review for those articles that are likely of the highest or lowest quality.
Avoids building another new interface that would need to be maintained and improved.

Before adding more detail or going further down this road, I think it’s important for the group to answer the following questions. In particular, I’m hoping for thoughts from Legacypac, Kvng, Primefac, Robert McClenon, and Dodger67:

Will this be a substantial improvement to the AfC process?
Would you use this to prioritize submitted drafts for review?
Does this solve the copyvio problems that exist in the absence of CorenBot (albeit in a different way)?

And then for the New Page Reviewers: would these changes be a welcome addition to NPP?

Once we settle on how to proceed, the Community Tech team will start fleshing out the details, and we’ll need input from everyone here.

-- MMiller (WMF) ( talk) 17:39, 25 April 2018 (UTC) reply

I think the strongest thing going for this proposal is the potential to bring AfC and NPP closer together. Many consider this a long-term goal and this would be an incremental step there that would potentially benefit both projects. And now some answers:

Significant but not necessarily substantial
Accepting drafts is my favorite thing to do at AfC and I could use this to find good candidates if I need a boost. I probably will continue to review the oldest drafts in the queue because fairness is good and I like a challenge.
To have a substantial impact on the CV portion of reviewing, CV information should be available on the draft itself or through AFCH script. Having it in the NPP interface only helps those who work from there. I haven't done much NPP work because it felt soul crushing but if the NPP interface lets me choose random submissions from near the back of the queue, I could switch to using that instead of our age-based AfC submission categories. ~ Kvng ( talk) 18:09, 25 April 2018 (UTC) reply

I just reminded myself of how the NPP interface works. It looks like it almost does what I need as far as selecting articles to review. It does only have oldest and newest sorting options. With only those two options, I'm curious how reviewers avoid stepping on each other's toes potentially duplicating efforts by reviewing the same articles at the same time. ~ Kvng ( talk) 21:14, 25 April 2018 (UTC) reply

Comment the existing NPP feed already includes new articles, redirects and userspace pages. Adding Drafts only makes sense. User:Insertcleverphrasehere has been working to get AfC reviewers to get the NPR PERM. When the software people switch on WP:ACREQ we can expect some burden to shift from NPR to AfC Drafts so in time I hope some NPRers will handle some of the drafts too. Legacypac ( talk) 19:12, 25 April 2018 (UTC) reply

Kvng, one of the requests on the WMF devs (see WP:PCSI) is to improve the filter criteria in the user prefs on the Special:NewPagesFeed. With submitted drafts being featured in the feed, this would provide those who prefer to work on drafts with an excellent opportunity to see at a glance what needs to be processed, and choose their order of doing it. Their normal 'accept' and 'reject' templates with their built-in sub-routines would be adapted for the Curation toolbar. It would be a win-win solution. Kudpung กุดผึ้ง ( talk) 23:53, 25 April 2018 (UTC) reply

@ Kudpung: using the current capabilities of the tool, do you know how NPP reviewers avoid the issue of reviewing the same articles at the same time? ~ Kvng ( talk) 13:43, 26 April 2018 (UTC) reply

Kvng. They don't. It never happens. Sometimes, but still rarely, there is an edit conflict when newbies are trying to patrol pages with Twinkle at the same time as an authorised Reviewer is working from New Pages Feed, but it doesn't show as an edit conflict. The browser just hangs for a few seconds. I understand why you are asking, but I don't consider it to be an issue if AfC and NPP share the same feed. I assume that by and large, NPPers and AfCers will mainly continue to work in their preferred areas. Kudpung กุดผึ้ง ( talk) 14:15, 26 April 2018 (UTC) reply

@ Kudpung: when I review, I work in batches, typically 3 submissions at a time. It can take me 30 minutes to finish these three reviews. This seems like a big enough window that if there is another reviewer working at the same time from the back of the queue, we could easily end up reviewing the same articles at the same time. Why should I not worry about this? Do you have a proposal for a better AfC workflow than what I've described? ~ Kvng ( talk) 15:41, 26 April 2018 (UTC) reply

If you're talking about AFC, then you should be marking the draft as under review, which will significantly reduce the chances of an edit conflict. Primefac ( talk) 15:47, 26 April 2018 (UTC) reply

@ Kvng: I understand. You are talking about AfC where the reviewing process takes longer due to offering the submitters detailed help and even sometimes doing a WP:BOGOF for them. At NPP, the essential is a crude but effective triage. Those that are beyond repair are euthanised. Those that are basically OK get tagged and the author told to attend to it. Those that are basically OK but certainly not fit for mainspace get sent to Draft. Those that are OK, get, well, patrolled as OK. If one does all the checks required at WP:NPP, an experienced reviewer who intuitively recognises a COPYVIO, PE, COI, SOCK, etc can generally do all this in 3 - 5 minutes unless they get on to a trail of socks that needs investigating or a case that needs posting at COIN. Note that a single sock can take up to a day to unravel. That's not the kind of work the AfCers generally do. Kudpung กุดผึ้ง ( talk) 16:51, 26 April 2018 (UTC) reply

@ Primefac: marking a draft as under review would be a very simple feature to incorporate. Really easy for the devs. Let's make sure we get it. Kudpung กุดผึ้ง ( talk) 16:51, 26 April 2018 (UTC) reply

AFCH already has that functionality. See {{ AFC submission/reviewing}}. Primefac ( talk) 17:03, 26 April 2018 (UTC) reply

I'm aware of that but it is extra steps and so I stopped using it. Instead I randomize my selection to avoid collision. I could go back to marking before reviewing but the extra steps erode the efficiency gains we're trying to get here. I can just assume everything will be fine but if it turns out it's not, it's a bummer. ~ Kvng ( talk) 20:06, 26 April 2018 (UTC) reply

I'd like to see All Drafts included as I regularly find both delete worthy and promote worthy pages in Draft space. All Drafts can benefit from collaboration and tagging (including G13). It would probably be easier to include all drafts anyway, than to exclude unsubmitted ones. We should of course have a filter for AfC submitted Drafts. Precedent is that all Userspace pages are placed in the tool now and all the same problems found in userspace are found in draft space. Legacypac ( talk) 01:44, 26 April 2018 (UTC) reply

Wouldn't we all, Legacypac, but we also need to find easier policies/guidelines for deleting the trash. User pages and mainspace are not lumped together in the feed. They are distinct filter options in the preferences. However, and I've asked them umpteen times, the devs should create far more granular choices. Kudpung กุดผึ้ง ( talk) 14:21, 26 April 2018 (UTC) reply

I would imagine, programming wise, that getting CV scores for new pages would be essentially free if you get them for AFC submissions. So, there is no harm in including it for NPP and would be beneficial if copyvios get found and deleted quicker. MER-C 12:08, 26 April 2018 (UTC) reply

Yes, MER-C, but not quite. Prioritywise, putting AfC before NPP is putting the cart in front of the horse. NPP needs an effective COPYVIO filter far more urgently. NPP generally processes new mainspace articles within minutes, especially CORP and BIOS. There's no such urgency at AfC. More reason for the two processes sharing a common interface rather than the WMF developing it twice and prioritising the wrong one. AfC's problems are inconsistent reviewing and too few reviewers - these are social, not software issues.

I believe James is looking into this for COPYVIO, for one reason: company PR people are notorious for including chunks of their company's websites and other PR material. The 'professionals' who blatantly exploit Wikipedia as a lucrative career don't have to go through AfC, they know how to make 10 edits and patiently wait 4 days; editors at both processes are notorious for not systematically doing COPYVIO checks (the rate at which some of them patrol is proof of this because Earwig sometimes takes up to 30 secs to load its results). Kudpung กุดผึ้ง ( talk) 14:50, 26 April 2018 (UTC) reply

I agree that NPP needs an effective copyvio filter. What I'm saying is that there is no excuse for the WMF not delivering copyvio links to both NPP and AFC workflows at the same time, especially when they (as above) propose to use common software. If you add a copyvio link for AFC mode to the software, it should be trivial (no more than a few minutes or even less effort, provided that the software is well written) to add it to NPP mode as well -- and vice versa. MER-C 15:04, 26 April 2018 (UTC) reply

The time it takes to load Earwig sucks. Just display the score and we can check the high scores. I've taken to noting I've checked copyvio on nearly every draft I touch. Legacypac ( talk) 16:04, 26 April 2018 (UTC) reply

As has been mentioned multiple times, Legacypac, please do not do that. I'm not going to reiterate all of the reasons why this should not be done. We don't need a note saying that copyvios have been checked (especially if the person checking misses something). Primefac ( talk) 16:07, 26 April 2018 (UTC) reply

Accusations like that are exactly why I document what I'm doing. Legacypac ( talk) 16:10, 26 April 2018 (UTC) reply

MER-C, Kudpung, and Legacypac -- thanks for commenting. One thing I want to make sure is clear as the discussion continues: this proposal would add a copyvio filter to both the AfC and NPP workflows simultaneously, because we would be adding Drafts to the existing NewPagesFeed and adding the copyvio (and ORES) information to the feed as a whole. We would only separate the two workflows if the NPP community wanted that, which it does not currently sound like is the case. Does that make sense? -- MMiller (WMF) ( talk) 16:18, 26 April 2018 (UTC) reply

It does indeed make sense, Marshall, because it's exactly what we've been asking for. What we do want however, is some collaboration with the actual developers with screen shots of what they propose doing. No one knows better what we need than we editors who do the reviewing work. In the past, the WMF have come up with excellent solutions following multiple meetings and video conferences, but when they launched the GUI it was lacking in essential features and they have declined to address them ever since. So please insist that the WMF and the Community work together on this.

Ultimately of course, what is really needed is a bot to replace Coren, as well. That would ensure that even with the laziest reviewers, the article itself gets promptly tagged. Kudpung กุดผึ้ง ( talk) 16:32, 26 April 2018 (UTC) reply

Reiterating - with direct answers

Unsurprisingly, from the practical aspect, the Marshall Plan is almost exactly what I have been suggesting for well over 2 years. Very surprising however, from the WMF-Wikipedia political aspect, and I highly appreciate this as another effort to close the gap between the devs and the community and in our efforts to produce clean encyclopedias. To recap on Marshall's findings:

The Community Tech team could be most impactful with the first one: Copyvio and ORES scores. The goal would be to help AfC reviewers use those scores to prioritize drafts for review, by doing things like finding the most likely copyright violations to delete first, or finding the likely B-class drafts to review (and hopefully accept) those first.
AfC reviewers would need a place where submitted drafts are sortable and filterable by those classifications, like the proof-of-concept built by SQL and John Cummings. The Community Tech team thinks we could do this in the following way:

Use the existing New Pages Feed interface currently employed by NPP.
Add elements that show the copyvio and ORES scores for each submitted draft awaiting review.
Make the feed sortable and filterable by those data elements.
Once AfC reviewers select a draft for review, they would use the existing AFCH script as usual.
The most straightforward way to implement this would be as an expansion of the current Special:NewPagesFeed page, basically making it possible to filter both new articles and submitted drafts from the same place. That said, it is also possible to implement this separately just for AfC, so as not to alter the current NPP process.

The WMF sees several advantages to the approach of expanding the New Pages Feed to include AfC and these new data elements:

Avoids building another new interface that would need to be maintained and improved.
Possibility of extending the improvements to NPP, as well as AfC.
Allows AfC (and NPP) to address copyvio issues as they are doing review.
Allows AfC (and NPP) to prioritize review for those articles that are likely of the highest or lowest quality.

The questions the WMF is asking are:

Will this be a substantial improvement to the AfC process?
Would you use this to prioritize submitted drafts for review?
Does this solve the copyvio problems that exist in the absence of CorenBot (albeit in a different way)?
For the New Page Reviewers: would these changes be a welcome addition to NPP?

My answers to these questions are, predictably, yes, yes, yes, and yes. Kudpung กุดผึ้ง ( talk) 03:49, 29 April 2018 (UTC) reply

What about implementing ACPERM itself

I have been looking around to try to find out what is the timeline for the WMF to make ACTRIAL permanent, as the community said it wanted in the RfC. I see a bunch of discussion about other things (discussions which are likely to take a long time to work out).

What is going on with respect to simply re-instating the changes that made ACTRIAL function? Or has that already happened?

Pinging User:Kudpung, User:TonyBallioni, User:‎MMiller (WMF). Sorry if I missed the place where this is stated somewhere but as I said I looked and have not found it. Jytdog ( talk) 17:55, 25 April 2018 (UTC) reply

There was a bit of a tussle this morning with some volunteer devs from other wikis (long story) but the Community Tech team will be making it their spring project the week of 30 April 2018, so hopefully soon. TonyBallioni ( talk) 17:58, 25 April 2018 (UTC) reply

Thanks TonyBallioni. Is that stated by them somewhere? If so I think it would be good to link to it a) in User:Kudpung's Signpost story, and maybe at the ACTRIAL talk page and talk:AfC... Thanks again. Jytdog ( talk) 19:20, 25 April 2018 (UTC) reply

Jytdog, I should have said “sprint” above. Typo. See T192455 for more details. I’ll also note here that the WMF staffers who were aware of the RfC have been very helpful, and the issues today were caused by volunteer devs who have no idea of what is happening on en.wiki. Once I emailed Danny Horn the situation was resolved very quickly. TonyBallioni ( talk) 19:59, 25 April 2018 (UTC) reply

Thanks for that! User:Kudpung do you want to add something like " The WMF has said that ACPERM will be implemented the week of April 30, per T192455" to your signpost? Jytdog ( talk) 20:09, 25 April 2018 (UTC) reply

No, Jytdog, I think I'll leave it as it is. If there is going to be a broken promise, this would be something for my article in the May issue of Signpost. The Phab discussion has once more degenerated into a WMF vs Community slanging-match and there is no need to link to it. Again, if necessary, something for the May issue. Kudpung กุดผึ้ง ( talk) 23:57, 25 April 2018 (UTC) reply

OK thanks for considering it. Jytdog ( talk) 01:03, 26 April 2018 (UTC) reply

Rework of AfC review system to be primarily a notability check with feedback to the author.

Ok so I've been mulling it over, and I think I have a way to make the review system fair without putting undue extra work on reviewers. The current system forces new editors to demonstrate notability, which they often do not have the experience or expertise to do, and they end up spamming the page with bad sources that obscure any good ones they may have found, doubling the work of reviewers. Instead the reviewer should primarily review the topic first, searching for sources and giving feedback on the notability level of the topic, and only if notable do they review for POV, structural stuff, etc. Articles that violate G11 or G12 or any other speedy criteria would be CSD tagged as usual. The system would work thusly:

The AfC reviewer gives a rating as to the notability of the topic first (i.e. notable/likely notable/likely non-notable/not notable), reporting this to the author. If notable and demonstrated to be notable with sources, the article is immediately published (same as now).
If assessed as 'non-notable' or 'likely non-notable', the review ends here and is declined. Authors may request a second opinion (via a button), but on the second non-notable assessment by a different AfC reviewer (verification) this button would be removed from the template.
If the assessment is 'notable' or 'likely-notable', but the submission is not supported by references, the reviewer reviews the submission itself and provides feedback as to structural problems/current sources/other sources available etc, and marks the page as 'pending edits by author' with a button "notify reviewer that you are done".
If the author abandons a draft on a topic assessed as 'notable' or 'likely-notable', then it should no longer be eligible for G13, although I suspect that there would be some editors that would be keen to rescue drafts assessed and tagged as 'notable' anyway, so this likely isn't a concern.
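The four steps above can be sketched as a small decision function. This is only an illustration of the proposal as worded here, not an existing tool: the names `Notability`, `review_outcome` and `g13_eligible` are hypothetical, and the handling of a 'likely notable' draft that already has references (treated here as still needing feedback) is an assumption, since the proposal does not spell that case out.

```python
from enum import Enum


class Notability(Enum):
    NOTABLE = "notable"
    LIKELY_NOTABLE = "likely notable"
    LIKELY_NON_NOTABLE = "likely non-notable"
    NOT_NOTABLE = "not notable"


NON_NOTABLE = (Notability.NOT_NOTABLE, Notability.LIKELY_NON_NOTABLE)


def review_outcome(rating, has_supporting_refs, prior_non_notable_declines=0):
    """Return (action, second_opinion_button_shown) for one review.

    prior_non_notable_declines counts earlier non-notable assessments
    by other reviewers (hypothetical bookkeeping, not an existing field).
    """
    if rating in NON_NOTABLE:
        # The review ends here; the second-opinion button is removed once
        # a second reviewer has also assessed the topic as non-notable.
        return ("decline", prior_non_notable_declines == 0)
    if rating is Notability.NOTABLE and has_supporting_refs:
        # Notable and demonstrated with sources: publish, same as now.
        return ("accept", False)
    # Notable or likely notable but not yet demonstrated: give feedback.
    # Assumption: 'likely notable' with references also lands here.
    return ("pending edits by author", False)


def g13_eligible(rating):
    """Abandoned drafts on (likely) notable topics are exempt from G13."""
    return rating in NON_NOTABLE
```

A reviewer tool built along these lines would only need the rating and a yes/no on sourcing to pick the template; everything else (POV, structure) is feedback attached to the 'pending edits by author' state.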

This system would greatly reduce the issues of the current situation where many notable topics end up abandoned by creators that don't have the skills to research references, and would also transfer AfC to an advisory role (which was the original intention of the project), rather than a gate-keeping role. However, it would still serve as a firebreak against non-notable topics being transferred to main space.

Any thoughts or ideas on advantages/disadvantages or potential pitfalls to this system are welcome. Cheers, — Insertcleverphrasehere (or here) 21:50, 26 April 2018 (UTC) reply

Optimist! I seldom see anything that is notable or likely-notable, and I can generally tell just by looking at it without the need for a review tool. I will comment more shortly. Robert McClenon ( talk) 19:08, 29 April 2018 (UTC) reply

An AfC draft doesn't get transferred to mainspace - except by a reviewer who is supposed to know what they are doing - and then its next stop is the terse but more efficient triage at NPP before it gets published. NPP will always be the gatekeepers and while the errors produced by its reviewers are exasperating, they are nevertheless nowadays quite rare considering the number of pages that have to be processed.

I don't immediately see how these suggestions would change anything crucial - in fact in my view they turn AfC even more into the kind of triage system that NPP is. The problems at AfC are social ones i.e. inconsistency and often poor reviewing, and too few reviewers. Proposed software changes i.e. sharing the New Pages Feed and the Page Curation tool, give AfC something it needs but still doesn't have, but they will still need to get their ducks in a row instead of swimming hither and thither. For want of an analogy: Unlike the simple vehicles I drove back in the 60s, today's cars are bundles of enormously sophisticated technology; with their airbags, satnav, ABS brakes, and all-round cameras, they have improved the safety and convenience of the users, but it still doesn't make them better drivers.

Sharing the NP Feed will enable NPPers to do reviews of drafts too (if they want to), cutting out the double check which is still very much required, and helping to cope with the slightly increased workload there. The look of the current AfC templates may change slightly (less obtrusive) and the wording be improved (:sigh: why do all Wikipedia warning/advice templates have to be so TL;DR?), but essentially, the embedded functions will remain intact. The human element at AfC is important and the reviewers still have to decide on the course of action for each draft.

Finally, a lot of the submissions are just the very same kind of dross that NPP in pre-ACTRIAL times had to handle 100s of times a day. AfC patrollers need to be more ready to have such stuff immediately deleted as it would have been at NPP and not pretend to the creators that such junk is salvageable. That takes just seconds and negates the arguments of those who are worried about AfC being flooded by NPP's junk. Kudpung กุดผึ้ง ( talk) 01:15, 27 April 2018 (UTC) reply

I kind of like this idea, but I think rather than a complete overhaul of the process we could do most of that with the decline templates, e.g. if it's a non-notable-never-gonna-happen draft we could have a decline notice that reflects it and discourages resubmission.

Maybe we need to create a "Drafts for Deletion" (DfD) variant of XFD which does allow us to delete non-notable crap out of the draft space without flooding MFD. Primefac ( talk) 12:11, 27 April 2018 (UTC) reply

Several people, including me, have suggested that in various places. As a direct clone of AfD, it would allow speedy closures too if someone had been wasting everyone's time by hoping for a debate over an obvious non starter or piece of nonsense. Kudpung กุดผึ้ง ( talk) 13:07, 27 April 2018 (UTC) reply

If we can't use MfD as Drafts for Deletion (and I can't understand why not) then we need another AfD-like process that takes over 90% of MfD. Maybe proposing DfD pointing out WP:NMFD will push the point that in fact MfD should have the AfD search functions and consider notability. We rolled out Template:NSFW for the hopeless. I accept anything proven to be notable and not CSD worthy. The edge cases need more work, preferably by the editor who wrote the page (ie decline as not shown to be notable or no sources) Legacypac ( talk) 15:58, 27 April 2018 (UTC) reply

You can’t understand why not? You could try reading others’ explanations in the RfC that established NMFD. Many reasons. For me, a compelling reason is the unworkability of having an experienced Wikipedian committee review of every sub-notable draft. My more-workable idea is to put the onus on the submitter to provide 2-3 sources that demonstrate notability. — SmokeyJoe ( talk) 23:39, 28 April 2018 (UTC) reply

User:Joe_Decker/IsThisNotable is an excellent start. Make the newcomer draft writers work through that, throw away 90% of the TL;DR AfC currently throws at them. — SmokeyJoe ( talk) 23:45, 28 April 2018 (UTC) reply

Article wizard

I've just visited the Article wizard. The steps to getting started are:

Option for sandbox edits to learn wikicode
Warning that unreferenced submissions will be rejected
Warnings about COI, copyright violations and promotional language

It doesn't get into the topic of notability. I'm not sure when an author learns about that. At first rejection?

We need to add a step to the wizard that discusses notability. It's not necessarily an easy topic to explain briefly so at this stage in the wizard we could offer the option to receive a notability assessment of a proposed topic by an experienced reviewer as described above. ~ Kvng ( talk) 13:00, 27 April 2018 (UTC) reply

@ Drewmutt: this was your baby. Primefac ( talk) 13:03, 27 April 2018 (UTC) reply

@ Drewmutt: what you are looking for is this: Category:Wikipedia article wizard. This is what comes of a) over-simplification, and b) not doing things as a team. Done properly, we might have even less drafts being submitted. Take a good look while you rework and add back all the missing pages - your basic graphic design was actually very good (probably the reason nobody was checking). Kudpung กุดผึ้ง ( talk) 16:26, 28 April 2018 (UTC) reply

To be fair, it does mention notability but so subtly that most would miss it... at the bottom of Wikipedia:Article_wizard/Referencing it does link to notability. I would think it should be a page on its own between the Welcome and Referencing pages as it is fundamental to acceptance. Cheers KylieTastic ( talk) 16:38, 28 April 2018 (UTC) reply

Heya @ Kudpung: and @ Kvng:, I remember some thoughts about this topic during the redesign discussion. It's a difficult and, inevitably, a losing balance between the wall of GNG, and the oversimplification of it. Compounded by the fact that we can't even agree what it is. I'm not against getting more into it during the wizard, but from the beginning my thoughts were of more actionable/concrete things, more like "don't do this and this as that's why we commonly reject things". Aside from rehashing WP:GOLDENRULE, I can't think of a more simple way of putting it. The GNG is, by design, subjective (a topic for another day), so aside from saying "Hey do your best, don't spam, and cross your fingers", I'm not sure what else we can add.

I've always wanted a silver bullet to split spammers, and well meaning people apart. I know it's an impossible goal, but I also don't want to bog down people who are writing about a new butterfly species with incomprehensible policy. I have a likely unpopular rant around this, but maybe the solution is more of a branched approach? Are you writing about:

A scientific topic
A person
A business (immediately adding their submission to AfD ^_^)

And so on.. Then giving them more specific SNG flavored requirements. Also @ Kudpung:, thanks for bringing up the uselessness of the verbiage on our templates. It's been something I've been meaning to address, so I'm glad I'm not alone. Drewmutt (^ᴥ^) talk 22:13, 28 April 2018 (UTC) reply

No kidding. I think it would take more than two hours to read all linked pages in bio comments on the template that we use to decline a biographical article at AfC. Vexations|Drewmutt|Primefac ( talk) 22:28, 28 April 2018 (UTC) reply

@ Kvng, Vexations, Drewmutt, and Primefac: The problem is that the 'RfC' for changing the Wizard was never a consensus on Drewmutt's finished product. The Article Wizard is too important to be left to guesswork and something that just looks nice. I'm not married to the 30 hours of work I put into my iteration at Category: Wikipedia article wizard that was brushed aside without so much as even a word, but I was hoping that it would be the basis for discussion on better graphics and further pruned text. Each and every page of that iteration should have been the subject of a discussion, including, if possible, with people who have majored in Communication Studies and people like Vexations who can review the boilerplate texts from the perspective of non-Native English users - let's not forget (as NPPers are only too aware) that a vast number of new pages are created by people with a very limited knowledge of the language, and that means leaving out idiomatic expressions like 'get the hang of it' - 'het onder de knie krijgen'.

All the TL;DR policy and guideline pages can of course be left unlinked - our bureaucracy has no more effect on what people do than the never-read set of T & C on an online store. The Wizard pages themselves, however, should be informative, descriptive, and sufficiently 'prescriptive' to deter those who come along with a total WP:CIR or the sole purpose of planting spam or making a fast buck out of our volunteer work, while providing useful tips for those with honest intentions and ideas for truly useful new articles. That means retaining a lot more of the interactivity and pages of earlier iterations of the Wizard that Drew radically axed - people actually do like pushing buttons (that's why video games are such a huge success), or at least until they reach the page that says "sorry pal, game over."

Now that we have 6 months of ACTRIAL experience to draw from, and now that ACREQ is permanent, there is time to work on this properly and do some A/B testing. Now that we've done the 'A' part for six months, I suggest we go back to my version for a month, then compare the stats, and then work together on a final version. The Wizard is not the exclusive property of the AfC project. It needs workshopping on a dedicated page such as, for example, 'Wikipedia article wizard/development'. There's no hurry, it took 6 years to get ACREQ, nothing can be worse than it was - and let's not forget to add the new reworked Wizard page versions to a category... Kudpung กุดผึ้ง ( talk) 01:34, 29 April 2018 (UTC) reply

I don't know if this was the issue with others but I would much rather work on incremental improvements to the existing wizard than try and review and A/B test a complete overhaul. If we can't add a WP:GOLDENRULE step to the wizard, at least I now know that, under the current system, the first review is really going to be when authors find out about notability. ~ Kvng ( talk) 02:04, 7 May 2018 (UTC) reply

The problem, Kvng, is that there already has been a total and complete overhaul of the Wizard: Drewmutt's. I did a mild makeover in preparation for ACTRIAL which could hardly even be classed as a major upgrade, but a few days later and unanounced, Drewmut deprecated many of the Wizard pages, replacing what was left with very barebones instructions that just invite clicking through. Our A, B versions are there, we have had 6 months of version Drewmut (which were never even categorised), now let's revert to my version (which wasn't too removed from the original version that's served us for years) just for a month and then compare results. At least the Wizard page would provide a brief note about notability which Drew's iteration completely ignored. And in the meantime, I'd be happy to work with a couple of users who like me have pursued formal studies in communication which take into account a reader's cognitive perception of a set of instructions. The very uselessness of the verbiage on all our Wikipedia templates is due to the lack of an expert approach to effectively helping new users - even the majority of the Welcome templates are enough to scare a newbie off - but reducing them to a scant primer is equally counter productive. Kudpung กุดผึ้ง ( talk) 13:01, 7 May 2018 (UTC) reply

I have no idea how to do an effective UX design as a collaborative project. Everything I know about design tells me this is folly. ~ Kvng ( talk) 13:54, 7 May 2018 (UTC) reply

For what it's worth, a good chunk of my career is in UX, so I'm happy to help in that regard as much as you'd like me to. I'm not at all opposed to Kudpung's suggestion of an A/B test. Quite obviously, my professional experience with UX is based around making tricky processes as easy as possible to understand and navigate. I understand there's some views that this isn't entirely a good thing, as Kud mentioned, which I get, but putting up intentional (punitive?) hurdles doesn't feel right either. As far as pieces that I "removed", I'd gladly like to see those back in, if people feel that's the way to go. Although, and it seemed during the RfC people agreed, that perhaps we take the cleaner version and "re-add" the missing pieces as opposed to going back to a modified version of the original. All that being said, I'm happy to do an A/B test, if there's consensus for it. Drewmutt (^ᴥ^) talk 16:55, 7 May 2018 (UTC) reply

I'm still being misunderstood: I'm not talking about putting up punitive hurdles. I'm suggesting applying some cognitive science which I was researching a decade before Don Norman invented the term UX. It starts off with things as simple but as effective as a DON'T WALK / WALK sign at a pedestrian crossing (crosswalk): one is a prohibition - a command, the other is offering an option to do something if you wish to proceed. Neither however, is punitive. Some crossroads have lights that display a straight standing red person and a green person in a walking posture. A B testing might not reveal which works best, but it's a safe bet that in a multilingual society, the icons might be the better choice. In a multi cultural environment however, there might be an element of people who will stand for ten minutes at the red light on a straight desert road completely devoid of traffic patiently waiting for the sign to go green, while there are others who even in a busy city centre will disregard the signs and do whatever they want anyway even if they risk a fine for jaywalking. Sorry about the lecture, but on Wikipedia we have both sorts and everything in between and we have to find the right balance. UX is a result of AB testing - otherwise there are no metrics to evaluate it. Kudpung กุดผึ้ง ( talk) 18:47, 7 May 2018 (UTC) reply

Uh, Kudpung, your characterisation of Drewmutt's design being some top down design forced upon the Article Wizard is a bit disingenuous. The design solicited a lot of input, much of which was incorporated into the final design - I speak having had my two cents incorporated into the final product. Looking further, it's interesting that at the initial discussion you made no mention of A/B testing, in fact you said that you "liked it" and concurred with TonyBallioni that Drewmutt's proposal was better than the existing Article Wizard, before going off on a tangent about paid editing and your perennial "the WMF hates us all" line.

Your claim that "a few days later and unanounced, Drewmut deprecated many of the Wizard" [sic] is again particularly interesting given that the design then went through another approval stage- a 20 or so strong Village pump RFC, which resulted in a unanimous decision to adopt the new design- including no less, a support from yourself for the design in your RfC.

Looking at the page history of the old Article Wizard explains a lot. It appears that in September 2017 you attempted to carry out what you are accusing Drewmutt of- a top down, unconsulted redesign, only to have it promptly reverted. I'm sorry that your own top down redesign was reverted Kudpung, but let's not use that as an excuse to attack Drewmutt. jcc ( tea and biscuits) 20:46, 7 May 2018 (UTC) reply

I'm afraid I have little interest in cognitive science or A/B testing as applied to UX. We have a problem with submissions of drafts about non-notable subjects. Is it possible to add a notability test step to the current wizard? Is this one of the things that was removed to streamline the wizard? If so, can we just add it back in? If not, I'm happy to help workshop something. ~ Kvng ( talk) 23:36, 7 May 2018 (UTC) reply

(Reposting because I inadvertently truncated some text) How can we have done AB testing before the ACTRIAL was implemented? I was fully aware at the time that the changes I made to the Wizard were so minor they could have been adopted without an RfC. They were as much as I dared to do and almost imperceptible but it was as much as I was prepared to do in the available time. When I voted on the RfC I thought that we were voting on the layout that had been made in harmony with the WP:LANDING design, and I was impressed that someone had thought on the same lines as I had and considered, for example, the nav tabs (which I hated but hadn't removed) to be superfluous. TonyBallioni is almost always right about most things and I assumed he had done a more thorough check than I did. Certainly nobody realised that the vast majority of the wizard pages and a lot of its valuable information had been axed. Several users remarked on the use of colloquial language. The irony is that I would not even be commenting here had the AfC people not complained in the ACTRIAL RfC that ACTRIAL was increasing the number of drafts and putting a strain on their resources. I have a theory that the current iteration is partly to blame for that increase and it should be obvious to anyone who has properly read and understood my comments above that AB testing is the only way to prove or disprove it.

I have always concurred that Andrew's version was a vast improvement over any previous iteration and I welcomed the initiative. I would be happy to work for up to 30 hours or so with someone to create a version to be used in the test, but if nobody is interested, then so be it, but let's not hear any more complaints about backlogs at AfC or that Andrew's version won't work on mobile phones. Kudpung กุดผึ้ง ( talk) 01:17, 8 May 2018 (UTC) reply

Kvng, a better explanation of notability, or more accurately, referencing as a prerequisite for articles, is crucial to avoiding many drafts being declined whether they are dross or articles with at least some potential. These lacunae have almost certainly contributed to the increase in drafts, but I seriously consider that what needs to be done is more than simply putting back one or two phrases somewhere; it requires a few conditional leads to other information that was not reused. I think it needs a bit of teamwork, but not that every single change needs a month-long RfC to approve it. Do some changes, run them for a month, then compare the stats. That shouldn't be so difficult. Kudpung กุดผึ้ง ( talk) 01:26, 8 May 2018 (UTC) reply

I just took a look at the Wikipedia:Article Wizard and I was struck by the absence of what I treat as routine whenever I advise new editors on their first article: the preamble. The Wizard starts

Before starting the process of creating an article, you can get the hang of things by first editing in your sandbox.

whereas I always try to start with

Before starting the process of creating an article, you should first collect and read as many good sources on the topic as you can find.

That allows me to explain that you need to find several independent reliable sources before you do anything else. By making proper sourcing the very first thing the editor is guided to consider, I try to get them onto the right track before they even press a key in the editing window. I know that won't make much impression on the UPEs, teenage autobiographers and zealots, who already have their mind made up, but I really feel it might be a big help to good-faith new editors who just need to learn how to write an encyclopedic article. Cheers -- RexxS ( talk) 01:53, 8 May 2018 (UTC) reply

The second step of the wizard is dedicated to sourcing. Whether that should be first or second is something we can discuss. Discussion of notability is almost entirely missing and I think that is a much greater deficiency that we need to focus on. ~ Kvng ( talk) 13:24, 8 May 2018 (UTC) reply

The three most important foundations of notability are sourcing, sourcing, and sourcing. If you don't instill in the new editor a recognition that sourcing precedes content, then trying to get them to understand notability is a very uphill task. If you guide them to find "several independent reliable sources before [they] do anything else" as I stated above, you're 90% of the way to meeting N without even having to mention the word. -- RexxS ( talk) 17:12, 8 May 2018 (UTC) reply

Authors submitting stuff to AfC are not researchers and this is a steep learning curve to put them on. Their main concern is, "Why was my draft rejected?" The wizard needs to meet them at this level. The learning curve up to the WP:GOLDENRULE is not as steep as the one up to a researcher fluent with sourcing. ~ Kvng ( talk) 23:20, 11 May 2018 (UTC) reply

Kvng, are you suggesting that most new users don't have a clue about what sourcing means? You may of course be right but surely everyone has seen a book, a journal, or a document that has footnotes - everyone has had to fill out an application form for a passport or a driving licence or do an MCQ exam. The basic concept is not so hard to grasp, even if they find our Wiki markup a challenge. I think anyone coming to Wikipedia should already expect it not to be as easy to edit as Facebook or Twitter. One problem is that the Wizard links the user to WP:REFB which while supposed to be an 'easy' guide, is another Wikitypical wall of text. Kudpung กุดผึ้ง ( talk) 03:56, 15 May 2018 (UTC) reply

@ Kudpung: new editors have trouble differentiating reliable and unreliable sources. Telling them they have to find and use reliable sources doesn't necessarily get them to where we need to be. Also telling them that Wikipedia only accepts articles on subjects that have established notability by virtue of coverage by reputable publications will probably make things click more quickly for them. ~ Kvng ( talk) 13:40, 15 May 2018 (UTC) reply

I'm glad to see the direction this conversation has moved in because helping editors establish notability through reliable sources, without throwing them into the deep end of reading Wiki policy feels like exactly what the wizard should do. I would just add that I think there would be value in us becoming comfortable with the idea of more A/B testing. In this way people might be more willing to reach consensus to try a change knowing that if it didn't work it wouldn't be permanent or would need a difficult new consensus process to change (or change back). I think it's actually a way of addressing how to do good UX in a collaborative project. Best, Barkeep49 ( talk) 17:57, 16 May 2018 (UTC) reply

Update 2018-05-07

Hi all -- following the update above on 2018-04-25 and the ensuing discussion, the Community Tech team has begun to plan out this project in earnest. I've reworked the project page to summarize the plan that this group developed here. I used the product management construct of " user stories" to detail out our current plans, which I hope are a clear way to convey them. Please read the updated page over and comment if anything looks amiss, or does not reflect the consensus from this talk page. And for those of you who are interested in following along from the technical perspective, this is the main Phabricator task that Kaldari's team will be working from, which has linked subtasks.

We definitely want to work closely with this community to make sure we get this right, so I'll be posting frequent updates and mockups to get your input. Thanks for working with us so far, and I'm looking forward to the next several weeks.

-- MMiller (WMF) ( talk) 21:42, 7 May 2018 (UTC) reply

Topic filtering

I noticed some people above were talking about filtering drafts by topic or subject. I would suggest that the new draft list also have some sort of way to filter by topic, as people seem to find that feature of toollabs:apersonbot/pending-subs useful. I implemented it with a user-generated map from infoboxes to wikiprojects, seen here. The tool also allows drafts to be sorted by a number of criteria, which were copied from the big list at AFC/S mentioned above. Enterprisey ( talk!) 21:43, 9 May 2018 (UTC) reply
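As a rough illustration of the mapping idea described above (not the code behind toollabs:apersonbot/pending-subs, which is not shown on this page), a topic filter could look up each draft's infobox in a hand-maintained table. The table entries, the regular expression and the function names below are all hypothetical.

```python
import re

# Hypothetical sample of a user-generated infobox -> WikiProject map.
INFOBOX_TO_WIKIPROJECT = {
    "Infobox person": "WikiProject Biography",
    "Infobox company": "WikiProject Companies",
    "Infobox album": "WikiProject Albums",
}

# Matches the template name right after "{{", e.g. "{{Infobox person".
INFOBOX_RE = re.compile(r"\{\{\s*(Infobox[^|}\n]*)", re.IGNORECASE)


def topics_for_draft(wikitext, mapping=INFOBOX_TO_WIKIPROJECT):
    """Return the WikiProjects suggested by the infoboxes in a draft."""
    topics = []
    for match in INFOBOX_RE.finditer(wikitext):
        # Normalise underscores, spacing and first-letter case.
        name = " ".join(match.group(1).replace("_", " ").split())
        name = name[:1].upper() + name[1:]
        project = mapping.get(name)
        if project and project not in topics:
            topics.append(project)
    return topics


def filter_by_topic(drafts, topic):
    """drafts maps draft title -> wikitext; return the matching titles."""
    return [title for title, text in drafts.items()
            if topic in topics_for_draft(text)]
```

Drafts without an infobox simply get no topic under this approach, which is one reason a sketch like this would only ever be a partial filter.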

@ Enterprisey: thanks for pointing out where the topics come from in your tool -- I was wondering that. I think the group was excited about filtering drafts by topic, but decided that copyvio and quality filters were more important in the near term. I do hope we can revisit this idea in the future, though. I'm also interested in any other reflections you have on the conversation on this page. -- MMiller (WMF) ( talk) 00:33, 15 May 2018 (UTC) reply

Comments on latest update by Legacypac

In addition to Copyvio, we are looking for PROMO/SPAM and attack pages. I review a lot of AfC submissions but very rarely conflict with another reviewer - not a big deal but nice to have. Many AfC submissions come from Userspace (mostly sandboxes). We very quickly move them to Draft unless they are an obvious decline. 50% likely Copyvio? There is color coding built into Earwig - that could offer some guidance. Legacypac ( talk) 17:06, 11 May 2018 (UTC) reply

@ Legacypac: thanks for the thoughts. I think the ORES " draftquality" model will be able to help reviewers find spam and attack pages, and that's part of the plan. I have a question for you about User space: given that submissions from User space are quickly moved to Draft, do you think it will be sufficient if the New Pages Feed only includes AfC submissions from Draft space, as opposed to also User space? I'm asking to get clarity on which things are and are not most important, as (with any engineering project) we will have to prioritize which of the ideas we will and will not be able to accomplish. -- MMiller (WMF) ( talk) 00:30, 15 May 2018 (UTC) reply

@ Legacypac: just wanted to check back on this specific question about User vs. Draft space. Let me know what you think! -- MMiller (WMF) ( talk) 23:13, 25 May 2018 (UTC) reply

@ MMiller (WMF): we have a sub-category to watch Category:Pending_AfC_submissions for userspace drafts so no. Legacypac ( talk) 00:00, 26 May 2018 (UTC) reply

Comments by Kudpung

These are my opinions only. If we are to be workshopping this project here, I suggest each participating user make their comments, at least to start with, in their own section. Espresso Addict, Primefac, Kvng, Robert McClenon, Vexations, Enterprisey, SmokeyJoe, Doc James, MMiller (WMF), Gryllida, and Rentier

"sort by random" option so that reviewers can choose random pages: Probably not so important.
Allow the feed to list both drafts that are and are not submitted for review. Remove submitted drafts from the list (as is done for 'patrolled' new pages). When drafts are rejected by a reviewer, they should show in the feed again. This is going to require some interaction between the HS templates and the feed. Helper Script editor(s) please comment.
Reviewer selects whether they are doing New Page Review or Articles for Creation, which would automatically set certain settings, and make irrelevant ones disappear. Very important. DGG please comment. See image.
Surfacing straightforward checkboxes for all the data elements that reviewers can choose, i.e. all four draftquality categories and all six wp10 (quality assessment) categories. I am not sure what is meant by this. The view of the list that AfC reviewers get should not be more complex than what the NPP reviewers see. We don't want to kill off interest in using the feed by making it so cluttered that reviewers can't see the wood for the trees. There is an important list of required fixes for the feed at WP:PCSI. Not all of these are high priority but most of them are. WMF please take a moment to go through the list and see if any of the current suggestions overlap - some of those requests are now 2 years old and still urgent. Ryan?
Phrasing the options more like sentences, e.g. "Only those drafts likely to be vandalism". Since ACREQ pure vandalism pages are now rare. Legacypac please comment on the frequency of vandalism pages at AfC.

Rare as far as I know. Blank is more common. See Category:Declined AfC submissions Legacypac ( talk) 04:46, 12 May 2018 (UTC) reply

"Newest" and "Oldest": add dropdown box for sorting in the Special:NewPagesFeed. This *is needed for NPP anyway.
We do not think reviewers will need to sort by numeric ORES scores after filtering to an ORES category. Nor do I.

Listing

Use the API of Earwig's Copyvio Detector, High Priority. Should be fully automated. Should display, eg. 'CV99%' in the feed and should attach a COPYVIO template to the article and automatically send a warning to the creator.
Keep drafts under review out of the New Pages Feed upon load/refresh. (We won't be able to make them dynamically disappear from your screen when someone else puts them under review after you have loaded the page.). As per current usage at NPP.
Blue "review" button should behave for AfC. (This button does the same thing as clicking on the page title.) Button function is fine. Consider gaining real estate in the feed entries by making it smaller.
What to do with the icons on the left for AfC (e.g., the trash can icon for pages that have been nominated for deletion). Make icons that point out copyvio violation or different ORES categories. See above.
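The 'CV99%' style of feed label suggested in this list could be produced along these lines. This is a minimal sketch only: it assumes the copyvio check yields a confidence between 0 and 1, it does not call Earwig's Copyvio Detector or any real API, and the function name and the 50% cut-off (borrowed from Legacypac's question above) are placeholders.

```python
def copyvio_label(confidence, threshold=0.5):
    """Return a short feed label such as 'CV99%', or None below threshold.

    confidence: likelihood of a copyright violation, from 0.0 to 1.0
                (assumed input; how it is obtained is out of scope here).
    threshold:  hypothetical cut-off below which nothing is shown.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < threshold:
        return None
    return "CV{}%".format(round(confidence * 100))


# Example feed entries (made-up scores).
for title, score in [("Draft:Example A", 0.99), ("Draft:Example B", 0.20)]:
    print(title, copyvio_label(score) or "")
```

Showing only a compact flag keeps the feed uncluttered and leaves the decision about what to do with a flagged draft to the reviewer.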

Other

Refer to the two ORES models with more descriptive names than "draftquality" and "wp10". Not sure that this is necessary beyond the basic numeric alerts for copyvio and ORES. We don't have this for NPP and I don't see it being a further help. The important thing is to get reviewers to open and look at the page rather than the list for low hanging fruit.
We will look into the technical feasibility of rescoring articles with ORES and copyvio on every edit, but will need to keep an eye on whether this overloads their respective APIs. I don't think this is necessary.

@ Kudpung: what do you think would work well in terms of how often (or in what circumstances) to refresh these scores? For instance, if a reviewer (in either NPP or AfC) notices a page that contains copyright violations, is it usually the case that the page is nominated for speedy deletion, or is it sometimes that the violating text is removed? In the latter case, I would imagine that we want to rescore it for copyvio so that another reviewer doesn't open a page because of its now-obsolete high score. -- MMiller (WMF) ( talk) 00:38, 15 May 2018 (UTC) reply

Hi Marshall. We've had this discussion before at Reply to Kudpung from Primefac below. I think we're trying to achieve something more complex than is needed or as we say, a solution looking for a problem. There are no bright lines for COPYVIO treatment because as has already been mentioned, a high score could be returned due to long names and quotes, and lists. The most important thing of all is to flag articles where duplications are detected. The rest is up to the reviewer what she or he does with them and that's a case for local education as explained by RexxS. This is why CorenBot's template was so important, it forced the reviewer to take a look, but it already listed the URLs of the offending source sites and tagged the article. That's the process I'm personally wishing to win back and the one I found most helpful. Kudpung กุดผึ้ง ( talk) 03:38, 15 May 2018 (UTC) reply

Open questions for the community

1. How common it is for drafts to get created in User space, as opposed to Draft space. 2. Will AfC reviewers commonly need to review in the User space? 1. Legacypac, ICPH, do we know? 2. Reviewing user space sub-pages is already one of the filter options in the New Pages Feed. AfC reviewers are not obliged to review user space. See also [[WP:User pages]].

Hard to know exactly as we quickly move most to Draftspace and don't track this but 10-50 a day maybe? I often see 5 or 6 waiting and by time I process them there are a couple more waiting. Tracked here: Category:Pending AfC submissions along with ones in Article space (often creator moves) Legacypac ( talk) 04:46, 12 May 2018 (UTC) reply

Likelihood that AfC reviewers attempt to review the same drafts at the same time. Probably unlikely. Never (or extremely rarely) happens at NPP. Occasionally two admins will be trying to delete the same page at the same time, but again rare.
The explanatory text (and any related references on other pages) that shows at the top of the New Pages Feed will likely need to be updated to reflect the additional future uses of the feed. Best left until later. We edit this notice ourselves for NPP. Instruction creep is always a problem with Wikipedia. Better to make a better tutorial for AfC and get the reviewers better educated. The low quality and inconsistency of reviewing at AfC, together with the low number of active reviewers, is what raised all these issues in the first place - not the backlog as claimed by the AfC reviewers on the ACTRIAL debate. RexxS please comment.
1. Is there an existing sense of what copyvio score a reviewer would consider to be high enough to investigate? 2. If there is a clear threshold, we may be able to include that as a quick filter in the New Pages Feed, in addition to allowing it to be sorted by score. 1. Anything over 15% needs some copied text removing. Below that it's usually long names etc that are the same in the original and the Wiki page. (Occasionally quotes). 35 - 40% tend to become critical. I usually delete and salt pages that have 50% or more clear, deliberate plagiarism. According to the rules, all COPYVIO should be removed. When doing that leaves only a stub, IMO the page is best deleted. Color codes not necessary (puts too much code in the API).

Kudpung กุดผึ้ง ( talk) 22:43, 11 May 2018 (UTC) reply
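As a rough illustration only, the rule-of-thumb figures in the answer above can be written down as triage bands. This is a sketch in Python under stated assumptions: the function name and band labels are invented for illustration, the gaps between the quoted figures (15% to 35%, and 40% to 50%) are filled in arbitrarily, and the result is a prompt for a reviewer to look at the page, not a decision about it.

```python
def copyvio_band(score_percent: float) -> str:
    """Map a similarity percentage to a rough triage band.

    The cut-offs (15, 35, 50) come from one reviewer's rule of thumb in this
    thread; the band names and the handling of in-between values are
    assumptions made for this sketch.
    """
    if not 0 <= score_percent <= 100:
        raise ValueError("score must be between 0 and 100")
    if score_percent >= 50:
        return "very high: check whether this is deliberate plagiarism"
    if score_percent >= 35:
        return "critical: investigate before anything else"
    if score_percent > 15:
        return "elevated: some copied text probably needs removing"
    return "low: usually long names, lists or quotes"


for score in (8, 20, 38, 93):
    print(score, "->", copyvio_band(score))
```

A high band here still needs a human check: a list-heavy draft can land in the top band without containing any copyright violation.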

Also, please include Wikipedia:Page_Curation/Suggested_improvements#11._Patrolled_by_Twinkle in this upgrade because those are the pages that have been tagged by non rights holders and should come under special scrutiny. Kudpung กุดผึ้ง ( talk) 23:25, 11 May 2018 (UTC) reply

Reply to Kudpung from Primefac

A high CV% does not automatically mean it's copyvio. I've deleted pages marked 19% and kept pages marked over 90%. Sure, they're the exception rather than the rule, but unlike some admins we should use the CV% as a starting point, not an automatic "it's above X% so we'll mark it for deletion". Hell, when SQL first put up the ORES/CV table the second draft on the list was marked at something like 93% simply because of a huge list of publications.

If you want to have a CV% sorting option, that's fine, but it's unnecessary to do any more than sort by the total %. If you start throwing around automated templates all you'll end up with is page creators gaming the system (since they'll be able to "test" their paraphrasing) and you'll get admins who don't bother to actually check if the damn thing really is a copyright deleting the page while admins like myself who actually do investigate get pissed off at the number of false positives.

In general, EVERY page needs to be checked for copyright violations. We should not be marking drafts as "checked" because that leads to complacency, but it would be helpful if there was a way to flag a page (or maybe even a specific diff) as a false positive so that it's not being continually nominated/checked/etc for having a high %. Primefac ( talk) 02:40, 12 May 2018 (UTC) reply

Is there an existing sense of what copyvio score a reviewer would consider to be high enough to investigate? 2. If there is a clear threshold, we may be able to include that as a quick filter in the New Pages Feed, in addition to allowing it to be sorted by score. Nobody is talking about introducing any rules for what CV% are cut-off points for reviewer/admin action. What we are discussing are the parameters for an ORES display. CorenBot did an excellent job during the years it was in action, and IMO that's really all that's needed. I will never understand the resistance to rebuilding it, and wanting to make CV identification so computer-intellectually complicated. At the end of the day, reviewers and admins use their discretion on what is to be done with COPYVIOS, and that is not the discussion here. All we want the software to do is encourage reviewers to check for COPYVIO. Copypatrol can't do it, although Diannaa's team do an excellent job isolating COPYVIO additions to existing articles. Kudpung กุดผึ้ง ( talk) 05:07, 12 May 2018 (UTC) reply

CV% is not copyvio. It's a measure of similarity against what a particular search strategy finds that is useful as a first step for determining the likelihood of actual copyvio. There is no critical value. Articles that are primarily lists will always have a high CV, but that does not mean copyvio if they cannot be expressed otherwise. Articles with a low CV that read as if they were taken from a website need a manual check with knowledge of where they are to be expected in the particular case. There is a useful tag {{copypaste}} that serves as a warning. Since we don't actually tag drafts, it can be used in essence as a comment. The various automated functions are only rough guides that help in determining priorities and as a first step in suggesting analysis. DGG ( talk ) 05:53, 12 May 2018 (UTC) reply

The various automated functions are only rough guides that help in determining priorities and as a first step in suggesting analysis. Precisely. Thank you for saying it so succinctly, DGG. Kudpung กุดผึ้ง ( talk) 08:06, 12 May 2018 (UTC) reply

I have no opposition to having something like Corenbot resurfacing, and I apologize if I misinterpreted your original statements (I need to remember not to comment at stupid hours of the day); I thought you meant tagging pages with G12 automatically if it was above a certain %, or leaving a "this has no copyright" notice that a few reviewers had at one point been leaving. Primefac ( talk) 13:12, 12 May 2018 (UTC) reply

Comments by Kvng

My understanding is that the WMF guys ( MMiller (WMF) specifically) have read through and understand what is needed and have an idea for how to get there and my hope is they're now off and implementing something and in a little bit we're going to see a preview of it and we'll request some tweaks and maybe we'll do that again and then we'll be at the point where we have something useful and others will start using it and maybe we'll make some more changes and then even more will start using it and everyone will get used to how it works and if we try to make any more changes people will freak out. J/K on that last part, sort of. Anyway, this is the Spiral model of development. It works well for projects like this where most stake holders can't give great feedback until they have something they can touch and feel. ~ Kvng ( talk) 23:34, 11 May 2018 (UTC) reply

Comments by RexxS

Just a reply to Kudpung's request for now, although I'll revisit this when I have more time.

The idea of a tutorial is good, but making a useful tutorial is a long process of creation, trial, feedback, and refinement. You will need either a very dedicated driver or a small team to craft something worthwhile. Even then, most people will only be able to learn the background and underpinning knowledge from a tutorial. It's best to concentrate on delivering the most important items of information, aids to memory, and summaries, in digestible chunks with links to more expansive reading. Somewhere, reviewers will still have to exercise what they have learned in order to refine their skills. Perhaps you might consider a multi-stage tutorial: (1) First steps; (2) Review; (3) Sharing experience. Or something like that. Between each step, have reviewers do some reviewing with the objective of putting into practice what they have learned. You'll probably need either a 'mentor' or a place where reviewers can share their experiences and learn from each other (or both). I'll always be happy to offer assistance where I can. -- RexxS ( talk) 10:56, 12 May 2018 (UTC) reply

Comments by Vexations

A few random thoughts, all strictly my personal views and not reflective of any kind of consensus:

User space drafts should not be subject to review for notability and quality and remain largely protected from deletion, with the exception of copyright violations and attack pages. Unsubmitted user space drafts should not be listed. When I move an article to user/draft space, it is to give the editor a break from the threat of deletion and let them improve the article to the point where it can "survive" a review. If we expand our reviewing to those spaces, editors will not have a quiet place to work on an article.
I also don't think unsubmitted drafts should be reviewed unless there has been a substantial change since a previous review, AND there is a serious issue that needs to be addressed, like a copyvio, a BLP problem or an attack page. Editors should be given the opportunity and time to work on their drafts mostly undisturbed.
I have some doubts about the usefulness of scoring articles by the likelihood that they are vandalism. If I were a vandal, I wouldn't try to submit my vandalism through AfC. I don't think that there is much vandalism that AfC has to deal with. What would be really useful is sorting out the filters so that they allow for more specific targeting. I'd like to show, for example, ONLY articles that have no in-line citations or only those articles that meet a combination of criteria, like WP:PETSCAN lets me do.
WRT "Phrasing the options more like sentences", I much prefer that the software tells me exactly what it is going to do rather than paraphrase. Similarly, please DO use the exact terms "draftquality" and "wp10" because they have a well-defined meaning. You're writing software for experts, not the general public. People who don't know what those terms mean should not be reviewing.
Review conflicts are exceedingly rare and not a priority. It happens to me sometimes at NPP and is almost never a real problem.

Vexations ( talk) 01:46, 13 May 2018 (UTC) reply

New comments by Kudpung

The way the feed toggles between NPP and AfC looks fine.

I don't see how sorting by quality is helpful. I don't see how using ORES capacity to generate this information is useful. Newly submitted drafts or new pages are never at a quality that is higher than 'start class'. Some new pages are complete articles and at first blush appear to be immaculate but they would never be C, B, A, GA, or FA ; these are the pages that are dangerous because they are mostly the advocacy and/or paid editing attempts. The same applies to the way the information is displayed in the feed entries.

I don't see how sorting by COPYVIO value is helpful. I don't see how using ORES capacity to generate this information is useful. Newly submitted drafts or new pages either have COPYVIOS or they don't, and simply flagging an article as CV should be enough to force any reviewer to take a closer look. Not all COPYVIO is flagrant abuse of the rules - a lot of it is made by good faith users who don't understand what copyright is. If anything, all we need is a red alert if almost the entire page has been pasted from another source, and a low alert for minor plagiarisms. The same applies to the way the information is displayed in the feed entries. I'll mention again however, that CopyPatrol is not part of AfC and NPP and never will be - nothing can replace the valuable features of CorenBot, and if it is not possible to recreate it as a feature of this MediaWiki extension, then at this stage of evolution of Wikipedia we should be looking at obtaining a grant for a couple of community volunteers to develop it as a local script.

In all, the preferences for filter selection can be much less granular.

In anticipation of images from the WMF, I had already made a mockup of what would serve my purposes. As a patroller, it's all I need, and it's not cluttered. Sorry to keep harping on this but I think some of the design resources should be urgently allocated to our list at Wikipedia:Page Curation/Suggested improvements among which many are useful to both AfC and NPR and which have been waiting for attention for almost 2 years. Kudpung กุดผึ้ง ( talk) 10:23, 18 May 2018 (UTC) @ MMiller (WMF):. Kudpung กุดผึ้ง ( talk) 10:36, 18 May 2018 (UTC) reply

@ Kudpung: thanks for taking a look and for making that mockup. I have some comments and questions for you.

Regarding the sorting of quality, it seems to me that it is relatively common for draft pages to appear to be above Start class. The reason I think this is because of User:SQL's page, in which the ORES and Copyvio scores for all submitted drafts are listed. Of the approximately 1,000 drafts currently scored there, the distribution of ORES predicted classes is as follows:


Predicted class	Count	Percent
FA	2	0.2%
GA	7	0.7%
B	31	3.2%
C	275	28.2%
Start	519	53.2%
Stub	141	14.5%

That means that ORES is predicting about a third of the drafts to be above Start class (C, B, GA, and FA). Taking a look at a few drafts predicted to be C-class ( here, here, and here), those drafts do seem to contain more content and better organization than Start or Stub pages. At the same time, I see your point about highly developed drafts potentially being more likely to be copyvio or promotional pages. I think Legacypac was using SQL's page, so maybe there is something we can learn from that usage.
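For anyone who wants to check the "about a third" figure, the arithmetic can be reproduced from the counts in the table. This is a minimal Python sketch; the variable and function names are made up for illustration, and only the six counts come from the table above.

```python
# Counts copied from the table of ORES predicted classes.
counts = {"FA": 2, "GA": 7, "B": 31, "C": 275, "Start": 519, "Stub": 141}

total = sum(counts.values())  # 975 drafts in the table


def share(classes):
    """Percentage of all scored drafts that fall in the given classes."""
    return 100 * sum(counts[c] for c in classes) / total


for name, count in counts.items():
    print(f"{name}\t{count}\t{share([name]):.1f}%")

above_start = share(["FA", "GA", "B", "C"])
print(f"Above Start class: {above_start:.1f}%")  # 315 of 975 drafts
```

The per-class percentages match the table, and the drafts predicted as C or better come to a little under a third of the total.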

ORES also predicts that 9% of the pages are "spam" (like this one). Could NPP use this to filter to pages that need to be cleared out first?

Regarding your thoughts on copyvio and your mockup, it sounds like you're saying that the high granularity of a numeric copyvio score is not necessary. It's more that reviewers should be able to find those pages that are likely to contain copyvio. That helps -- I'm interested in others' thoughts on this front.

And then in terms of " concept A" and " concept B", it sounds like you prefer the thinking around "concept B", in that it presents a specific set of recommended filters that are useful for reviewers, instead of the flexibility of "concept A" in which reviewers can assemble their own idiosyncratic filter combinations. Is that right?

I also have some follow-up questions for you:

When you are checking for copyvio while patrolling new pages, do you do it by looking up each page in Earwig's tool? Or do you do something different?

I don't do this systematically for every page, no. I rely largely on a long experience to identify what is likely to be a COPYVIO. I usually paste snippets straight into Google because it's faster than using Earwig. I do get a lot of false positives but I'm usually right when there is in fact a copyvio. Some copyvios are almost blatantly obvious, that's when I call up Earwig just to see to what extent.

Do you have any thoughts on the usefulness of Google for copyvio (the default setting on Earwig's tool) vs. Turnitin (an optional setting on Earwig's tool and the only service used by CopyPatrol)?

I don't believe I've used Turnitin.

What do you think of the toggle at the top of our wireframes (the radio button whose options are "New page patrol" and "Articles for Creation")? I noticed that your mockup simply adds drafts to the list of namespaces, whereas our wireframes ask reviewers to choose which type of review they're doing.

It doesn't show on my mock up, but I already said I think the toggle in your wireframe is fine

Do you have thoughts on the idea of randomly selecting a page for review? Is that a need that exists in New Page Patrol?

Personally I don't think it's necessary for NPP, but it might be useful for the AfC reviewers. You would need feedback from them.

-- MMiller (WMF) ( talk) 00:13, 24 May 2018 (UTC) reply

MMiller (WMF), see my answers in green in your post. Kudpung กุดผึ้ง ( talk) 01:48, 25 May 2018 (UTC) reply

MMiller (WMF), A couple more comments:

Performing additional re-checks for copyvio on the material in the queue is not necessary. Besides which, they use up a quota system if I have understood Diannaa correctly.
Using resources to already class articles as 'start', 'C', 'B' class, etc is not useful, they are not criteria for inclusion and we are concerned with them here. Every WikiProject uses these criteria and does it themselves. It's not a function for AfC or NPP, nor can I see it being of interest to reviewers. It would probably confuse them by causing them to wonder what they are supposed to do with that information. We need to avoid using ORES as a solution looking for a problem. What the New Pages Feed and its Curation tool urgently need are practical upgrades that will streamline the workflow of the reviewers. We must not forget that everyone over here is a volunteer, they get no thanks or compensation whatsoever for their work. The easier we make it for them, the more they will do. Kudpung กุดผึ้ง ( talk) 04:18, 25 May 2018 (UTC) reply

Asking for thoughts on the update from 2018-05-17

Hi all -- I'm just posting here to ask everyone to check out my update on the project page from 2018-05-17.

Here's the short version:

I posted wireframes of some different ways we're imagining this change to the New Pages Feed could look. We're looking for reactions to these wireframes, and in particular between "concept A" and "concept B".

AfC list

AfC menus: concept A

AfC menus: concept B

NPP menus: concept A

Implementing copyvio scoring for pages in the feed is turning out to present a series of technical challenges. We have different approaches we could take, and we're looking for thoughts from people who have experience with copyvio.

Please check out the project page for all the details. Thank you! -- MMiller (WMF) ( talk) 00:19, 24 May 2018 (UTC) reply

I'd rather have more control in the hands of reviewers, and I like the ability to set a Copyvio score yourself, Concept A from me. Though 'quality' needs to be called 'ORES quality' or 'Assessed quality' or something so that users don't think that these are user assessments. — Insertcleverphrasehere (or here) 01:41, 24 May 2018 (UTC) reply

If the system ran a copyvio check once when it sees a draft submitted to AfC and displayed that, it would be enough. Few drafts get much work done after submission, until rejection. This would burn more credits because we don't necessarily check copyvio on pages before declining them, but would help us CSD copyvio instead of declining. For acceptable pages we can skip running copyvio if the score is low so that would not burn any extra credits. I see no reason to run copyvio on drafts with every edit or large edit. For example if I decline the page, running copyvio on my canned decline is pointless.

Knowing a page machine scores above C is useless as it's not true. Knowing the machine score will help us add the score on acceptance. Clustering the attack, spam, etc together is very fine.

One way to save copyvio check credits might be to exempt all edits by certain rights holders from automated checks as they are low risk. Hopefully a NPR or Admin or Autopatrolled user is not adding copyvio. One might also exempt editors with more than X edits? Legacypac ( talk) 23:02, 24 May 2018 (UTC) reply

A few points: (1) The Copypatrol system is already checking all additions over a certain size threshold for all content added to draft space and article space; there's no need for a separate duplicate check. That's a waste of finite resources. See my post below for more information. (2) Patrollers should be checking all drafts for copyvio (using Earwig's tool or by googling any suspicious-looking segments of prose) whether you plan on declining the draft or not, because declined drafts remain on Wikipedia for a minimum of 6 months. (3) The Copypatrol system has a whitelist already in place of non-violators who repeatedly trigger false positives for various reasons. — Diannaa 🍁 ( talk) 00:35, 25 May 2018 (UTC) reply

Diannaa, to 2: Reviewers will almost certainly not systematically wait the 20-30 seconds for Earwig to do its work. This is why I keep calling for automatic templating as was done by Coren; since the implementation of ACREQ there are far fewer new pages to patrol and as our focus is now very much on spam, undeclared paid editing and COI, that's what's on the increase and the ones that often include chunks of corporate websites or private bios and CVs. Drafts only remain for 6 months if they have not received productive edits in that time. Kudpung กุดผึ้ง ( talk) 02:05, 25 May 2018 (UTC) reply

@ Legacypac: thanks for your thoughts here. I'd like to make sure I understand something that you wrote. I'm glad that flagging drafts with whether they are likely spam/vandalism/attack will be useful. But I'm not sure what you meant about knowing the likely class (B-class, C-class, Start, etc)? Would you use that to prioritize higher scoring drafts for review? -- MMiller (WMF) ( talk) 23:59, 25 May 2018 (UTC) reply

Thanks User:MMiller (WMF) I tend to select drafts based on age and am confident enough to cover 95% of topics. For me the #1 question is notability and no computer can judge that. I've seen drafts that might score C on a band that fails WP:NMUSIC and stubs on Order of Canada winners that meet WP:ANYBIO. So to me the score is only useful for rating the draft if I accept it. I would prioritize deletion of spam/attack etc if I could easily see that in the feed. Legacypac ( talk) 00:12, 26 May 2018 (UTC) reply

Comments by Diannaa

The Copypatrol system originated in January 2015 when we received 1 million credits from Turnitin. These credits were exhausted in January 2018, at which point Turnitin provisionally gave us some additional credits (I don't know how many; see phab:T185163). What this means is that normal operation of the Copypatrol system depleted our stock of credits by circa 900 per day over the three year period 2015-2018. Meanwhile, Earwig's search engine tool uses Google search; we are allotted 10,000 searches per day, and our normal usage is over 5,000 searches per day. We've asked for a higher quota from Google, but we are already at the highest available quota. So as you can see these resources are finite. Meanwhile there's currently 4500 articles in the queue at New Pages Feed and around 1200 drafts (around 5700 items total). Adding a search of 5700 items to the normal course of daily business would immediately put us over the daily quota and Earwig's search engine tool would not be available for the remainder of the day (i.e., until midnight Pacific time). Likewise 5700 daily searches when added to our daily norm of 900 at Turnitin would exhaust a million credits in under six months.
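The quota arithmetic in this comment can be checked with a few lines of Python. This is only a sketch of the figures as stated: the exact start and end days in January are not given, so the first of each month is assumed, "over 5,000" Google searches is taken at its lower bound, and the 5,700 queued items are treated as an extra daily load in the way the comment describes. All names are made up for illustration.

```python
from datetime import date

# Turnitin: 1 million credits lasted from January 2015 to January 2018.
TURNITIN_CREDITS = 1_000_000
period_days = (date(2018, 1, 1) - date(2015, 1, 1)).days  # assumed dates; 1096 days
credits_per_day = TURNITIN_CREDITS / period_days          # about 912, i.e. "circa 900"

# Google (used by Earwig's tool): daily quota and normal usage.
GOOGLE_DAILY_QUOTA = 10_000
GOOGLE_NORMAL_USAGE = 5_000   # stated as "over 5,000", so this is a lower bound

# Items in the New Pages Feed queue: articles plus drafts.
queue_items = 4_500 + 1_200   # 5,700

google_with_queue = GOOGLE_NORMAL_USAGE + queue_items     # 10,700 searches
over_google_quota = google_with_queue > GOOGLE_DAILY_QUOTA

turnitin_with_queue = 900 + queue_items                   # 6,600 credits per day
days_to_exhaust = TURNITIN_CREDITS / turnitin_with_queue  # about 151 days

print(f"Turnitin credits used per day, 2015-2018: {credits_per_day:.0f}")
print(f"Google searches per day with queue added: {google_with_queue}")
print(f"Over the Google daily quota: {over_google_quota}")
print(f"Days until a million Turnitin credits run out: {days_to_exhaust:.0f}")
```

On these assumptions a million credits would last roughly five months, which agrees with "under six months"; how often the full queue would really be re-searched is a separate question that the thread goes on to discuss.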

How about this: Figure out a way to show on the new pages feed whether or not there's an open report at Copypatrol. Just a reminder, what the Copypatrol bot does is search for copyvio for each addition over a certain size (whether it's a new article creation, an existing article, or a draft). When a Copypatrol item is resolved, it's removed from the list of open items and moved to the archive. If there's a second (or third or fourth or fifth) copyvio added to the same article or draft, a new report is generated. So this system would automatically tell the new page patroller whether or not the current version is suspect and would not use up any of our valuable search engine resources. — Diannaa 🍁 ( talk) 22:41, 24 May 2018 (UTC) reply
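To make the suggestion concrete, here is a minimal Python sketch of how feed entries could be marked with "open report / no open report" without running any new searches. Everything in it is hypothetical: the `Report` record, its fields and the function names are invented for illustration and do not describe how CopyPatrol or the New Pages Feed actually store their data.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Report:
    """Hypothetical stand-in for one copyvio report on a page."""
    page_title: str
    resolved: bool  # True once the item has been moved to the archive


def pages_with_open_reports(reports: Iterable[Report]) -> Set[str]:
    """Titles that still have at least one unresolved report."""
    return {r.page_title for r in reports if not r.resolved}


def annotate_feed(feed_titles: Iterable[str],
                  reports: Iterable[Report]) -> List[Tuple[str, bool]]:
    """Pair each feed entry with a flag: does it have an open report?"""
    open_pages = pages_with_open_reports(reports)
    return [(title, title in open_pages) for title in feed_titles]


reports = [
    Report("Draft:Example A", resolved=True),   # first addition, already handled
    Report("Draft:Example A", resolved=False),  # a later addition, still open
    Report("Example B", resolved=True),
]
print(annotate_feed(["Draft:Example A", "Example B", "Example C"], reports))
```

Because a new report is generated for each later addition, a page is only flagged while something on it is still waiting to be assessed.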

Diannaa, as I have said many times, I greatly regret the passing of CorenSearchBot. Anything else by comparison, IMO, is a palliative. Not being involved in Copypatrol, I never use it. I was never aware with searching for copyvios that the searches are subject to quotas - was CorenBot? The 4500 articles in the queue at New Pages Feed and around 1200 drafts (around 5700 items total) are not daily figures. Once we have motivated the reviewers into clearing the backlog, new articles, as drafts or in mainspace are only in the 100s. Kudpung กุดผึ้ง ( talk) 01:56, 25 May 2018 (UTC) reply

Please refer to the project page Wikipedia:WikiProject Articles for creation/AfC Process Improvement May 2018#Back-end where it says they intend to add Turnitin or Google Search to the New Pages Feed. And in a later paragraph it says they hope to re-score the pages, i.e., perform additional re-checks for copyvio on the material in the queue. That's what I'm objecting to, as we will exhaust our daily quota of Google searches and exhaust our donated block of Turnitin searches as well. — Diannaa 🍁 ( talk) 03:10, 25 May 2018 (UTC) reply

This is why I suggest "score it once when AfC Submitted" and be done. If I see a low score and no red flags I can just accept the page without waiting for earwig. Speed and ensuring every page is checked are the big reasons for including the score. Few editors make big changes while waiting on a review so rescoring drafts wastes tokens. Legacypac ( talk) 03:16, 25 May 2018 (UTC) reply
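The "score it once" idea, together with the earlier suggestion in this thread of exempting certain rights holders, could look roughly like the Python sketch below. It is hypothetical throughout: the class, the method names and the rights labels are invented, and `run_check` stands in for whatever copyvio search ends up being used, so no real API is called.

```python
from typing import Callable, Dict, FrozenSet, Optional

# Assumption: plain labels for the rights named in the thread, not real group identifiers.
EXEMPT_RIGHTS: FrozenSet[str] = frozenset({"npr", "admin", "autopatrolled"})


class ScoreOnceCache:
    """Run the copyvio check once per draft, at submission, and reuse the result."""

    def __init__(self, run_check: Callable[[str], float]) -> None:
        self._run_check = run_check          # stand-in for the real search
        self._scores: Dict[str, float] = {}
        self.credits_used = 0

    def on_submit(self, title: str, text: str,
                  submitter_rights: FrozenSet[str] = frozenset()) -> Optional[float]:
        """Score a submitted draft unless the submitter is exempt or it is already scored."""
        if submitter_rights & EXEMPT_RIGHTS:
            return None                      # low-risk submitter: spend no credit
        if title not in self._scores:
            self._scores[title] = self._run_check(text)
            self.credits_used += 1
        return self._scores[title]

    def on_edit(self, title: str) -> Optional[float]:
        """Later edits never trigger a new search; they only read the stored score."""
        return self._scores.get(title)


cache = ScoreOnceCache(run_check=lambda text: 12.0)  # fake check for the demo
cache.on_submit("Draft:Example", "first version")
cache.on_submit("Draft:Example", "resubmitted version")
print(cache.on_edit("Draft:Example"), cache.credits_used)
```

One simplification to be aware of: a draft resubmitted after a decline keeps its old score in this sketch, which may or may not be what reviewers want.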

Percentage on Earwig's tool is a pretty useless method of detecting copyvio. ~~Turnitin does not assign a score; it's a yes/no; either copyvio is detected or it is not detected.~~ Determining whether or not the report is a false positive and locating the source document depend on the skill of the investigator. That's because Turnitin can detect copying of pages that don't exist any more and are now 404. It also detects copying within Wikipedia, including of old revisions that are not live any more. I see you are still suggesting that people should skip checking for copyvio as a time-saving measure. That's a step that should not be skipped in my opinion regardless of the time involved. It's pretty important. — Diannaa 03:38, 25 May 2018 (UTC) reply

Please don't mistake what I said. I check every draft I'm proposing to send to mainspace with earwig. If I see 3% or 25% and I see no reason to believe it's copyvio anyway I stop investigating for copyvio. Running earwig takes time, while I can't do much else with the page, so having earwig prerun saves time on the low score pages. It's exactly the same check; it just saves me hitting three buttons and twiddling my thumbs. Legacypac ( talk) 04:38, 25 May 2018 (UTC) reply

Diannaa, I'm not really concerned with the technical pros and cons of this or that COPYVIO detection API. I'm concerned with their practical use, speed of application, and keeping backlogs at NPP and AfC down to a minimum, and above all, maintaining the dwindling interest of our Reviewers to keep reviewing. NPP is a thankless task. That said, I find Earwig's percentages quite helpful if not essential. The alternative is for us at NPP to skip checking for COPYVIO altogether and leave it all to Copypatrol, but the problem there is the 48 hour lag where NPP needs that information the minute a new page is created. If your Copyvio programme is using up all the bandwidth or quotas, maybe more should be allocated to NPP - which is the critical gatekeeper, especially at times like these when we have new challenges to face: a lot of COPYVIO is introduced by COI paid editors on spamvertorial masquerading as articles. Perhaps Doc James would like to comment. Kudpung กุดผึ้ง ( talk) 04:02, 25 May 2018 (UTC) reply

Copypatrol does give us "% of edit" that matches a source. [6]

From what I remember we have that tool set so that it only displays those that are over a certain percentage of matching.

So if we could run items as they enter the NPP queue and add the results to both "CopyPatrol" and the NPP queue would that work?

User:Kudpung how many items enter the NPP queue per day? Doc James ( talk · contribs · email) 04:25, 25 May 2018 (UTC) reply

I don't know James, I don't know how to extract that data. Since ACREQ and discounting the Autopatrolled pages, probably not more than 600 at a rough guess. Kudpung กุดผึ้ง ( talk) 04:45, 25 May 2018 (UTC) reply

600 is not a huge number. I bet we could manage that no problem with one or both of the tools. Doc James ( talk · contribs · email) 04:53, 25 May 2018 (UTC) reply

I don't know that any additional copyvios would be detected by doing this, as the edits have already been checked using the Copypatrol system. — Diannaa 🍁 ( talk) 16:55, 25 May 2018 (UTC) reply

I like Diannaa's suggestion of noting whether or not there is an open copypatrol report. It would give those of us who like to do both a way to cross-reference, also, a good reminder for me to get back into doing more copypatrol. TonyBallioni ( talk) 13:20, 25 May 2018 (UTC) reply
A note at Copypatrol when an edit is a new article creation or new draft creation might also be useful, so patrollers could more easily prioritize those reports. — Diannaa 🍁 ( talk) 17:07, 25 May 2018 (UTC) reply

We have a magnificent dashboard for New Page Patrollers. That's where the action should be. Patrollers are not going to flip back and forth between apps for every article they review. Copypatrol may be fine for edits, but it's not integratable for NPP. Kudpung กุดผึ้ง ( talk) 19:20, 25 May 2018 (UTC) reply

I meant copyvio patrollers, not new page patrollers. — Diannaa 🍁 ( talk) 20:26, 25 May 2018 (UTC) reply

Diannaa, thank you for the in-depth thoughts on some of the options here. The idea of using the same query to back both New Pages Feed and CopyPatrol is definitely an interesting one. And thanks to everyone who weighed in on these copyvio questions and on the mockups -- please keep any additional thoughts coming. I posted a couple follow-up questions for people, and now the engineers here have a lot to think about. I'll post another update next week. -- MMiller (WMF) ( talk) 00:10, 26 May 2018 (UTC) reply

Coming back to this...

MMiller (WMF), I have no idea about the technology involved with finding solutions for automatically detecting and signalling COPYVIOs, all I know is what is urgently needed to streamline the work of reviewers at AfC and NPR. I appreciate Diannaa's input but with around 600 non autopatrolled new mainspace pages to check every day, this statement gives me pause as to how the CopyPatrol project can help, bearing in mind they have admitted to a 48 hour latency and that their work is focused on new edits rather than on new pages: "...there are anywhere from 70 to 100 potential copyright violations to be assessed each day, and I do the bulk of that work", and this comment from Sphilbrick: " I tried to do some copyright issue review every day, but I do far less than Diannaa, and I find the volume I work on exhausting — frankly I don't know how Diannaa does it. As she mentioned, there are far too many incidents each day to expects reviewing editors to fix each of the problems."
We want to be sure that we are still looking for a solution that, like Coren did, automatically tags the article as it arrives in the feed. Let's not forget that any flags in the feed entries only depend on the reliability of the reviewer a) to see it; b) wanting to do something about it, and c) showing the actual level of COPYVIO in the feed entries doesn't add much impact - the decision of the reviewer is simply to investigate using Earwig or the Dup Detector, which is theoretically mandatory, or do nothing. Once it is decided which COPYVIO detection API is to be used, it should be a simple matter to get it to template the articles. Kudpung กุดผึ้ง ( talk) 02:27, 7 June 2018 (UTC) reply

Just stopping by to correct some errors in Kudpung's post. CopyPatrol does not focus on new edits per se, but all additions over a certain size threshold to draft space and article space. The 48-hour latency is an outside limit, which takes into account days where I am unable to participate to my usual level. Reports are more typically cleared within the first 24 hours, and many are cleared almost immediately. — Diannaa 🍁 ( talk) 13:33, 7 June 2018 (UTC) reply

I'd also like to clarify some points, especially if my comments led to any misunderstandings. When I refer to "new edits", I was trying to distinguish CopyPatrol from initiatives such as Copyright problems and Contributor copyright investigations, both of which tend to involve older issues. I think it should be obvious that new edits includes new pages, but if that was confusing I will make it clear that new pages, both in mainspace and in draft space, are picked up by CopyPatrol. I subscribe to the four eyes principle, so in the case of new articles and new drafts, if there are substantial copyright issues I tag it as G12.

I do see that reviewers have often looked at the page by the time I see it, and often have reverted an edit. In most cases, those edits have not been rev-deleted, and if I confirm that it is a copyright issue I will go ahead and rev-del the edit.

It is also critical to clarify what I meant by this statement: ... there are far too many incidents each day to expect reviewing editors to fix each of the problems. This does not mean copyvios are not being addressed. It is an allusion to an editor who whined about a copyright violation being removed, expecting that what we should be doing is rewriting the material for them. It is my opinion that identifying and addressing new copyright issues is working well (subject to my belief that Diannaa is overworked and could use some help). My point was that it is not the standard practice of an editor reviewing a potential copyright violation to do the rewrite on behalf of the offending editor.

Finally, I'd like to know more about what's prompting the phrase urgently needed to streamline the work of reviewers at AfC and NPR. I see that there is extensive discussion above in this thread, I'll read it before commenting further.-- S Philbrick (Talk) 14:36, 7 June 2018 (UTC) reply

@ Diannaa and Sphilbrick: I hope you are both able to detect that I was not criticising your work in the slightest. What I am pointing out however, is that it has often been mentioned - not by me - on this project that CopyPatrol claims to have a positive impact on the work of the New Page Patrollers, which of course it cannot possibly have, neither technically nor humanly. Diannaa cannot alone do the COPYVIO checking of over 600 new articles a day which even our mammoth task force of 630 reviewers are totally unable to achieve.

The work at NPP is vastly different from the work at CopyPatrol - beyond providing basic advice and/or linking to the guidelines and help pages, Reviewers are absolutely not expected to rewrite copyvios to make them acceptable. They either simply remove the copyvio phrases and tag the article so that the author knows what's up, or just simply tag the whole thing for CSD12 and have done with it. And that is why the NPP process needs streamlining. It's nothing new, NPP just wants back what got lost when Coren went AWOL. Kudpung กุดผึ้ง ( talk) 20:22, 7 June 2018 (UTC) reply

I'd like to try to clarify something, as an occasional user of CopyPatrol and active New Page Patroller. There are some 400-500 pages per day that need checking. Copypatrol has a much smaller number of daily issues to address. I have the impression that Kudpung thinks that somehow not all new pages are "reviewed" by CopyPatrol (please correct me if I'm wrong). It is my understanding that CopyPatrol DOES check all new pages and flags a number of them (it looks to be about 15%) that need to be looked at by a reviewer. The difference is that CopyPatrol only has to deal with the flagged pages. I think it would be nice if NPPers could see if an article had been flagged. Vexations ( talk) 22:08, 7 June 2018 (UTC) reply

Comments by Barkeep49

I think Wireframe A is substantially better than B (coming at this from someone who purely does NPP and not AfC). The ability to enter my own COPYVIO cutoff score is what makes this the winner for me. Best, Barkeep49 ( talk) 23:53, 25 May 2018 (UTC) reply

Answers to posed questions: For those of you who have experience with both the Google and Turnitin services, what are your thoughts on their pros and cons for copyvio?

I find Turnitin to be far less useful for NPP than Google. So if we had to pick one or the other it's no contest that I would pick Google. The reason is that Turnitin's universe is much less likely than Google's to have the kinds of text being put into NPP, like corporate PR generated writings. Best, Barkeep49 ( talk)

If the New Pages Feed were sortable/filterable by copyvio scores, what would the next step for a reviewer be upon finding pages with high scores? Is the next step always to look them up in Earwig's tool? Are there any other workflows?

I'd like to think that my next step for a high-scoring page would be to go right to Earwig to see what's what. I worry, however, that it could make it easy to apply G12 and move on without having done any of the leg work. Regardless I would go to Earwig to get the nice comparison text. Best, Barkeep49 ( talk) 23:53, 25 May 2018 (UTC) reply

We're thinking that better approaches might be to re-score pages after a certain amount of the page has been changed, or re-score them on a certain schedule. That means that at any given time, the copyvio score would be "approximate". Does this sound sufficient for the purpose of finding a page you would like to review? Or are there other advantageous approaches?

To conserve API resources I like the idea of a comparison once when it's submitted for review at AfC and once when a new page hits NPP. For pages which hit NPP because they're articles which had been redirects or the like I would suggest not scoring them right away. I tend to hang out in that part of the stream and it seems like a few other reviewers do too and those pages are reviewed quickly, so those pages could get reviewed without taking any API resources. And then perhaps it's rescored once a day if there has been a change in page size over X kb (not sure what the right number would be).
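
The scoring policy described in the comment above could be sketched roughly as follows. This is only an illustration, not anything the WMF team has committed to: the event names, the `PageEvent` type and the `threshold_kb` parameter (the "X kb" left open above) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageEvent:
    # Hypothetical event kinds: "afc_submitted", "npp_new",
    # "npp_from_redirect", or "daily_check".
    kind: str
    # Change in page size (in kb) since the page was last scored.
    size_change_kb: float = 0.0


def should_score(event: PageEvent, threshold_kb: float) -> bool:
    """Decide whether to spend an API call on a copyvio comparison."""
    if event.kind in ("afc_submitted", "npp_new"):
        # Score once on AfC submission and once when a new page hits NPP.
        return True
    if event.kind == "npp_from_redirect":
        # Former redirects tend to be reviewed quickly; do not score right away.
        return False
    if event.kind == "daily_check":
        # Re-score at most once a day, and only after a large enough change.
        return abs(event.size_change_kb) > threshold_kb
    return False
```

For example, with a purely illustrative threshold of 5 kb, `should_score(PageEvent("daily_check", 6.0), 5.0)` is true while `should_score(PageEvent("npp_from_redirect"), 5.0)` is false.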

I would add that I have one "COPYVIO" resource that wouldn't require any API credits: COPYVIO searching within EN Wikipedia. It would be nice if this kind of comparison tool is being generated to maybe give indications if it's a copy and paste move, which could indicate nothing (properly attributed) or any number of somethings. Best, Barkeep49 ( talk) 23:53, 25 May 2018 (UTC) reply

Applying a category to all declined submissions

This is a question mostly directed at Primefac, Enterprisey, and anyone who has maintained the AFC submission template -- but I'm definitely interested in hearing opinions from all.

The WMF team is currently working on making draft pages appear in Special:NewPagesFeed, and making it possible to filter them by their state: "Unsubmitted", "Awaiting review", "Under review", "Declined", and "All drafts" ( this wireframe shows those states visually). The team will be using categories to facilitate the filtering, as that is expected to be more performant and quicker to engineer than using templates.

We plan on identifying drafts "Awaiting review" with Category:Pending_AfC_submissions and "Under review" with Category:Pending_AfC_submissions_being_reviewed_now. But there is no single category that identifies all drafts that have been declined. Though almost all declined drafts have one of the subcategories in the container category of Category:Declined_AfC_submissions, querying at the subcategory level could have performance and engineering challenges.

Therefore, the question is whether it would be possible to change the AFC submission template to apply a category to all declined drafts. This could be a new category (or hidden category), or potentially it could be Category:Declined_AfC_submissions, though that's currently set up as a container category. Thank you, and please let me know what you think! -- MMiller (WMF) ( talk) 23:31, 1 June 2018 (UTC) reply

Category:All declined AfC submissions shouldn't be that hard to implement. Enterprisey ( talk!) 04:40, 2 June 2018 (UTC) reply

Adding Category:Declined AfC submissions to the decline template would suffice, yes? It would be same as Category:Pending AfC submissions, which has all of the pending drafts but also contains all of the subcats (which between them also contain all of the pending drafts). Primefac ( talk) 14:28, 2 June 2018 (UTC) reply

@ Primefac: your suggestion of adding Category:Declined AfC submissions to the decline template would be ideal, especially because it mirrors Category:Pending AfC submissions. Would you be willing to make that change? Then it will be applied to all the currently declined drafts and the engineers on our end can start taking it into account in their code. Thank you! -- MMiller (WMF) ( talk) 22:13, 4 June 2018 (UTC) reply

Done. It'll obviously take some time for it to percolate through the 10k or so declined drafts. Primefac ( talk) 12:39, 5 June 2018 (UTC) reply

@ Primefac: thank you for the quick turnaround! Everything looks good from our end. We are off to the races. -- MMiller (WMF) ( talk) 17:52, 5 June 2018 (UTC) reply
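
For readers who want to see what category-based filtering might look like from the outside, here is a minimal sketch that maps the draft states named in this thread to their categories and builds a request URL for the standard MediaWiki API `list=categorymembers` query. The mapping dictionary and function name are made up for illustration, "Unsubmitted" is left out because no category for it is named here, and the actual New Pages Feed implementation may well query the database directly rather than use this API.

```python
from urllib.parse import urlencode

# Draft states from the discussion above and the categories identified for them.
STATE_TO_CATEGORY = {
    "Awaiting review": "Category:Pending AfC submissions",
    "Under review": "Category:Pending AfC submissions being reviewed now",
    "Declined": "Category:Declined AfC submissions",
}


def category_members_url(state: str, limit: int = 50) -> str:
    """Build a MediaWiki API URL listing pages in the category for a state."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": STATE_TO_CATEGORY[state],
        "cmlimit": limit,
        "format": "json",
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)
```

Calling `category_members_url("Declined")` only builds the URL; it makes no network request.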

Copyvios revisited

My intention was to read through the last few months history of this thread and concentrate on the discussions of copyright issues. I tried to do that but there's a lot of material and I apologize in advance if I miss something critical. However, I do see some misinformation, which unfortunately gets repeated even after being corrected, so I'll try again.

I see a pining for a return of the Corenbot. I think this desire has three aspects:

A desire for an automated way to detect and report potential copyright violations as they occur.
A desire to have these reports immediately timely (within seconds)
A desire to have software tagging of articles with the notice of a potential copyright issue

I believe CopyPatrol achieves the first two.

I see editors up-thread wishing there were such a program. There is and it is running. It applies to articles in mainspace and drafts. I don't know that it applies to userspace drafts. I don't know how quick it is but my impression is that it typically reports within seconds (would have to check with the developers to be certain of this). Other than possibly the need to extend it to user space drafts, are there any known deficiencies in terms of scope or timing?

Regarding tagging, it is my opinion that tagging should not be done. I did a very quick review of the last 50 items I worked on. This is obviously an insufficient sample for a couple of reasons, but I can use it to illustrate my point and if necessary, we can do a more formal study.

Of the last 50 items I reviewed, 12 were false positives, that is, identified by the software as needing further review but found on review not to be copyright violations. 12 out of 50 is a 24% false positive rate. Probably happenstance, but that's remarkably close to Diannaa's statement that the false positive rate is about 25%. In my opinion, 25% is far too high a false positive rate to be prominently tagging an article suggesting it may be violating copyright.
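
As a rough check on how much weight a 50-item sample can bear, the arithmetic above can be reproduced with a confidence interval attached. The Wilson score interval used here is my own choice of method for illustration, not something proposed in this thread; the only inputs taken from the comment are the 12 false positives out of 50 reviewed items.

```python
import math


def false_positive_rate(false_positives: int, reviewed: int) -> float:
    """Share of flagged items that turned out not to be violations."""
    return false_positives / reviewed


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


rate = false_positive_rate(12, 50)   # 0.24
low, high = wilson_interval(12, 50)  # roughly 0.14 to 0.37
```

The interval is wide, which fits the caveat that 50 items is a small sample, but the 25% figure quoted from Diannaa sits comfortably inside it.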

The most egregious problem is the repeated statement that it takes 24 to 48 hours for the CopyPatrol editors to assess potential violations. Diannaa has corrected this statement, but it's been repeated since the correction. 48 hours is, roughly speaking, an upper bound, and rarely comes into play. I looked at my last 50 items closed, and the time between the edit and the close for selected items is as follows:

Medha Patkar 6 minutes
Giovanni Tria 26 minutes
Petter Wallenberg 37 minutes
Sean Romans 52 minutes
Myers Park (Charlotte) 64 minutes
Gabriel Naddaf 68 minutes
Internet Assigned Numbers Authority 77 minutes
Raila Odinga 79 minutes
Working dog 127 minutes

That's not a random sample, it's the fastest of the group, but most of those are cleared well within 24 hours.
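
The figures in the list above can be summarised with a few lines of Python. The numbers are copied from the list; as noted, they are the fastest items rather than a random sample, so the summary describes only these nine items.

```python
from statistics import median

# Minutes between the edit and the CopyPatrol close, from the list above.
minutes_to_close = {
    "Medha Patkar": 6,
    "Giovanni Tria": 26,
    "Petter Wallenberg": 37,
    "Sean Romans": 52,
    "Myers Park (Charlotte)": 64,
    "Gabriel Naddaf": 68,
    "Internet Assigned Numbers Authority": 77,
    "Raila Odinga": 79,
    "Working dog": 127,
}

median_minutes = median(minutes_to_close.values())
slowest_minutes = max(minutes_to_close.values())
# True when every listed item was closed in under 24 hours.
all_within_24h = all(m < 24 * 60 for m in minutes_to_close.values())
```

For these nine items the median is 64 minutes and the slowest close is 127 minutes, a little over two hours.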

I'm worried that there may be some unrealistic expectations. In a project with the unofficial motto There is no deadline, clearing of potential copyright issues in new edits typically within a few hours and almost all within 48 hours strikes me as an astounding feat. Someone should be inviting Diannaa to the next wiki conference and appointing her Wikipedian of the year, not asking how to get the review process down to minutes instead of hours. We have all kinds of backlogs, some stretching into weeks and months. This is not the problem.

I'd like a clear articulation of the problem statement.

I think I read that the following might be a problem: if the reviewer marks a new article as patrolled, that triggers the ability of Google to index the article, and an article containing a copyright violation might end up in the search engine. If that's the problem, there's a simple solution. Marking a page as patrolled should include a delay (I'll suggest 48 hours), and the article is released for indexing after the end of the delay, which allows plenty of time to check for copyright issues. I fully understand some brand-new editors may want their article up and indexed as soon as possible, but I have no problem telling them that if they are not a veteran editor, they might have to wait 48 hours before it shows up in a Google index.-- S Philbrick (Talk) 16:06, 7 June 2018 (UTC) reply
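
The delayed-indexing suggestion above amounts to a simple time check. The sketch below is hypothetical throughout: the function name and the idea of storing a `patrolled_at` timestamp are assumptions for illustration and do not describe how MediaWiki actually controls search-engine indexing.

```python
from datetime import datetime, timedelta
from typing import Optional

# The 48-hour delay suggested in the comment above.
INDEXING_DELAY = timedelta(hours=48)


def released_for_indexing(
    patrolled_at: Optional[datetime],
    now: datetime,
    delay: timedelta = INDEXING_DELAY,
) -> bool:
    """Return True once a patrolled page has waited out the delay."""
    if patrolled_at is None:
        # Unpatrolled pages are never released under this rule.
        return False
    return now - patrolled_at >= delay
```

A page patrolled at noon on 7 June would, under this rule, become indexable at noon on 9 June.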

The other option, of course, is to mandate that NPRs (and AFC reviewers) use Earwig's tool to at the very least catch the blatantly obvious violations. We've already pretty much got this nailed at AFC, and while some mistakes will always slip through the cracks I see significantly more G12 on pages that are still in the draft space (as opposed to ones that were accepted and then G12'd). Primefac ( talk) 16:15, 7 June 2018 (UTC) reply

Part of NPP is supposed to be checked for copyvios anyway. Natureium ( talk) 16:26, 7 June 2018 (UTC) reply

@ Sphilbrick: thanks for joining this discussion (and to Diannaa and Kudpung for continuing to take part) -- I know there is a huge amount of material to read, but we need as much input from people experienced with copyvio as we can get. I want to speak to your question about a clear articulation of the problem statement. The original problem that this endeavor is setting out to solve is to help the AfC reviewers more easily prioritize drafts for review, so that they can get high quality pages into main article space as quickly as possible, while keeping inappropriate pages out. Since the New Pages Feed is an existing mechanism for prioritizing pages for review, the idea became to extend the New Pages Feed to include AfC drafts, and to add useful data for prioritization to both workflows.

Likely copyvio status is one of those useful data points that reviewers could use to prioritize pages to review. And so the problem statement for copyvio that we're working from is: Reviewers need to make decisions about which pending pages need their attention soonest because they contain likely copyright violations, and once they are looking at a page to review, they need to decide how closely to inspect for violations. Therefore, the planned work is to allow reviewers in both NPP and AfC to quickly find those pages that contain likely violations, though, as Kudpung, DGG, and others have said, those pages will still need to be checked by those reviewers in case they are false positives.

Does this help? And to others -- is that an accurate problem statement?

-- MMiller (WMF) ( talk) 19:17, 7 June 2018 (UTC) reply

@ MMiller (WMF): Thanks for your response. The proposal to extend the New Pages Feed to include AfC drafts is interesting, and helps put the issue in perspective though I still have some timing questions.

I don't have direct involvement in the AFC review process, but I do have indirect involvement, as I'm an active OTRS agent, and we often get queries about how to get a user space draft into mainspace. That almost always includes an explanation that they have to ask for a review. We typically caution them that the review process is backlogged. I'm assuming that this initiative is part of a way to address that problem. While I don't monitor the backlog on a regular basis, my recollection is that it has been up about two months, and maybe a little less than that now. I'm in favor of initiatives to cut that backlog time. I am completely sympathetic with a brand-new editor who is chomping at the bit to have an article viewable to the world, only to find out that they have to wait two months or so, and often that long wait will be rewarded by a declination. This must be a turnoff to new editors.

So, on the one hand, we have a backlog measured in months, one I would like to see reduced materially.

Obviously, one of the elements is a timely addressing of any potential copyright issues.

However, when I see this statement, For Copypatrol to have any meaning at NPP it needs to scan and tag a new page within about 25 seconds, I wonder if expectations are unrealistic or if I am misunderstanding the problem.

If the goal is to cut down the time of the review process from two months to two days, then I don't think copyright issues are a contributor to the critical path.

If, on the other hand, there truly is a need to have potential copyright information available in 25 seconds, then either I'm missing something fundamental about the critical path, or the desired goal is entirely unrealistic.

To put it differently, one of the reasons I wanted to hear the problem statement, is that if the goal is to get new article review backlogs down from two months to two days, then I don't quite understand why there's been so much discussion about copyright issues as the copyright review process is not contributing to the backlog. If the goal is to complete the review within two minutes, then I think the goal is unrealistic.

This is not to say there aren't things that could be done to better integrate the copyright review process with the NPP. I expressed reservation to automatically dropping a template onto the article (false positive rate too high), but I would not be averse to dropping a template on the article talk page. I don't know that 25 seconds is a realistic expectation, but I suspect it's minutes at most. S Philbrick (Talk) 19:49, 7 June 2018 (UTC) reply

side note Yes Sphilbrick the backlog was over two months but we've been pushing hard to reduce it and it's been dragged down to 3 weeks, even with WP:ACPERM. Hopefully the process improvement will help get that better. Cheers KylieTastic ( talk) 20:52, 7 June 2018 (UTC) reply

That's impressive. While I'm sure the subject editors would like it even shorter, and ways to accomplish that are being discussed, you deserve a lot of credit for what I assume to be a massive, and mostly thankless task. I'll try to keep that timeframe in mind when talking to people at OTRS. I may modify my standard wording to tell them that it could be up to a month and hope they are pleasantly surprised if it is a week faster.-- S Philbrick (Talk) 21:19, 7 June 2018 (UTC) reply

I know we're getting slightly off-topic, but 75% of new drafts are reviewed within 48 hours. Those that aren't reviewed often sit around for a week or two (as alluded to above there is currently one draft that is exactly three weeks old), but the backlog has been steadily decreasing. Primefac ( talk) 22:05, 7 June 2018 (UTC) reply

Sphilbrick, the problem which we have is that when the NPR user right was created, in its wisdom the community insisted that inexperienced users and raw beginners still be allowed to tag pages for maintenance and deletion. This has in fact increased the workload of the reviewers rather than alleviating it. These button mashers hang with their mice over a tickertape feed of titles only of new articles and tag them literally within seconds before they even feature in the New Pages Feed. On seeing that, the regular reviewers assume all the work to have been done and mark the tagged articles as patrolled. What simply happens is that knowing that AFC is more geared towards spending time on copyright violations, they will simply be pushing more new pages to the draft namespace. This would be totally counter-productive to what we are attempting to achieve in reducing the backlogs for both processes.

New Page Review has made great inroads in the last year and a half since I created a user right for it, reducing a backlog of 22,000 to 3,800, but still has a long way to go. It now faces new challenges, those of increasing COPYVIOS brought in by spam and quick-fix paid editors. Copyvios are often the first key to detecting undeclared paid editors who are exploiting our voluntary work for their personal or corporate gain. Hence since the introduction of ACPERM - another achievement of mine - the focus of the work of the new page controllers has changed slightly, but critically. The reason we need this level of mechanisation today more than ever before is not because we personally need it, but because, given the general apathy in maintenance work throughout the encyclopaedia, it is the only way in which we can address the attrition of maintenance workers, stay on top, and attempt to keep the encyclopedia from degenerating into a slum of adverts and junk.

Automatically dropping a template onto the article is precisely what Coren did and very quickly, and I don't recall it having a high rate of false positives. It was certainly a very positive contribution to the workflow. Technology is always in a state of progress, and what I don't understand is why we can't have something we once had and was realistic. What are the real reasons for not addressing it? Copypatrol is totally unrelated to the requirements of AfC and New Page Review - why are we even discussing it? James? Kudpung กุดผึ้ง ( talk) 22:31, 7 June 2018 (UTC) reply

I've been working on a post which I just dropped in below. However, I'm blown away by your assertion that Copypatrol is totally unrelated to the requirements of AfC and New Page Review - why are we even discussing it? As noted below, I see significant overlap. One of us is missing something, and it may well be definitional.

Using my notation below, CPP_new is a material subset of the daily workload of CopyPatrol, probably 25-75 articles each day. Similarly, NPP_issue is a material subset of the New Pages reviewed each day. I hope they are close to identical. You think there's no relation. What do you think I'm missing?-- S Philbrick (Talk) 00:19, 8 June 2018 (UTC) reply

Let's get specific.

Draft:Deseret First Credit Union is a new draft, just reviewed by me at Copypatrol.

Is this, or is this not something that will be reviewed by the NPP?-- S Philbrick (Talk) 00:41, 8 June 2018 (UTC) reply

If the request is to drop a template onto new articles that are copyright concerns, that should not be too hard with one of the currently available tools. Doc James ( talk · contribs · email) 01:30, 8 June 2018 (UTC) reply

I've worked in the system and am afraid that the rate of false positives is somewhat high, which may lead to a considerable number of false-taggings. But, technically, ought to be doable.... ~ Winged Blades^Godric 05:59, 8 June 2018 (UTC) reply

Color me as one who was utterly taken aback by Kudpung's statement about the irrelevance of CopyPatrol. AFAIS (based on some off-wiki discussion), there's no chance of an exact re-incarnation of CorenBot making a return in the near future and we need to utilize methods which are near-equivalents of it to achieve our goals. Holding on to CBot as our sole goal and trashing all other options isn't going to be a viable way of doing things, or so it seems...... ~ Winged Blades^Godric 05:59, 8 June 2018 (UTC) reply

Winged Blades of Godric, care to expand a bit on your secret information? Or is it political rather than technical? Kudpung กุดผึ้ง ( talk) 10:33, 8 June 2018 (UTC) reply

Draft:Deseret First Credit Union is a classic example of a page that should be deleted for other reasons ( Hammersoft) even before it comes under scrutiny for COPYVIO, and before it even reaches NPP for its second review if it were to be accepted by an AfC reviewer. There are discussions taking place elsewhere to find ways of quickly deleting such pages and, if appropriate, blocking the creator. Taking the copyright out of such a page is a waste of time - it would be rapidly deleted anyway if it came first through NPP (at least on my watch it would be). Kudpung กุดผึ้ง ( talk) 10:28, 8 June 2018 (UTC) reply

The reason I brought that up was to explore how much overlap there is between articles handled by NPP and those showing up at CopyPatrol. I think there's a lot (or possibly I am misunderstanding the "irrelevance" comment). Here's another: Yantra Mandirs.-- S Philbrick (Talk) 12:54, 8 June 2018 (UTC) reply

Kudpung; I'm not clear why you wanted my presence in this conversation. If it's to question my analysis of that draft; I'm quite familiar with it, having dealt with these people starting about a year ago. It's standard company attempts at self promotion, with the typical attendant copyright violations. I removed a copyright violation from the page, as did another editor, as such things should be removed. I considered tagging the page for {{ db-promo}} or {{ db-copyvio}}, of which I have done many (see User:Hammersoft/log appropriate sections), but decided to wait in favor of stripping the copyvio content (and other promo content) and attempting to discuss with the editor who made the article, via placing a {{ uw-paid}} warning on their page along with advice about secondary vs. primary sources on the draft. What's now left is a draft free of copyright violations. Yes, I could have chosen to G11 or G12 the draft, and would have done so if it were in mainspace. Given that it was in draftspace, I attempted conversation with them instead while removing the most problematic content in the draft. It was a choice, one with low impact. I prefer discussion where I think there may be an opportunity for one and the impact on the project is low, and can be monitored. But, if you think I'm wrong in any of this, I'm open to input as always. -- Hammersoft ( talk) 13:47, 8 June 2018 (UTC) reply

I don't think the issue was your handling. I think you were pinged following the usual politeness protocol - it is polite to ping someone when you mention them in a post. I'm the one who originally brought up the article, and I brought it up because there's some confusion about how much overlap there is between NPP and Copypatrol. I picked it as it is one example of many that might pop up at both places. The actual handling of the particular incident isn't relevant to my point.-- S Philbrick (Talk) 15:00, 8 June 2018 (UTC) reply

Thanks! -- Hammersoft ( talk) 15:24, 8 June 2018 (UTC) reply

Can we do some math?

Let me define:

NPP as the number of New pages created each day.
CPP as the number of articles reported to CopyPatrol each day.

Based on comments above, I think the following approximations are correct:

NPP=600
CPP=150

I trust it is obvious that we mentally must think in terms of Venn diagrams; neither NPP nor CPP is a proper subset of the other, but there is some overlap.

(There's a complication due to CPP false positives, but I think this complication can be ignored for the following reason - if an article is flagged at CPP, or some Copyvio investigation identifies possible issues, it must be investigated in either case, even if the ultimate resolution is false positive.)

Further define:

CPP_new as the subset of CPP that are new pages, that is, the overlap of CPP with NPP
CPP_old as the subset of CPP that are old pages, that is issues that have to be addressed by CopyPatrol editors but have nothing to do with New Page Patrol

NPP_issue as the subset of NPP which register as potential copyright issues, some of which will be reverted or deleted, and some of which will be false positives
NPP_clean as the subset of NPP for which there is no indication of a copyright issue needing investigation.

One hopes that CPP_new=NPP_issue. In words, this means that there is a set of pages showing up in the copy patrol tool that are new pages (including drafts). There is also a set of pages in the new page patrol that are identified as having potential copyright issues needing investigation.

Ideally, the sets are identical. In practice, they won't be exactly identical, but if they are close, this has some implications for how both teams work. If the sets are NOT close to each other, this has different implications, and means that the copyright issue identification processes of the two groups are different enough that further research is needed.

It might be worth investigating, perhaps by selecting a recent day, and determining how much overlap there is. This is doable, though tedious, on the CopyPatrol side (although I suspect someone with the right skill set might be able to do it automatically). I don't know what's involved on the NPP side, to ascertain the subset of articles that were reviewed for copyright issues.-- S Philbrick (Talk) 00:21, 8 June 2018 (UTC) reply
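
The comparison proposed above maps directly onto set operations. The sketch below assumes that, for a chosen day, someone has already collected three sets of page titles (all CopyPatrol reports, all new pages, and the new pages flagged for copyright at NPP); how to collect them is the open question. The function name and the Jaccard overlap score are my own additions for illustration.

```python
def compare_workloads(copypatrol_reports: set, new_pages: set, npp_flagged: set) -> dict:
    """Split one day's pages into the subsets defined above and measure overlap."""
    cpp_new = copypatrol_reports & new_pages   # CPP_new: reports on new pages
    cpp_old = copypatrol_reports - new_pages   # CPP_old: reports on older pages
    npp_issue = npp_flagged & new_pages        # NPP_issue
    npp_clean = new_pages - npp_issue          # NPP_clean
    both = cpp_new & npp_issue
    either = cpp_new | npp_issue
    # Jaccard similarity: 1.0 means CPP_new and NPP_issue are identical.
    overlap = len(both) / len(either) if either else 1.0
    return {
        "CPP_new": cpp_new,
        "CPP_old": cpp_old,
        "NPP_issue": npp_issue,
        "NPP_clean": npp_clean,
        "only_copypatrol": cpp_new - npp_issue,
        "only_npp": npp_issue - cpp_new,
        "overlap": overlap,
    }
```

An overlap close to 1.0 would support the hope that CPP_new and NPP_issue are nearly the same set; the `only_copypatrol` and `only_npp` entries show where the two processes disagree.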

Problem statement

I can't tell from the splintered discussion that has occurred since MMiller (WMF) last posted whether there is an objection to the problem statement. I'm personally still onboard with it as it will speed up the required CV checks AfC reviewers are required to do. Because of the way WMF has proposed to implement this, it should also help with NPP but maybe that's help that's unwanted. Like I said, I can't tell because the discussion since hasn't really addressed MMiller (WMF)'s question, is that an accurate problem statement? ~ Kvng ( talk) 14:23, 12 June 2018 (UTC) reply

If there was a Category of AfC submissions with possible copyvio issues interested editors would prioritize pages in the Category, like we do with submissions in userspace or now submissions matching titles in mainspace. It becomes an area where you can attack the pages with similar issues.

Currently I only run earwig on pages I plan to accept. This slows down the accept process while I wait for earwig. On the other side I could G12 many drafts I reject if I knew they were copyvio, which would cut down on resubmissions and reduce the backlog. How many reject/resubmits happen before someone realizes the page is copyvio anyway? If we could see the earwig score on the page, opportunities to speedy delete crap would rise and time to pass the good would decrease, cutting the backlog accordingly. Legacypac ( talk) 16:27, 12 June 2018 (UTC) reply

Well, we have this category, the pages in which need cleaning and revdel. And yes, you absolutely should check the cv of every page you accept or decline - what's the point of declining a page that's a blatant cv and not delete it? Check. Every. Page. You. Review. For. Copyright violations. Period.

And no, you don't have to check every draft you look at, but if you're clicking any of the "review" buttons a cv check should have been performed. Primefac ( talk) 12:16, 13 June 2018 (UTC) reply

It is amazing (well, not really I suppose) how often pages have been declined multiple times by different reviewers, yet Earwig reveals significant copyright violations. Galobtter ( pingó mió) 19:31, 13 June 2018 (UTC) reply

Yes, this is exactly what the proposed WMF work is supposed to address. My understanding is that we would have the Earwig scores displayed next to the drafts you're about to review. This avoids the wait running the tool, the need, in many cases, to rerun it on resubmit and assures that all drafts are first off checked for CV. If you see a high score, you would know that is something that should be addressed first.

I have also pushed CV check to the end in some of my reviewing. It's not really a good way to do it. The flowchart suggests CV should be checked first. The wait is a pain and the way I deal with that now is to run Earwig on a batch of drafts in separate browser tabs. When I am ready to start reviewing the next draft in the batch, Earwig results are already waiting for me in a tab somewhere. ~ Kvng ( talk) 14:56, 14 June 2018 (UTC) reply

Any thoughts on project update from 2018-06-14

Hi all -- I just wanted to leave a note here to say that I am following along with the discussion above on problem statement and copyvio, even as the team is still working on the parts of the project that do not (yet) involve copyvio. I just posted an update and screenshot on the project page, and I'm making this new section for anyone to leave reactions (even if it's just a simple "looks good"). Thank you all for your thoughts. -- MMiller (WMF) ( talk) 19:31, 14 June 2018 (UTC) reply

Just as a note, there is zero point in saying "NO CATEGORIES" or "ORPHAN" because drafts aren't supposed to have categories, nor should they be linked from the article space. If that's a technical limitation because it's also for NPP, then I suppose we'll have to live with it, but since you're the ones doing the programming... Primefac ( talk) 02:23, 15 June 2018 (UTC) reply

You might consider not limiting the filter selections to mutually exclusive options. If they were multiple-selectable (square) checkboxes, we could remove "all" or have ticking "all" tick all the others. ~ Kvng ( talk) 14:41, 15 June 2018 (UTC) reply

@ Primefac: thanks for pointing that out, and it makes sense that drafts don't need those indicators. The developers say that we can probably hide them in the interface (and potentially also save some database space that would have to store those labels). I will add that change to the list.

@ Kvng: I'm glad you brought this up, because it was something we were discussing. What are the situations in which you would want to see the feed combine drafts of different states (e.g. both "Awaiting review" and "Declined" at the same time)? Our original thinking was that mixing drafts of different states together might make for a confusing feed.

-- MMiller (WMF) ( talk) 21:11, 15 June 2018 (UTC) reply

I do think that having a mixture could be confusing, and if we're concerned about that, maybe the "all" option should be removed. The mutually exclusive options you list there (minus "all") are all that I think I would need. It is possible I or someone else could come up with a reason for wanting other options, and then you'd have to come back and rework. If you changed it to 4 non-exclusive checkboxes, all possibilities would be available to reviewers and you'd be pretty sure you'd not have to come back for rework. 4 non-exclusive checkboxes is also a more concise interface and no more difficult to figure out and operate than what you have now. There you have it, I've clearly overthought this. ~ Kvng ( talk) 00:07, 16 June 2018 (UTC) reply
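To make the comparison above concrete, here is a rough sketch (not anything from the actual New Pages Feed code) that counts the views each interface can express. The state names "Awaiting review", "Under review" and "Declined" come from this discussion; the fourth name, "Unsubmitted", is only an assumption used for illustration.

```python
from itertools import combinations

# Three state names appear in this thread; "Unsubmitted" is an assumed
# placeholder for the fourth option and may not match the real interface.
STATES = ["Unsubmitted", "Awaiting review", "Under review", "Declined"]

# Radio buttons: exactly one state at a time, plus a single "All" option.
radio_views = [frozenset([s]) for s in STATES] + [frozenset(STATES)]

# Checkboxes: any non-empty combination of states can be ticked.
checkbox_views = [
    frozenset(combo)
    for size in range(1, len(STATES) + 1)
    for combo in combinations(STATES, size)
]

# Every radio view is also reachable with checkboxes, but not the reverse.
radio_is_subset = all(view in checkbox_views for view in radio_views)

print(len(radio_views), len(checkbox_views), radio_is_subset)  # 5 15 True
```

With four states, radio buttons plus "All" give 5 possible views, while four checkboxes give 15, including every view the radio buttons offer. That is the sense in which checkboxes would avoid later rework.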

Thanks for laying that out, Kvng. I think you're right that having the "All" option is logically inconsistent with preventing subsets of that combination. I suppose we were thinking there might be use cases for seeing all drafts, perhaps when looking for copyvio in the Draft space in general. Before we decide how to move forward, I just want to check whether any other AfC reviewers have an opinion on this? Legacypac? -- MMiller (WMF) ( talk) 23:34, 18 June 2018 (UTC) reply

@ Kvng: I just wanted to follow-up on this conversation we had earlier about checkboxes and radio buttons. Our team has decided to stick with the mutually-exclusive radio buttons for now. The reason is that we're planning on putting the interface up in a testing environment for reviewers to interact with, and then it will be easier to get a feel for which option makes more sense. If after testing, it seems important to enable all the combinations, we can change the software. Thanks for thinking about this. -- MMiller (WMF) ( talk) 00:27, 6 July 2018 (UTC) reply

Question about handling of redirects

Hi all -- I have a quick question as the WMF team works on this project: how would AfC reviewers prefer for the New Pages Feed to handle draft redirects? While it looks like there are only a handful of draft redirects awaiting review, there are many thousands of draft redirects that are not submitted to AfC. Which set, if any, would be useful to have in the New Pages Feed?

The feed currently contains new redirects created in the main namespace for NPP review. If we were to include draft redirects, we would also include the ability to filter them out of the view, likely by default.

Pinging Legacypac, Kvng, and Primefac for advice -- as well as from any other AfC reviewers who want to chime in! Thank you. -- MMiller (WMF) ( talk) 19:06, 3 July 2018 (UTC) reply

I have never reviewed a draft redirect. I don't think I have an opinion. The thousands of unsubmitted draft redirects are probably submissions that were accepted. The move to mainspace leaves a redirect to it in its former location in draft space. ~ Kvng ( talk) 13:01, 4 July 2018 (UTC) reply

Technically speaking redirects should be requested at WP:AFC/R, and the few that aren't tend to get approved anyway. As Kvng says, 99.9% of the redirects in draft space will be pointing towards articles (or other drafts) and don't necessarily need to be reviewed. It wouldn't hurt to be able to filter out just the redirects, though. Primefac ( talk) 14:42, 4 July 2018 (UTC) reply

Thanks, Kvng and Primefac, that helps me understand how the redirects work. I took a look at the redirect request process and it looks like when someone requests a redirect, they do not actually create a draft; rather, they add the request to the project page. Therefore, it seems like virtually all the redirects in Draft space are from submissions that were accepted. Because of that, I think the best path forward would be to simply exclude them from the New Pages Feed. This will keep them from cluttering the feed, and also allow the engineering team to implement the feature changes sooner. Let me know if this doesn't make sense, or if you have any other thoughts. Thank you! -- MMiller (WMF) ( talk) 00:11, 6 July 2018 (UTC) reply

I would personally keep WP:AFC/R and NPP in separate pages as they are very different. But technically speaking, yes, redirects have to be patrolled by a new page patroller after their creation. I get tens of automatic notifications every day about my redirects being patrolled. L293D ( ☎ • ✎) 00:28, 6 July 2018 (UTC) reply

Thanks, L293D. I think that helps confirm that it's best to keep redirects out of the feed, at least for now. -- MMiller (WMF) ( talk) 22:49, 6 July 2018 (UTC) reply

Updates?

MMiller (WMF), Hope you had a satisfactory Wikimania. We're looking forward to more progress reports. Kudpung กุดผึ้ง ( talk) 21:04, 26 July 2018 (UTC) reply

@ Kudpung: thanks for checking in. I definitely learned a lot from community members at Wikimania that I'll be able to apply to the team's work -- particularly about copyvio detection. The most recent update is from last week on the project page here. Because several team members are in the midst of traveling, I won't be able to post an update for this week, but I will do so at the beginning of next week. I can say that coming soon are two things: (a) access to a testing environment where community members can try changes to the New Pages Feed before they are put into production, and (b) a plan for how we will approach copyvio detection. I hope you, and others, will be able to weigh in on both aspects. -- MMiller (WMF) ( talk) 23:15, 27 July 2018 (UTC) reply

Thanks, MMiller (WMF). It would be good if you could share what you learned with those of us who were not privileged to be present at Wikimania. Kudpung กุดผึ้ง ( talk) 00:24, 28 July 2018 (UTC) reply

Question about accepted drafts

If accepted AfC drafts are marked as patrolled, it would be nice to have some way of identifying them. I at least need this, as part of the ongoing background process of auditing AfC reviewers. (There are other ways of picking them up, and I use them now, but they're indirect and non-systematic.) Or is there some way I haven't realized? DGG ( talk ) 03:33, 30 July 2018 (UTC) reply

@ DGG: thanks for the question. I want to make sure I understand. It sounds like you're saying that when AfC reviewers are patrollers, and they accept drafts (thereby moving them to mainspace), those drafts are automatically patrolled at that time -- but that means you can't see those new articles in Special:NewPages because it filters out patrolled pages? I know that in Special:NewPagesFeed, it is easy to include or exclude patrolled pages in the feed. Is the issue also that it's not possible to tell which new articles originated in AfC? -- MMiller (WMF) ( talk) 21:41, 6 August 2018 (UTC) reply

I would advise either that they should not automatically be marked patrolled, or that it should be indicated that they come from AfC. DGG ( talk ) 02:59, 7 August 2018 (UTC) reply

@ DGG: got it; thanks for clarifying. I'll keep an eye on this idea, but I don't know if we'll be able to have it in scope for the work the team is doing now. I think it would be good to list this idea at this page, where other suggested changes are, if you have time. That would help us keep track of it in the future. -- MMiller (WMF) ( talk) 19:16, 7 August 2018 (UTC) reply

Keyboard shortcuts…

Any hope that these can be slipped in at the end, please? ^Thanks, L3X1 ◊distænt write◊ 11:48, 6 August 2018 (UTC) reply

@ L3X1: thanks for bringing this up. I saw that you posted this idea here. I have a pretty good idea of what this might look like for the Page Curation toolbar, but I'm wondering if any AfC reviewers think keyboard shortcuts would be useful with the AFCH gadget (please chime in if so!) Unfortunately, either way, that work will likely be out of scope of the time this team can spend. The Community Wishlist is one good way to get development time for a project like this. -- MMiller (WMF) ( talk) 21:15, 6 August 2018 (UTC) reply

By the way, if anyone wants keyboard shortcuts in the AFCH gadget, I'd appreciate a ping as well so I can work on it. Enterprisey ( talk!) 03:43, 7 August 2018 (UTC) reply

I think we should be cautious here. We do not want to encourage people working in a semi-mechanical way. DGG ( talk ) 13:21, 7 August 2018 (UTC) reply

Quality over quantity is nice, but for userspace spam and redirects I can do in excess of 7 per minute as is, and the queues are long and my fingers sore. ^Thanks, L3X1 ◊distænt write◊ 18:08, 7 August 2018 (UTC) reply

I have found that there is nothing at Wikipedia that I can do accurately at more than 1/minute, however routine it looks (sometimes I do reach 2/min, but I find that when I go that fast I make more errors). As a minimum, it's worth seeing what else the person may have contributed or tried to contribute. If there's userspace spam, perhaps there's some elsewhere also. DGG ( talk ) 19:25, 7 August 2018 (UTC) reply

Concurring with DGG, it is neither technically nor physically possible to correctly review a new page in under a minute. It says on the top of the feed:

Rather than speed, quality and depth of patrolling and the use of correct CSD criteria are essential to good reviewing.

(I wrote it). Some reviewers are reviewing as fast as 4 pages per minute, with one having reviewed nearly 40,000 articles in a short time. We haven't had the NPR right long enough for anyone to have spent that number of hours patrolling pages at a correct speed. We do not need keyboard short cuts. Quite the contrary, we need something to slow the reviewers down. Kudpung กุดผึ้ง ( talk) 23:50, 7 August 2018 (UTC) reply

Kud, you know I usually agree with you, and that is the reason I haven't pressed hard for shortcuts, but there are at least 3 functions that could use making user friendly: the page info button, the "next in queue" button (can we not all agree that advancing the queue by shortcut is not decadent??), and the first part of accepting a page. Every other tool with automation contains this warning: "Warning: You take full responsibility for any action you perform using Foo's Automated Tool Bar. You must understand Wikipedia policies and use this tool within these policies, or risk losing access to the tool or even being blocked from editing." The same is true for Page Curation. And anyone is welcome to come to my world by loading up Blocked Userpages and plowing through dozens of UPs that contain a blocked-user/cublocked-user template and nothing else. ^Thanks, L3X1 ◊distænt write◊ 02:01, 8 August 2018 (UTC) reply

Testing feedback

Hi all! I just posted an update on the project page about how reviewers can now use the Test Wiki to try out the team's work as we build it. I'm creating this section as a place for everyone to post feedback. You are also welcome to file Phabricator tasks by tagging either me or the Growth Team. Thank you, and I'm hoping everyone has time to try out the evolving New Pages Feed. -- MMiller (WMF) ( talk) 20:46, 6 August 2018 (UTC) reply

Hi MMiller (WMF), thanks for opening it up for testing. I've been looking at the way the dates for declined submissions are handled. I would expect the date that is so prominently displayed on the right to be the date that the value of the sort criterion was changed. If an article was created on August 1 and declined on August 3, then I expect the date to show August 3 if I sort by Declined date. But if I look at Test:Draft:Test Draft 004, it is listed as 12:55, 2 August 2018, the date it was last edited, not the date it was declined (that was 12 July 2018) or the date it was created (that was 11 July 2018). Did you intend to use the last edit date in this instance? That would be different from how the NPP queue currently works. Vexations ( talk) 21:38, 6 August 2018 (UTC) reply

Another problem is that there is no correct way to sort submissions by an inapplicable criterion. As it is, I can sort articles that haven't even been declined yet by declined date. Can you clarify what is supposed to happen when sorting not-yet-declined submissions? Vexations ( talk) 21:38, 6 August 2018 (UTC) reply

@ Vexations: thanks for taking a close look! That date behavior definitely does not look like what we intended. The intended behavior is that the bold upper-right date is the date the page was created, and so it would not change, even if the sorting changes. But we are intending to push an update this week that might address both your thoughts. The update will do two things:

1. In addition to the bold "Created" date in the upper right, it will display the "Submitted" date along with drafts that are of states "Awaiting review" and "Under review", and the "Declined" date along with drafts that are of state "Declined". So drafts that are somewhere in the AfC process will have two dates associated with them: "Created" AND ("Submitted" OR "Declined").

2. We were also talking about the second point you made -- about the sorting submissions by an inapplicable criterion. The logic we're going to try will allow sorting by "Submitted" date when the list is filtered to drafts that are "Awaiting review" and "Under review", and sorting by "Declined" date when the list is filtered to drafts that are "Declined".

Do you think those two changes make good sense? In the meantime, I'll try to figure out why that date is changing in the upper right. Thanks again for weighing in. -- MMiller (WMF) ( talk) 23:57, 6 August 2018 (UTC) reply
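The two rules above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's implementation: the `Draft` class and the function names are hypothetical, and only the state names and date labels are taken from the discussion.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# State names as used in this thread; the constant names are illustrative.
AWAITING = "Awaiting review"
UNDER_REVIEW = "Under review"
DECLINED = "Declined"


@dataclass
class Draft:  # hypothetical record, not the real feed's data model
    title: str
    state: Optional[str]  # None means the draft was never submitted
    created: datetime
    submitted: Optional[datetime] = None
    declined: Optional[datetime] = None


def dates_to_display(draft):
    """Rule 1: always show Created, plus Submitted OR Declined by state."""
    dates = {"Created": draft.created}
    if draft.state in (AWAITING, UNDER_REVIEW) and draft.submitted:
        dates["Submitted"] = draft.submitted
    elif draft.state == DECLINED and draft.declined:
        dates["Declined"] = draft.declined
    return dates


def allowed_sort_keys(state_filter):
    """Rule 2: only offer a sort criterion that applies to the filter."""
    keys = ["Created"]
    if state_filter in (AWAITING, UNDER_REVIEW):
        keys.append("Submitted")
    elif state_filter == DECLINED:
        keys.append("Declined")
    return keys


def sort_feed(drafts, state_filter, sort_key, oldest_first=True):
    """Filter to one state, then sort by an applicable date."""
    if sort_key not in allowed_sort_keys(state_filter):
        raise ValueError(f"{sort_key!r} does not apply to {state_filter!r}")
    visible = [d for d in drafts if d.state == state_filter]
    attr = sort_key.lower()
    # Fall back to the creation date if the chosen date is missing.
    return sorted(
        visible,
        key=lambda d: getattr(d, attr) or d.created,
        reverse=not oldest_first,
    )
```

Using the dates Vexations reported for Test Draft 004 (created 11 July 2018, declined 12 July 2018), `dates_to_display` would return a "Created" and a "Declined" entry, and a later ordinary edit would change neither. Sorting "Awaiting review" drafts by "Declined" date raises an error instead of returning an arbitrary order.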

Yes, that makes sense. You're (read: you and your team) on top of things. It's hard to tell from a beta version of software how much thought has already gone into an unfinished product. Vexations ( talk) 00:09, 7 August 2018 (UTC) reply

@ Vexations: just to follow up, that issue you found about the changing date in the upper right is filed here, and we're working on it now. -- MMiller (WMF) ( talk) 19:09, 7 August 2018 (UTC) reply

@ Vexations: the issues you identified have been fixed in Test Wiki. If you get a chance, I hope you can take a look. We're not 100% happy with how the dates are laid out in the feed, and we're open to suggestions. Thank you! -- MMiller (WMF) ( talk) 15:55, 10 August 2018 (UTC) reply

Seems good to me; I think the default should be to show unreviewed drafts though; being able to review declined drafts could be useful but in general showing unreviewed drafts seems like a sensible default. (but this is minor since your choices are saved..) Galobtter ( pingó mió) 17:35, 7 August 2018 (UTC) reply

Thanks for testing, Galobtter. That issue about default settings is something we had been considering. It's filed here, and we're now planning to work on it in our team's next sprint. -- MMiller (WMF) ( talk) 19:09, 7 August 2018 (UTC) reply

@ Galobtter: I just wanted to follow up and let you know that this is now complete in Test Wiki. The default AfC filters will be "Awaiting review" and sorted by oldest submitted date, so that those who have been waiting for review longest are at the top of the feed. Thanks for bringing it up. -- MMiller (WMF) ( talk) 23:15, 27 August 2018 (UTC) reply

I have the same comments as Vexations about dates. I see that this has been addressed and I'm happy with it. I will primarily use the Awaiting review filter sorted by oldest submission first. But actually, until you have some copyvio scoring, I don't have a reason to use this new interface and an unresolved reason not to - see the reviewer conflict discussion above in #Update 2018-04-25. ~ Kvng ( talk) 17:24, 12 August 2018 (UTC) reply

@ Kvng: thanks for checking the feature out. I understand what you're saying about copyvio, but I'm also hoping you'll find the ORES scores useful. Those should be testable within the next week or so. Regarding the reviewer conflict discussion, I have a couple thoughts/questions:

The "Awaiting review" filter will exclude any drafts that are "Under review", which happens when the reviewer uses the AFCH to mark them as such. I know you said you don't tend to mark drafts as under review, but you at least won't be opening any that other reviewers have marked.
How do you currently avoid conflicting with other reviewers who are also selecting the oldest drafts to review? Is it by choosing a few drafts at random? How often do you end up conflicting with other reviewers with your current approach?

Thank you! -- MMiller (WMF) ( talk) 23:39, 13 August 2018 (UTC) reply

@ MMiller (WMF): I am aware of the "Under review" feature. It is an unwanted/unnecessary extra step and it only works perfectly if all reviewers use it. Since it is pretty buried in the current AFCH interface, my assumption is that few reviewers are currently using it. So, even if I use it, there may still be problems for me due to others not using it. I will "win" in these cases but really, no one wins if we're stepping over one another.

My current MO is to review random articles from the second oldest by-week category. I haven't actually had problems with conflicts in all the reviews I've done over the years. Perhaps there is actually no problem. Perhaps I have solved the problem with my dodging through the backlog. ~ Kvng ( talk) 00:12, 14 August 2018 (UTC) reply
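For anyone curious, the "random draft from the second-oldest week" approach described above could be sketched like this. It is only an illustration: the function name is hypothetical, and grouping by ISO week is an assumption standing in for the by-week categories.

```python
import random
from collections import defaultdict
from datetime import date


def pick_from_second_oldest_week(submissions, rng=random):
    """Pick a random title from the second-oldest weekly bucket.

    `submissions` is a list of (title, submitted_date) pairs. Grouping by
    ISO year and week is an assumption made for this sketch.
    """
    buckets = defaultdict(list)
    for title, submitted in submissions:
        iso = submitted.isocalendar()
        buckets[(iso[0], iso[1])].append(title)
    weeks = sorted(buckets)
    if not weeks:
        return None
    # Skip the very oldest week, where other reviewers are likely working.
    target = weeks[1] if len(weeks) > 1 else weeks[0]
    return rng.choice(buckets[target])


queue = [
    ("Draft A", date(2018, 7, 2)),
    ("Draft B", date(2018, 7, 10)),
    ("Draft C", date(2018, 7, 11)),
    ("Draft D", date(2018, 7, 20)),
]
choice = pick_from_second_oldest_week(queue)  # "Draft B" or "Draft C"
```

Skipping the oldest bucket and choosing at random within the next one is what keeps two reviewers from converging on the same draft, without anyone having to mark it as under review.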

@ Kvng: got it, thanks. It sounds in general that conflicts tend to be very rare, and I'm hoping that you won't have them even when using the New Pages Feed. And in the New Pages Feed, you will be able to scroll to a few random spots in the list and grab pages, instead of just choosing the top three. -- MMiller (WMF) ( talk) 00:50, 14 August 2018 (UTC) reply

I note that it is still not possible to filter only redirects or 'nominated for deletion' (or perhaps there are no redirects in the test list?) — Insertcleverphrasehere ^{(or here)} 21:45, 6 August 2018 (UTC) reply

@ Insertcleverphrasehere: thanks for this thought. Are you saying that the AfC side of the New Pages Feed would benefit from having those two filters, like the NPP side has? One of the reasons we did not prioritize the redirect filter for the AfC side is because of this conversation above, in which we discussed how there is a separate process for reviewing draft redirects. What do you think is best? -- MMiller (WMF) ( talk) 23:57, 6 August 2018 (UTC) reply

It doesn't work on the NPP side - you can't select just nominated for deletion or redirect. You must also choose reviewed or unreviewed and so it gives you tons of stuff that aren't redirects or deletion noms. Best, Barkeep49 ( talk) 00:09, 7 August 2018 (UTC) reply

Barkeep49 and Insertcleverphrasehere -- oh, I see what you're saying. I will look into it and try to figure out if there was a reason for that. -- MMiller (WMF) ( talk) 19:09, 7 August 2018 (UTC) reply

Barkeep49 and Insertcleverphrasehere -- we did some digging, and it does look like this is a bug, and that it's already been filed. Thanks for bringing it up again. We'll figure out if we can fix it on this pass, and I'll let you know. -- MMiller (WMF) ( talk) 23:39, 13 August 2018 (UTC) reply

Comments by Kudpung

I think it's basically looking very good. At least we now have a proper feed for AfC where everything can be done from a central control panel, and combining it with the New Pages Feed brings all quality control into one place - which also permits the more highly qualified New Page Reviewers to dig in and help out on the fly with the AfC backlog (notwithstanding, NPR still has its own backlog problems).

I assume that AfC reviewers will be able to select the NPR feed, but we must ensure that unless they hold the NPR right, they do not get access to the Curation tool. I don't believe, however, that they should be able to patrol their own moves of AfC drafts to mainspace.

I'm a little bit disappointed that my mock-up of the NPR preferences selector has been ignored and not even commented on. Although I realise that this current project is focused on AfC, I would have thought that as a gesture, the devs could have entertained those small but useful tweaks.

That said, and still about NPR, ( DGG isn't going to like this), but I think it's now time to allow patrolling of New Pages to be done only from the New Pages Feed and not from the old feed and Twinkle. Twinkle is fine for tagging older pages that are no longer in the feed, but for the strict purpose of curating new pages, it causes havoc with the stats and makes it very difficult to control who is doing what, and how well they are doing it. Kudpung กุดผึ้ง ( talk) 20:21, 8 August 2018 (UTC) reply

Indeed, I rely on the old feed as the only effective way of quickly scanning work -- the NPP display is much too slow. Each time I look, I typically keep looking until I find something I think dubious one way or another -- it may take a few hundred items till I find something that doesn't seem right, and this cannot be done with the new format. I also rely on it for finding articles on specific subjects I try to keep aware of that occur at low frequency, maybe once every week or so. This cannot be done with the new format, and my intuitive scanning is better than any subject arrangement we have either implemented or in prospect. This is analogous to what librarians call browsing. It isn't efficient, but it does things that cannot be done otherwise.

What will really suffer if we follow Kudpung's suggestion is not my own preferences, which I cannot expect others to accept as enough reason to keep a system going. It still is the only possible way of auditing performance.

And frankly, I don't give a damn about a few percent of error in the stats. It's the work that matters. The better argument is that it can be (and is) abused. The abuse has to be weighed against the auditing. It won't be open to abuse if the people who do like the new NPP keep up to date with the work. If they do, having this won't matter. But if they don't, then other approaches need to be encouraged also. DGG ( talk ) 00:34, 9 August 2018 (UTC) reply

And there's a principle: we have a raw feed of edits. It's conceptually an essential part of an open site, to let anyone see what is happening. The new pages feed is just a special selection from it. DGG ( talk ) 00:34, 9 August 2018 (UTC) reply

I'm not sure what Kudpung is proposing here. Is it to disable Twinkle for unreviewed pages? Patrolling is not done from Special:NewPagesFeed but from the Curation Toolbar. Twinkle does not have a Mark as patrolled function. The only other way of patrolling a page that I know of is to close the Curation Toolbar and use the [Mark this page as patrolled] link. Is the proposal to remove that link? If so, how does using that Mark this page as patrolled link instead of the Curation Toolbar skew the stats? Vexations ( talk) 01:49, 9 August 2018 (UTC) reply

@ Kudpung: thanks for trying it out and recording your thoughts! Here are some answers to your questions and some more questions for you:

AfC reviewers will be able to select the NPR feed, as the New Pages Feed has always been a public page accessible to anyone. They won't, however, have patrolling rights or access to the Page Curation toolbar.
I looked back again at your mockup, and I see that you included additional filtering criteria about citations, Twinkle, stubs, and a few others. I apologize for not commenting on those; we've been focused on adding drafts to the feed. I don't think we'll have the bandwidth to take those changes on, but it would be great to hear thoughts from other NPR reviewers on those ideas for future reference. -- MMiller (WMF) ( talk) 15:55, 10 August 2018 (UTC) reply

@ MMiller (WMF): I would make regular use of all the filters in Kudpung's mock-up (except Twinkle tags) and think they could all be useful. I'm sorry to hear that they can't be a part of this development. Is there a time anticipated when WMF will further develop NPP? If not, then despite this being AfC-focused, it would be nice if there could be some marginal improvements, especially in areas the team will already be making changes to (and thus be familiar with). Best, Barkeep49 ( talk) 16:05, 10 August 2018 (UTC) reply

As an example, please consider the new user Luftfall. I just looked at their history following a couple of AfD nominations and notice that they started editing with an AfD nomination as their second edit. How is that even possible when, even now, they don't seem to have auto-confirmed status? The edit has the tag "Visual edit: Switched" and doesn't seem to have been done by Twinkle. Is there some feature of the Visual Editor which lets new users nominate articles for deletion or what? Andrew D. ( talk) 07:36, 15 August 2018 (UTC) reply
@ Andrew Davidson: Auto-confirmed status hasn't got anything to do with dispatching articles for deletion (CSD/PROD/AFD) or editing at AfDs. Anybody, irrespective of whether he is a PagePatroller or not, can tag pages (either by manually adding the templates or through Twinkle) and/or initiate every deletion procedure. (The old-school procedure to start an AfD is described at this section and goes fine with both SE and VE.) From a technical perspective, new page patrollers have the sole additional authority to index a Wikipedia page, which then lets the search engines crawl it. ∯WBG ^converse 10:55, 15 August 2018 (UTC) reply

@ Winged Blades of Godric and Andrew Davidson: and that is precisely the fly in the ointment. 50% of the proposal for a New Page Reviewer right was originally intended to prevent newbies, for whom meta areas are a magnet, and other inexperienced editors from tagging pages for deletion. The other half was preventing them from passing new articles as OK for search engine indexing. Allowing these beginners to tag for deletion is where a lot of discouragement is meted out to new, but serious content creators. However, the community in its odd wisdom at the RfC insisted that they should retain this access, while Wikipedia remains the only user-editable website in the world that does not exercise such a control. Kudpung กุดผึ้ง ( talk) 02:59, 23 August 2018 (UTC) reply

Two questions:

Will the feed say Redirect when it's a redirect?
If an NP reviewer approves an AfC, will it go into the NP feed or will it be marked as reviewed?

Thanks, ^{Atsme 📞 📧} 03:18, 17 August 2018 (UTC) reply

@ Atsme: thanks for the questions.

Regarding redirects, the feed currently indicates redirects with the word "REDIRECT" in front of the page content. For instance, the feed might say, "REDIRECTViceroy of Shaan-Gan" in gray lettering with an item in the list. While that's not an ideal way to indicate it, we are thinking about how to improve the feed's filtering to make it possible to restrict the feed only to redirects here. However, all of this only applies to redirects in the main article space; not the draft space. That's because we decided here that we would keep draft redirects out of the feed entirely.
Regarding NP reviewers approving AfC drafts, I believe that yes, they will be marked as reviewed. That's because by approving a draft, the reviewer is simply moving it to the main article space, thereby creating the article, which should follow all their user rights as usual.

Does that answer your questions? Do you have any other thoughts or feedback on the changing feed? Thank you. -- MMiller (WMF) ( talk) 22:35, 20 August 2018 (UTC) reply

tweaks

1. For comments, the text of the comment should be copied onto the user page. Unless I'm mistaken, this should be easy. 2. For rejections, the color should be some shade of yellow or orange or light red, to give the proper message. This should be even easier. I have already found articles with two successive rejections, one day apart. DGG ( talk ) 19:39, 7 August 2018 (UTC) reply

@ DGG: thanks for weighing in. It sounds like you're talking about the formatting and wording around the templates used in AfC reviewing. This project has become all about the New Pages Feed itself, though it looks like some people are discussing the templates here. -- MMiller (WMF) ( talk) 16:10, 10 August 2018 (UTC) reply

drafttopic

I love the new ores scoring! Are there any plans to integrate drafttopic as well at some point? SQL ^{Query me!} 22:24, 16 August 2018 (UTC) reply

@ Halfak (WMF): how "ready" is drafttopic? It doesn't appear on this status page or anywhere else that I could find. -- Roan Kattouw (WMF) ( talk) 22:38, 16 August 2018 (UTC) reply

FWIW my last patch to User:SQL/AFC-Ores implemented drafttopic tagging. SQL ^{Query me!} 23:04, 16 August 2018 (UTC) reply

@ SQL: Very cool, thanks for the link! As you found, the production ORES service includes our first drafttopic model, but there are no plans as of yet to integrate in the UI. Here's another toy script to play with if you're feeling inspired, Adamw/DraftTopic.js Adamw ( talk) 21:32, 21 August 2018 (UTC) reply

Blank Articles

I admit I only have done a wee bit of RC patrol but I was surprised that when I blanked this it found no Predicted Issues. What besides vandalism are the issues ORES predicts? Best, Barkeep49 ( talk) 00:19, 17 August 2018 (UTC) reply

We trained the model on articles that were CSD'd for spam (blatant advertising), attack pages, and vandalism/blatant hoaxes. Everything else (good or subtly bad) gets grouped into "OK", which isn't really the best term. Maybe we should have named that class "not terrible". -- EpochFail ( talk • contribs) 23:07, 20 August 2018 (UTC) reply

This is a helpful explanation, but it is even more perplexing to me that a blank page doesn't trigger, as articles with no content definitely get CSD'ed. Best, Barkeep49 ( talk) 00:07, 21 August 2018 (UTC) reply

New Pages Feed

MMiller (WMF) This project has become all about the New Pages Feed itself,... , absolutely, and that's why it is imperative that the requests at WP:PCSI be addressed urgently rather than persistently relegating them to the Wishlist, which is for non-urgent community comfort and convenience gadgets. If the WMF can spend some time on making these enhancements for AfC, they can spend some time and money on the even more pressing issues. Let's not forget that the New Pages Feed/Curation was developed in collaboration with Erik Möller and Jorm, with technical work by Kaldari, largely as a result of my original initiative, and although still incomplete, I have always maintained that their accomplishment was the best thing for quality control since sliced bread, but it needs to be used and the accredited reviewers encouraged to use it properly. Since the mass exodus of C staff, the WMF as it now stands, with the exception perhaps of ORES, does not appear to consider these critical issues to be of major importance. I do realise of course, however, that this is possibly not within your specific sphere of influence. Kudpung กุดผึ้ง ( talk) 01:19, 21 August 2018 (UTC) reply

But the community can have a role in encouraging the foundation! DGG ( talk ) 22:00, 21 August 2018 (UTC) reply

The foundation is so bad at allocating funds that they feel no need to spend any developer time on empowering the one group of editors that is the first firewall against unwanted content. Unwanted content that, if left unfiltered, damages the reputation of the encyclopedia and throws into doubt the validity of all of the other articles, no matter how well written and researched. I actively discourage any of my friends or acquaintances from donating to Wikipedia. They don't need our money and I don't trust them with it. [7] [8] [9] Their lack of willingness to maintain tools that they built (built in such a way that we can't even fix them ourselves) is just another symptom of this. — Insertcleverphrasehere ^{(or here)} 22:39, 21 August 2018 (UTC) reply

It's the age old problem, DGG, and despite their occasional conciliatory words and minor actions (such as for example this current development), the WMF people with whom the community is forced to negotiate will decide among themselves what is done with our money, raised through the power of volunteer efforts at building and policing this encyclopedia. In short, the WMF cannot be 'encouraged'. They can only be forced to concede through very strong volunteer action such as, for example, the community decision to enact ACTRIAL if nothing were done. Kudpung กุดผึ้ง ( talk) 02:40, 23 August 2018 (UTC) reply

Like, I appreciate the effort being made here; but to get anything done for NPP we basically had to piggyback on the WMF's plan to help out AfC. It's just a bit frustrating how little value the WMF seems to place on New Page Patrol. A while back I nominated the top 10 reviewers of the year for T-shirts over at the Merchandise giveaways page, but these editors are generally gnomish, and so don't have the popularity needed on that page. Because popularity is the only thing that page cares about, only 2 of the top 10 reviewers actually received gifts. For users that did thousands of reviews over the last year, it is probably an extra slap in the face to be nominated for a gift and then for the WMF to turn them down. — Insertcleverphrasehere ^{(or here)} 22:13, 23 August 2018 (UTC) reply

The Foundation, as would be expected, is primarily interested in spending money to support the broader missions of the movement. As central organizations of any kind do, it is not closely concerned about the narrow issues of the quality of the actual operations of the actual system (analogous to the real -- rather than pretended -- interest of any university administration in the quality of what actually goes on in the classrooms). As would be expected of any organization, when it does get involved in operations, it will prefer those conceived by its own staff rather than the volunteers, especially those conceived of by its higher level administrators. This is the environment in which we will increasingly work--as WP grows, it will increasingly resemble a conventional organization. (It was widely doubted ten years ago whether a decentralized volunteer organization could grow to anywhere near our present size without collapsing; it is more likely that there is a point beyond which it will either A, disintegrate, or B, transform into a more conventional top-down structure. We may have thought we could avoid either fate, but we are falling into B. It's still better than A.)

There are many ways for us to go forward nonetheless (and for now, I will just discuss those that I think possible within the current structure):

1) We can make allies of the programmers and other central staff who actually do the work-- encouraging them to take our needs into account within their assigned work, and to try to devote some informal effort into our needs.

2) We can continue to work with higher level staff when we have opportunities--and continue to maintain our values and priorities when doing so.

3) We can make use of the input we have into the Board--and try to keep them supportive of us after they are chosen

4) We can continue to make our case using existing internal means of communication--and try to encourage new ones to the extent we can support them

5) We can try to devise ingenious ways to improve work that are within our own control-- and develop further capabilities for doing so

6) We can continue improving the quality of our own internal work in maintaining the encyclopedia -- and recognize that this requires continuing attention

7) We can continue to recruit newcomers -- especially those who will have an interest in producing and improving quality content. DGG ( talk ) 06:05, 24 August 2018 (UTC) reply

Copyvio plan feedback

Hi all! I just posted an update on the project page about our plans for adding automated copyvio detection to the New Pages Feed, which is the third (and final) major part of this effort. I'm creating this section as a place for everyone to post feedback. We definitely want to make sure we're building something useful. Thank you! -- MMiller (WMF) ( talk) 00:45, 23 August 2018 (UTC) reply

I would be curious if your team could present more information about the equal effectiveness. This has not been my personal experience but I admit that there are any number of reasons why I could be wrong (an unusual subset of pages, I've noticed incorrectly, etc). What is the plan to enable testing in the AfC and NPP feeds on test wiki before bringing it over to en? Best, Barkeep49 ( talk) 00:57, 23 August 2018 (UTC) reply

@ Barkeep49: thanks for the quick thoughts. I did want to quickly say that it's not that our analysis makes it definitively look like the services are equally good as each other, but in a quick analysis, it seems that neither is obviously better than the other. We also hear anecdotally from the experience of those using CopyPatrol that Turnitin does a good job. What have you found in your experience? Regarding the question about testing, I'll bring that up with our team and get back to you. -- MMiller (WMF) ( talk) 01:11, 23 August 2018 (UTC) reply

@ MMiller (WMF): I was interested in the archiving of websites as a Turnitin plus, because the two areas of weakness I've noticed are with information tied to recently updated pages (which seems to happen particularly in cases of paid editing) or where there is an obscure corner of the web which is cited as a reference in the article. These two scenarios are where it has seemed to me that Earwig shines and Turnitin does not. When I was using both I did not find a time where Turnitin found an issue and Earwig did not, and so I've turned to using Earwig exclusively. Best, Barkeep49 ( talk) 20:50, 23 August 2018 (UTC) reply

@ Barkeep49: thanks; that helps add some color. I wouldn't be surprised if Google does cover the web better than Turnitin, and we'll have to continue to find out as we add Turnitin to the New Pages Feed. One thing that Diannaa pointed out to me is that Turnitin can find copyvios from web pages that have been taken down already, which is one of its advantages. Regarding your question about testing, unfortunately we probably won't be able to truly test copyvio in the Test Wiki for two reasons. Technically, we won't be able to wire CopyPatrol to look at edits to Test Wiki -- to do that without adding noise to the work of those using CopyPatrol directly would be an amount of engineering work that would probably not be worthwhile. And then there would be the issue of adding "test" copyright violations to the Test Wiki -- I don't think we would want to deliberately add violating text there. But you can, of course, get a feel for Turnitin by using CopyPatrol filtered to "Drafts only". Given these issues, the team will continue to figure out what we can do with respect to testing on Test Wiki. -- MMiller (WMF) ( talk) 23:03, 27 August 2018 (UTC) reply

One of the problems, MMiller (WMF), is over-enthusiastic New Page reviewing, particularly in so-called backlog drives. There are users who hover over the feeds waiting to tag or patrol articles the instant they are saved/published. They do this faster than any CorenBot, Earwig, CopyPatrol, or whatever, can do their searches (median response time for iThenticate is 4.83s, while for EarwigBot it is 11.95s). Reviewers sometimes patrol as fast as 4 pages a minute, and users such as DGG and I know that this is not only not possible, but this is faster than these bots can do their work, and certainly faster than any conscientious reviewer can fully check a page manually. In my experience these copyvio detector bots often take 20 seconds or longer to work.

One solution would be to delay the appearance of new articles - in all feeds - for a minute or so until the bot(s) have functioned. This does not affect AfC because drafts wait much longer to be reviewed, and they are not generally likely to be moved to mainspace before a copyvio check has (hopefully) been done by the reviewer. Kudpung กุดผึ้ง ( talk) 03:20, 23 August 2018 (UTC) reply

That's an interesting idea. I think it's been mentioned before in the context of preventing premature reviews. I doubt the WMF would have any objections to the idea, but it could be controversial with other reviewers. Would be good to hear other opinions. Ryan Kaldari (WMF) ( talk) 20:21, 23 August 2018 (UTC) reply

Reviewing should be about quality, not speed. Even articles with blatant issues can survive being on the website for a few minutes if a greater portion of problem articles are dealt with better (and in the end faster, thanks to the new notices with copyvio and ORES). Best, Barkeep49 ( talk) 20:44, 23 August 2018 (UTC) reply

I just want to state here: Often when I am reviewing I will open up a number of articles, start the Earwig tool running on all of them, then read through each of them while waiting on the copyvio report. I will wait to mark them as reviewed until checking the Earwig result, and this can often result in 'bursts' of reviewing which can make it look like I have reviewed a half dozen articles in a very short time. There are also situations where someone submits several short stubs in a row which are very similar which can be reviewed very quickly. There are quite a few ways in which over 4 articles a minute might appear in a reviewer's logs and be totally fine. I am not aware of the specific string of articles being referred to here though, and substandard/hasty reviewing does happen, but I just wanted to qualify the statement that "4 pages per minute isn't possible". — Insertcleverphrasehere ^{(or here)} 22:01, 23 August 2018 (UTC) reply

This is also my method (and one I believe I learned from you). I assumed good faith that it wasn't meant for editors whose approach is like ours. Best, Barkeep49 ( talk) 22:38, 23 August 2018 (UTC) reply

My understanding is that the issue is with articles that are reviewed too soon after they are created, not with reviewers reviewing them too quickly per se. For example, sometimes articles are reviewed before the creator has had a chance to flesh them out and they might still be in the midst of writing the article, but saved an early version for some reason. Ryan Kaldari (WMF) ( talk) 23:38, 24 August 2018 (UTC) reply

@ Ryan Kaldari (WMF): The above is not about how soon articles are reviewed, that's a separate discussion, but how long a reviewer takes in doing the review. As to reviewing new articles before they're ready, I don't tend to patrol that end of the queue so I can't speak with any authority on how those are reviewed. However, as the flow chart lays out, if an article subject can be identified it should be reviewed on the basis of its inherent notability, not on completeness of notability in the article. I think many reviewers, including myself when I do patrol that way, wait 10 minutes before reviewing even articles that aren't blank. Exceptions can be made in certain cases of course (e.g. copyvio and BLP). Best, Barkeep49 ( talk) 02:00, 25 August 2018 (UTC) reply

Kudpung, Barkeep49, Insertcleverphrasehere: thanks for bringing up that issue. We'll talk amongst the team about the idea of delaying the articles, or indicating which ones have not yet been checked for copyvio. This hasn't been a problem with ORES because the ORES scores return almost instantly, but copyvio will be done with a different service. We're going to look into how quickly the CopyPatrol results are returned in this Phabricator task. -- MMiller (WMF) ( talk) 23:03, 27 August 2018 (UTC) reply

Kudpung, Barkeep49, Insertcleverphrasehere: just to follow up on this, the volunteer who maintains the bot that backs CopyPatrol helped us understand that the bot usually takes seconds, but could theoretically take minutes to scan new pages. Given what you have said about reviewers grabbing new articles too quickly, this could be an issue. But our team thinks that the most prudent thing is to get something out into the hands of reviewers, and then for us to see what kinds of problems crop up -- we don't want to solve a problem that may not exist. We will keep an eye on this. -- MMiller (WMF) ( talk) 21:13, 4 September 2018 (UTC) reply

I am very excited for this development. Questions: how long does it take for the bot to run? How many bytes is over "a certain number" that will trigger a re-check? Delaying appearance in the New Page Feed wouldn't stop people reviewing articles that had previously been checked as OK, but on which CopyPatrol was currently being run for a new revision. Would it be possible for the new page feed to simply state that the copyvio bot is running? It could state something like: "Copyright check being run, PLEASE WAIT". This would signal reviewers from the new page feed not to click on those articles until the CopyPatrol check was completed. — Insertcleverphrasehere ^{(or here)} 22:01, 23 August 2018 (UTC) reply
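As a rough illustration of the suggestion above, here is a minimal Python sketch of how a feed entry could report a check that is still running. The `CopyvioCheck` states and the `copyvio_notice` function are hypothetical names invented for this example; nothing here reflects how PageTriage or CopyPatrol is actually implemented.

```python
from enum import Enum


class CopyvioCheck(Enum):
    """Hypothetical states of the copyvio check for a page's latest revision."""
    PENDING = "pending"   # the bot has not yet returned a result
    FLAGGED = "flagged"   # the bot reported a potential violation
    CLEAR = "clear"       # the bot returned without flagging anything


def copyvio_notice(status: CopyvioCheck) -> str:
    """Return the text a feed entry could show for a given check state."""
    if status is CopyvioCheck.PENDING:
        # Wording taken from the suggestion in this thread.
        return "Copyright check being run, PLEASE WAIT"
    if status is CopyvioCheck.FLAGGED:
        return "Potential issues: Copyvio"
    # A clear result shows nothing, so reviewers do not read it as a guarantee.
    return ""
```

A page whose new revision is being re-checked would move back to the pending state, which covers the case of a previously clean page receiving a large new edit.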
@ Sbisson (WMF): Do you happen to know the edit size threshold that is being used? Or how long a typical Turnitin API request takes? Ryan Kaldari (WMF) ( talk) 23:42, 24 August 2018 (UTC) reply

@ Insertcleverphrasehere: I'm glad you're excited, and thanks for the questions. We'll double check the number of bytes that constitutes the threshold for CopyPatrol and get back to you. And we're going to look into how quickly results will be returned after a page is published in this Phabricator task, which will help us decide whether we need to indicate that the check is being run, or something else. -- MMiller (WMF) ( talk) 23:03, 27 August 2018 (UTC) reply

@ Insertcleverphrasehere: CopyPatrol has a somewhat complex set of logic to determine exactly which revisions it scans, but the short answer is that it scans revisions over 500 bytes, which is a few sentences. The longer answer involves logic to strip out bytes related to templates and formatting, etc. One change we're planning on making to the logic will cause it to scan all new pages, regardless of size, since new pages, such as this one, can be stubs under 500 bytes. -- MMiller (WMF) ( talk) 20:26, 28 August 2018 (UTC) reply
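The size rule described above can be sketched roughly in Python as follows. Only the 500-byte figure comes from this discussion; the `strip_markup` helper is a deliberately crude stand-in for CopyPatrol's real template and formatting logic, and the function names and the `scan_all_new_pages` switch are assumptions made for illustration.

```python
import re

SIZE_THRESHOLD_BYTES = 500  # figure quoted in this discussion


def strip_markup(wikitext: str) -> str:
    """Crude stand-in for the real stripping of templates and formatting."""
    previous = None
    while previous != wikitext:
        previous = wikitext
        # Remove innermost {{...}} templates; repeat so nested ones unwind.
        wikitext = re.sub(r"\{\{[^{}]*\}\}", "", wikitext)
    # Remove bold/italic quote markup such as '' and '''.
    wikitext = re.sub(r"'{2,}", "", wikitext)
    return wikitext.strip()


def should_scan(added_text: str, is_new_page: bool,
                scan_all_new_pages: bool = False) -> bool:
    """Decide whether a revision is large enough to send for a copyvio check."""
    if is_new_page and scan_all_new_pages:
        # The proposed change: new pages are scanned regardless of size.
        return True
    size = len(strip_markup(added_text).encode("utf-8"))
    return size > SIZE_THRESHOLD_BYTES
```

With `scan_all_new_pages` left at `False`, a stub under 500 bytes is skipped, which is the behaviour the thread goes on to debate.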

In my experience very short stubs like the one that you brought up are very rarely verifiable as copyvios. Often the single sentence in the article might also represent a simple and natural language descriptor. — Insertcleverphrasehere ^{(or here)}

@ Insertcleverphrasehere: thanks for making this important point. This is a good case to not scan very small pages. We'll leave them out for now, and keep an eye on whether that behavior should be changed in the future. -- MMiller (WMF) ( talk) 21:13, 4 September 2018 (UTC) reply

I'm not sure if this was the right choice, and I'd like to know what the cutoff point is likely to be (for single sentence articles, this might be right, but a 100% copyvio of several sentences is a big problem, and might not be identified if the check isn't run). Changed in the future!? Fat chance of that happening. The community tech team hasn't listened to any of our suggested changes to the toolset except for blatant bugs (and in many cases have ignored those too). — Insertcleverphrasehere ^{(or here)} 21:22, 4 September 2018 (UTC) reply

Insertcleverphrasehere and Kudpung -- I wanted to revisit this question about the 500 byte threshold. You can see on this Phabricator task that Eran, who wrote the bot behind CopyPatrol, chose the 500 byte threshold because of making sure there wouldn't be too many false positives. I'm tagging Diannaa because she might have some wisdom on whether this threshold has served patrolling purposes well. -- MMiller (WMF) ( talk) 21:13, 20 September 2018 (UTC) reply

Sorry I don't have any information on whether or not that's an appropriate threshold. — Diannaa 🍁 ( talk) 22:01, 20 September 2018 (UTC) reply

I agree that one simple partial fix is to delay the appearance of new articles for a minute or two. This will not only give the bots time to run, but will prevent the most problematic reviewing of articles while they are still being written. In general rapidity of reviewing is a balance--some junk can and should be spotted immediately, but for most articles it's better to wait 5 or 10 minutes. An experienced reviewer who is paying attention knows how to distinguish, but that does not describe the majority of reviewers. The method Insertcleverphrasehere uses I sometimes use also, but only for two or so articles. 4 or 5 articles a minute is possible in special cases, but I see altogether too many problems from people who attempt it unwisely.

I'm also concerned about establishing the bots as a perfect solution--they are a good first screen. Everything they find as a problem needs to be checked manually, and everything they don't find as a problem that to an experienced eye looks like it might be copyvio needs to be checked manually by someone who knows how to check in the particular subject--I'd suggest more frequent use of the "copypaste" template. We need to teach beginners to leave anything they aren't sure of for someone more experienced. And even for the most experienced, it is better not to review articles in subjects one does not understand. DGG ( talk ) 02:05, 25 August 2018 (UTC) reply

Thanks, DGG. In designing how we want to display the copyvio information, we took the point of view that the bot is a guide, not a rule, which you and Kudpung pointed out a few weeks ago in this conversation. That's why the feed will say that a page has "Potential issues: copyvio", instead of saying the exact score. It will also not say when CopyPatrol has not found an issue. That's because we are less confident in negative results from Turnitin than we are in positive results. In other words, we want the feed to be saying to reviewers, "look extra closely at this one." Hopefully this will not cause reviewers to rely too heavily on the bot. -- MMiller (WMF) ( talk) 23:03, 27 August 2018 (UTC) reply

Lining up the new pages in tabs is a system I also sometimes use. However, the pages that load in the browser tabs will only show the state of the article at the time they were loaded in the browser. Even when loaded in the browser, they still need the full attention. I doubt, therefore, that all but the simplest ones - low hanging fruit - can be fully scrutinised in seconds.

I do not doubt for a moment DGG's claim that for certain reasons he prefers to use the old Special:NewPages feed, but IMO this should not be used by less experienced reviewers, who will appreciate the advantages of the more complete meta information displayed in the Special:NewPagesFeed, from which immediate information is shown that could highlight possible UPE, COI, and the upcoming COPYVIO information from ORES. Kudpung กุดผึ้ง ( talk) 05:54, 25 August 2018 (UTC) reply

I altogether agree that it's better for beginners to use the curation toolbar and its associated feed. Advising them to do otherwise is not helpful. Indeed, even for the experienced, in reviewing an individual article it has many advantages, and I use it about 2/3 of the time. The old NPP feed has two--easier scanning for things that just don't look right, and checking on proper handling of autopatrolled and patrolled articles--I especially appreciate it as a guide for identifying regular proven competent contributors in a special field who should be given autopatrolled status. None of those are appropriate except for experienced users; when I was starting out 11 years ago, I would greatly have appreciated the new system, and it would have helped me to learn more quickly and accurately. (But psychologically I'm one of those people who prefers to look at raw data, or as close to raw data as feasible, rather than after it's been normalized or manipulated. That's what my teachers taught me to do, and that's what I taught my students. I know no other way to develop a critical judgment independent from others' prior input.) I'd suggest that experienced reviewers, which pretty much describes everybody here, give it at least an occasional trial. You'll be surprised what you can pick up--I usually look at it until I find at least one anomaly. DGG ( talk ) 06:23, 25 August 2018 (UTC) reply

Production rollout discussion for adding drafts to the feed

Hi all -- I just posted an update on the project page about our team's plan to roll out the first part of this work into English Wikipedia on September 17. I'm excited to be getting some of the new features out into the real world. Let's discuss here. I'm explicitly tagging Primefac, Legacypac, Kudpung, Insertcleverphrasehere, Kvng, Vexations, and TonyBallioni but definitely want thoughts from everyone. I'll also post on NPP talk and AfC talk tomorrow. -- MMiller (WMF) ( talk) 21:26, 30 August 2018 (UTC) reply

The testwiki hasn't been updated in a while; will it be the version on there, or one with additional features? It is important that the update doesn't break the newpagesfeed, so it would be good to be able to test whatever version goes live beforehand. I note that there are still a lot of tasks unresolved on what I presume is the main phab task. — Insertcleverphrasehere ^{(or here)} 21:53, 30 August 2018 (UTC) reply

@ Insertcleverphrasehere: thanks for bringing that up.

The version that we'll deploy is actually going to be a little behind where Test Wiki is now, because Test Wiki currently contains the ORES models, which we won't roll out with this first deployment. What we'll be deploying is only the toggle to switch between the NPP and AfC sides of the feed. And then for reviewers that select the AfC side, they will see all pages in the draft space, filterable by their state in the AfC process, and sortable by their created, submitted, and declined dates. Therefore, the only change for NPP reviewers will be the presence of the toggle to go to the AfC side.
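To make the filtering and sorting described above concrete, here is a small Python sketch. The `Draft` record, the `afc_feed` function, and the state strings in the usage example are all hypothetical; only the idea of filtering by AfC state and sorting by created, submitted, or declined date comes from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Draft:
    """Hypothetical record for one page in the draft space."""
    title: str
    afc_state: str                  # state names are illustrative only
    created: date
    submitted: Optional[date] = None
    declined: Optional[date] = None


def afc_feed(drafts: List[Draft], state: Optional[str] = None,
             sort_by: str = "created") -> List[Draft]:
    """Filter drafts by AfC state and sort them by one of the three dates."""
    if sort_by not in ("created", "submitted", "declined"):
        raise ValueError(f"unsupported sort key: {sort_by}")
    selected = [d for d in drafts if state is None or d.afc_state == state]
    # Drafts that lack the chosen date (e.g. never declined) are listed last.
    return sorted(
        selected,
        key=lambda d: (getattr(d, sort_by) is None,
                       getattr(d, sort_by) or date.min),
    )
```

For example, `afc_feed(drafts, state="declined", sort_by="declined")` would list only declined drafts, oldest decline first.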

Though it won't be possible to test the exact New Pages Feed that will appear with this deployment (because the ORES models are now in Test Wiki), we feel confident that the various components have been well-tested. I'll also mention that the way we chose the September 17 date is that it is a day where our team will be able to devote full attention to the deployment and will be able to troubleshoot issues right away.

Regarding Test Wiki not having been updated in a while, we have some layout updates that have been in there for a few days that change where and how the ORES scores and the AfC dates are displayed. I had not announced those yet as we are still testing them ourselves, and because I have been focusing on planning out this initial deployment to production. I'll post about them in another day or so!

Regarding the unresolved tasks, everything related to the parts we'll be deploying is actually resolved -- those tasks are the ones under this task (which is nested inside the main Phab task you linked). The only tasks still open under that one are the two related to the deployment itself ( Deploy PageTriage AfC to production and Run backfill script).

-- MMiller (WMF) ( talk) 22:54, 30 August 2018 (UTC) reply

@ MMiller (WMF): I certainly think the phased roll-out strategy is wise but am I correct in understanding that the COPYVIO piece won't be put on TestWiki prior to live deployment? That seems like a potential missed opportunity. Best, Barkeep49 ( talk) 23:43, 30 August 2018 (UTC) reply

@ Barkeep49: thanks for the question. We are planning to put copyvio on Test Wiki (in some form) before deploying to production. We're currently figuring out the best way to do that, because of a few technical challenges. For instance, CopyPatrol doesn't currently scan revisions in Test Wiki, and we're not sure whether it will be worthwhile to wire it to do so. But I think that we'll be able to have a testable experience for reviewers to try. In case you're interested in following along, here is the Phabricator task where we're working on it. -- MMiller (WMF) ( talk) 23:57, 30 August 2018 (UTC) reply

Copyvio testing feedback

Hi all! I just posted an update on the project page about testing copyvio detection in Test Wiki. I'm creating this section as a place for everyone to post feedback. -- MMiller (WMF) ( talk) 18:28, 6 September 2018 (UTC) reply

Hi, MMiller (WMF) - I just gave it a whirl, and love it!! One tiny glitch I came across (or it could be something I did) is that the Earwig tool isn't available for articles that have previously been declined - see this draft. Is there a workaround? ^{Atsme 📞 📧} 22:13, 6 September 2018 (UTC) reply

Thanks, Atsme, I'm glad to hear it's looking good. We'll look into that Earwig question. I also want to ping Insertcleverphrasehere, Kvng, Legacypac, Barkeep49, Kudpung, Primefac, and Vexations for testing feedback, since this is the most substantial change that will affect both the NPP and AfC experiences with the New Pages Feed (I know I owe several of you answers on a few questions in a few places. I'll work on getting back about those). -- MMiller (WMF) ( talk) 21:06, 10 September 2018 (UTC) reply

@ Atsme: I was just trying to reproduce issues with the Earwig tool, but I wasn't able to. Are you still seeing the issue? When you say it isn't available, what do you mean? Is it that you enter "Draft:Wandile Sihlobo" and don't get results? I just tried that page title and did get normal results. -- MMiller (WMF) ( talk) 21:02, 20 September 2018 (UTC) reply

MMiller (WMF) - just read your post - will ping you with results in a couple of days - first let me do a bit of experimenting to see if I can reproduce the issue (if it still exists), or possibly find others - most of all to see if it was something I did. ^{Atsme ✍🏻 📧} 21:27, 20 September 2018 (UTC) reply

MMiller (WMF) - Is the copyvio feature not ready yet? I don't see any sign of it anywhere in the NPP feed, or AfC. Also, I noticed that the curation tools in the right margin don't show up for certain articles - it happened to me on 2 different articles - Anna Wong (artist), which is still in the feed, and Ellen Carey, which I moved back to draft. I tried clicking on the article name while in the queue to see if that would help, and then I tried clicking on Review but that didn't help, either - still no curation bar. I tried it 3 different times with each article. ^{Atsme ✍🏻 📧} 05:23, 21 September 2018 (UTC) reply

Hi Atsme. No, copyvio detection is not yet deployed to production. We've only deployed the first part of this three-part project: getting the drafts into the feed with an "Articles for Creation" switch. The second part is ORES scores, which will be deployed on Oct 1, and the third is copyvio detection, which will be deployed on Oct 15. You can test copyvio detection in Test Wiki -- here's how to do that. Please let me know if you have any issues. Regarding your not seeing the Page Curation toolbar on some articles, I'm wondering if that is a similar issue to what another user reported here. I will bring it to an engineer's attention, and it would be great if you could go to that page and answer the questions that I asked the other user. -- MMiller (WMF) ( talk) 01:37, 22 September 2018 (UTC) reply

I still have the issue (which sounds the same as what Atsme is describing), but clicking the curate button resolves it every time, so I can live with it. Best, Barkeep49 ( talk) 01:46, 22 September 2018 (UTC) reply

Barkeep49 and Atsme -- we think we have a fix for this that is being reviewed now. I'll let you know when it's deployed so you can see if it solves the issue. Thanks for following along. -- MMiller (WMF) ( talk) 17:14, 24 September 2018 (UTC) reply

✅ Thank you! ^{Atsme ✍🏻 📧} 17:21, 24 September 2018 (UTC) reply

There were a lot of questions above about how COPYVIO would be displayed - for instance, whether there would be a numerical or qualitative descriptor to describe potential problems. At this point, am I correct that it either displays a copyvio report or it doesn't? If that's correct, this is considerably less useful than what had been described before, as a particular section of a longer article can have extensive copyright issues needing addressing while the article as a whole is fine. I hadn't left feedback before because I could just be missing how it's going to work when it's not trying to get around the fact that we don't want copyright issues in testing it. Every article at NPP and AfC should be examined with human eyes for copyright problems, but this implementation seems to suggest to me that only a limited subset will be examined through the feed. If that's correct, it's something, I guess, but a disappointment compared to what had been discussed earlier. Best, Barkeep49 ( talk) 21:18, 10 September 2018 (UTC) reply

@ Barkeep49: thanks for the questions. Here are some answers and more questions:

All pages will continue to be listed in the feed, and the ones that have an edit that scores 50% or above in CopyPatrol will have the indicator "Potential issues: Copyvio" with their entry in the feed. We decided to display the simple binary indicator instead of the actual score because of conversation from several community members on this page ( such as here) about how to make it so that reviewers don't take the scores too literally and read too far into them. It looks like the binary indicator might be the right balance.
With respect to the question about particular sections of an article having issues -- I think I know what you mean, in that it's possible for portions of a page to slip through the cracks. This actually may be an advantage of CopyPatrol that we'll be inheriting with our implementation, because CopyPatrol checks each diff separately. So while a small part of a large diff may slip through the cracks, if a copyvio is added as a small diff on a large page, it should be caught. And that page will then be tagged in the New Pages Feed, because it will tag a page where any of its diffs have potential violations.
I see what you're saying about only a limited subset of checks being done through the feed. It would be nice for every page in the feed to have a link for inspecting its copyvio issues. But the issue is that the vast majority of pages don't have any issues at all, and so there is nothing to inspect as far as CopyPatrol is concerned. In other words, if a page doesn't have a link to CopyPatrol in the feed, it means CopyPatrol didn't find anything suspicious in it. There is a potential conversation about the thresholds in CopyPatrol, currently set at 50%. But in discussions with Diannaa and Eran, who are the experts on CopyPatrol, it looks like the 50% threshold is a safe place in terms of false positives and false negatives. Given all this, I think that reviewers should follow their normal workflows for investigating copyvio, whether that means using Earwig's Tool or googling portions. But if the page they're reviewing has the flag in the feed, it will make it that much faster to go dig into the details in CopyPatrol. And if there are reviewers who want to hunt down and remove copyvio as a primary activity, they'll be able to filter the feed to just those pages that are more likely to have violations.

I'm hoping this helps answer some of your questions. What do you think? -- MMiller (WMF) ( talk) 21:02, 20 September 2018 (UTC) reply
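
The flagging rule described in the reply above (a page is tagged when any one of its diffs scores 50% or higher, and the feed shows only a binary indicator) can be sketched roughly as follows. This is a minimal illustration, not the actual CopyPatrol or New Pages Feed code: the `DiffScore` record, the function names, and the 0.0 to 1.0 score scale are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed threshold, taken from the 50% figure mentioned in the discussion.
COPYVIO_THRESHOLD = 0.5


@dataclass
class DiffScore:
    """Hypothetical record: one diff of a page and its similarity score (0.0 to 1.0)."""
    revision_id: int
    score: float


def has_potential_copyvio(diffs, threshold=COPYVIO_THRESHOLD):
    """Return True if any single diff scores at or above the threshold."""
    return any(d.score >= threshold for d in diffs)


def feed_indicator(diffs):
    """Return the binary feed label, or None when no diff is flagged."""
    if has_potential_copyvio(diffs):
        return "Potential issues: Copyvio"
    return None
```

Because each diff is scored separately, a small copied passage added to a large page can still trip the flag, while the reviewer only ever sees the yes/no indicator rather than the underlying score.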

MMiller (WMF) Some points to note:

1. The AfC feed is making duplicate entries. See Draft:Selco Builders Warehouse, Draft:Bambob Cat, Draft:Bambob Cat for examples.

2. The AfC feed is not showing a tally of unreviewed pages in the footer.

3. 500 bytes is too much. A lot of spam, copyvio, and other toxic pages are created with a lot less. See the lorem ipsum example below for 500 bytes:

Lorem ipsum dolor sit amet rutrum sed facilisi in fringilla rutrum amet quis maecenas at sit lacus per. Vestibulum vitae luctus eget metus quam. Integer et nunc eum tortor donec. Erat lobortis ut. Nec ut lorem. In vitae donec. Ut arcu sociosqu. Porttitor sit fermentum pharetra etiam nisl dolor consequat ut eu nulla lectus. Irure con nec in mauris suspendisse ut sodales quis nonummy feugiat nullam. Tincidunt sed ide egestas vestibulum rutrum. Dui et purus. Aliquam sem etiam. Purus sodales suscit.

4. It is hoped that once these ORES/COPYVIO features are rolled out, the WMF will continue to address bugs and requests after the broader community of reviewers has gained some first-hand experience, and that the features will not be abandoned by the techs as the New Pages Feed/Curation has been. Kudpung กุดผึ้ง ( talk) 00:22, 20 September 2018 (UTC) reply

@ Kudpung: thanks for trying things out. Here are some responses:

1. We also saw the issue with the duplicate entries -- thanks for spotting it. We deployed a fix today, and you shouldn't be seeing them anymore. Please let me know if you are still seeing them.

2. At the beginning of planning this project, we decided not to include a tally at the footer, probably because we weren't yet sure how to engineer it. We know a lot more about the feed now, and I'll add this to a list of things we might revisit in the coming weeks. Thanks for bringing it up.

3. Good to know regarding the 500 bytes. Insertcleverphrasehere brought up something similar higher up on the page. I'm going to continue the conversation there and tag you.

4. Regarding future development work, my best recommendation is that the reviewing communities assemble a Community Wishlist proposal for work on these tools. Unfortunately, our team will be needing to move on to other projects. I think the latest conversation on that is here.

-- MMiller (WMF) ( talk) 21:08, 20 September 2018 (UTC) reply

Copyvio detection ready for testing in English Wikipedia

Hi all! I just posted an update on the project page about testing copyvio detection in English Wikipedia. I'm creating this section as a place for reviewers to post feedback. It's also totally fine to post feedback at AfC or NPP talk instead. -- MMiller (WMF) ( talk) 23:41, 17 October 2018 (UTC) reply

Note that short copyvios are being missed. See Soothrakkaran where the content was a direct copy of the first paragraph of http://www.cinemaexpress.com/stories/news/2018/jun/28/gokul-sureshs-next-is-soothrakkaran-6725.html (also the only ref in the article). — Insertcleverphrasehere (or here) 07:50, 31 October 2018 (UTC) reply

What about design consistency?

I'm curious, why does this look like the old, deprecated Apex that was about to be "replaced" by OOUI? What about the Wikimedia Style Guide? Did someone decide to abandon the principle of consistency, or is there a lack of institutional memory? Tar Lócesilion ( queta) 15:45, 21 October 2018 (UTC) reply

@ Tar Lócesilion: thanks for checking out the project. While it would be wonderful to convert the New Pages Feed to the latest design principles and OOUI, that amount of work was out of scope for how much time our team could spend on this project. We were only able to spend the time to add some enhancements to the existing interface. I do know that product teams are using OOUI as much as possible for new development going forward. -- MMiller (WMF) ( talk) 01:05, 24 October 2018 (UTC) reply

'Potential Issues' flagged in Page Curation Toolbar Page Info flyout

Tracked in Phabricator
Task T207847

MMiller (WMF). As far as I can see, the page curation tool is supposed to flag 'potential issues' in the 'Page info' section of the toolbar (or at least this is what was intended when it was being made [10]). This info is currently visible from the NewPagesFeed, but not from the toolbar. This functionality seems to be nonexistent and was either never implemented properly, or has become bugged and broken. It should contain things like 'Blocked user', 'Orphaned', 'No categories', as well as the ORES stuff that has been added recently: 'Possible Vandalism', 'Possible Spam', 'Possible attack page', and 'Possible Copyvio' which is currently being added to New Pages Feed (this last one should also have a link to the Copyvios report).

When a page has issues it should be flagged with a red number on the Page Info Icon as shown in the mockups at the link above. I have also made this request at Wikipedia:Page_Curation/Suggested_improvements#84._'Potential_Issues'_flagged_in_Page_Curation_Toolbar_Page_Info_flyout and on Phab. — Insertcleverphrasehere (or here) 13:48, 24 October 2018 (UTC) reply

Hi Insertcleverphrasehere -- I think this idea makes sense. Thanks for bringing it up. I saw that you included it in the Community Wishlist proposal, and I'm glad it's in there. I think that's the right place for it, as the Growth team won't be able to address it. -- MMiller (WMF) ( talk) 00:34, 31 October 2018 (UTC) reply

MMiller (WMF) I get that, but how will AfC reviewers be notified on the page of these potential issues flagged in the New Pages Feed? One solution might be to have the page curation tools load by default on drafts as well as articles (if the user has the toolbar turned on). Most AfC reviewers also have NPR rights, so this would result in the toolbar flagging potential issues on drafts as well (if and when the potential issues flags are added to the toolbar). — Insertcleverphrasehere (or here) 07:41, 31 October 2018 (UTC) reply

@ Insertcleverphrasehere: yes, that's a good question. I think that the idea of adapting the Page Curation toolbar for AfC could potentially do that, but also brings up the larger (and longer-term) question of how the two somewhat-similar processes should co-exist. Another route is AfC's own user-created gadget, the AFCH, which is similar in function to the Page Curation toolbar. Perhaps it would be possible to add that metadata to the gadget (which might even be doable by the volunteer developer who built it). -- MMiller (WMF) ( talk) 21:42, 31 October 2018 (UTC) reply

Resetting Filters

At the moment, if I choose a set of filter options (e.g. unreviewed articles that are predicted to be GA or FA) that gives me only a few results (I think fewer than 4), and I then want to reset my filter options to a different set of parameters, it is difficult - I end up having to zoom way out in order to see the green "Set filters" button. Is there anything that can be done about this? Best, Barkeep49 ( talk) 16:38, 31 October 2018 (UTC) reply

@ Barkeep49: thanks for bringing that up. Yes, it's one of the last outstanding things our team is working on. When there are very few pages in the feed, the "Set filters" button gets cut off, and the only way to click it is to zoom out. Though this issue was always there, by adding more specific filters to the feed, our work essentially made this happen more often. We're going to fix it in this Phabricator task, and we'll post when it's fixed. -- MMiller (WMF) ( talk) 21:45, 31 October 2018 (UTC) reply