From Wikipedia, the free encyclopedia

This is a good place to start

Q: Where do you personally think you fall on the spectrum between absolute inclusionist and absolute deletionist?

A: I don't think there really is any such "spectrum," in any meaningful way. I'm sure there is some tiny group of people that thinks all of Wikipedia ought to be deleted, and some other tiny group of people that thinks Wikipedia ought to include every article anybody ever wants to write, but the other 99.9 percent of Wikipedia users all fall in the middle somewhere. People generally recognize that the vast majority of our 1,647,000+ articles are worth keeping, yet any look through Special:Newpages will show plenty of unsalvageable articles being written every day. I've watched plenty of pages I thought ought to be kept get deleted, and I've also helped improve plenty of articles so they would be worth keeping. Deletion of content is just a fact of life if we're to have any sort of standards for quality, neutrality, and following copyright and libel law. As we say, "If you don't want your writing to be edited mercilessly or redistributed by others, do not submit it."

Q: Well, the spectrum, as I see it, is the "middle" that people fall into. And I have heard people ask why anyone deletes anything at all. But let's step back from that a bit. Leaving aside the issues of erroneous information, ad hominem attacks and bad writing, why even bother worrying about whether a subject is "notable" enough to be written about?

A: "Notability" concerns whether a topic has been noted by independent reputable sources. It is what we rely on to make sure that the topics we cover have enough written about them to help ensure that we can write an article with information that has been evaluated and fact-checked, information that comes from multiple points of view so that we can be neutral, and information that explains why a topic is of any importance. I think it's that last part that you're asking about. The last part is a basic function of writing an encyclopedia -- we want to include topics of some historical significance rather than just be a directory or a collection of trivia. See Wikipedia is not an indiscriminate collection of information for more information. If we include every article that anyone wants to write, then the encyclopedia becomes useless because nobody can find the actual needle of worthwhile information on a topic hidden in that haystack of trivia. To give a webcomics-related example, if I'm trying to research webcomics over on a wiki with much more indiscriminate content policies, like comixpedia.org, I'll find articles like this one on the webcomic Nigger. Without requiring this topic to be noted by several independent reputable sources, we won't know whether this webcomic is of any importance, or just something that somebody made up one day and posted on the internet. It's just a roadblock put up to keep researchers from finding more important information. It's like trying to research photographers and having to wade through article after article on people who take yearbook photos or put vacation pics on their Flickr accounts. Maybe that's fine for Comixpedia's mission, but it's not for Wikipedia's.

Q: What is that mission? I think there's a lot of confusion on this point. The makers of encyclopedias sometimes toss around terms like "the sum of human knowledge..."

A: "The sum of human knowledge..." is a good idea. But, keep in mind that an encyclopedia is a specific type of reference. It's not a dictionary, so we don't do simple slang and jargon guides. It's not a directory, so we don't list everything of some category that we can find. We're not a newspaper or a magazine, so we don't do news reports or personal opinion essays. We don't do advertising, we don't do personal web pages, we don't do instruction manuals, we're not a social networking site, etc. We're an encyclopedia, with encyclopedic standards. And I think that's where the confusion usually starts -- people come here thinking that because Wikipedia is on the internet that it's just like a message board or a blog, and it's not.

Q: The definition of "notability" reads as follows: A topic is notable if it has sufficient, independent works that are reliable and can act as the basis for an encyclopedic article. But what constitutes reliability, in this case? Alexa rankings are independent empirical data, but their reliability is controversial at best, especially when you get down to the audience level of most webcomics. The jury's out on Compete.com as well. But the more respected traffic-measures like Hitwise are out of the average user's reach.

A: First, I should point out that the guideline you are quoting was changed earlier today from "a topic is notable if it has been the subject of multiple, non-trivial, reliable published works, whose sources are independent of the subject itself" to "a topic is notable if it has sufficient, independent works that are reliable and can act as the basis for an article." [1] It's not that big of a difference, and that change may not even last long, but it's probably worth noting.

Either way, notability has nothing to do with traffic ranks or hit counters. There was an attempt over a year ago to try to create a special standard for Web sites based on things like their Alexa rankings. If I remember correctly, it was anything under 10,000 for websites without comics, and then a special lower bar of anything under 200,000 for webcomics. Here's a snapshot from October 2005 of a notability guideline for webcomics that have an Alexa ranking "better than 200,000", [2] and here's one of the "Alexa ranking of 10,000 or better" for other websites [3]. I'm not sure why anyone thought it was a good idea to lower the bar so far for webcomics, allowing them to be considered "notable" with an Alexa rank 20 times worse than any other type of web site. To me that seems disrespectful to webcomics as an art.

Ultimately, this "some Alexa rank equals notability" idea was done away with completely as it was totally arbitrary (who decides what rank equals notability?), unreliable as you point out (it only counts certain browsers and willing users), and doesn't really help write an encyclopedia (having a certain Alexa rank is not an achievement or event of historical significance). And it also showed a systemic bias toward web content. Why, assuming all else being equal, would one photographer's self-published, printed book of photos be less notable than another's photo blog just because the blog has an Alexa rank?

Ironically, part of the reason the "Alexa test" idea was done away with was that authors of webcomics with Alexa rankings over 200,000 were complaining about how stupid it was to use Alexa ranks. That is, they weren't happy that the special lower bar created for them wasn't low enough. So, instead of things like Alexa ranks, we went back to just basing encyclopedia articles on independent sources with reputations for fact-checking and accuracy. That is, notability is based on whether other sources actually note a topic in a way that allows us to write a neutral, verifiable article on the topic's importance. So, unless a site's Alexa rank is considered important enough to be covered by reputable sources, it's really not an indicator of "notability." Keep in mind that Wikipedia is an encyclopedia, a tertiary source. We don't do original research, and so we're limited by what good secondary sources note. It's basically just requiring that encyclopedia articles are based on sources at least as good as you'd expect to find in a junior high school research paper.

For example, I have about one-and-a-half pretty good sources for a Dylan Meconis article, but that's really not enough to write an article with. So, I'm waiting to see if there will be any reputable sources reviewing her book with Jim Ottaviani that comes out this spring. I suppose I could tear my hair out because Wikipedia has sourcing standards, but I'd rather just work on other articles in the meantime.

Q: Likewise, what constitutes "independent?" I am a supporter of the Web Cartoonist's Choice Awards, but I can't deny that they're racked by problems, some of which spring from being rooted in the community they judge. On the other hand, if you have higher standards you could say the same for the Oscars. I know the WCCAs have been controversial in some of the recent movements to delete...

A: As far as the Web Cartoonist's Choice Awards you mentioned, there are multiple problems with things like that on Wikipedia, not all of them having to do with independence. One is that they don't have enough reputable sources themselves to write a decent article. They've been around for seven years, yet we can barely scrape together a decent source and a half on them. Again, it's difficult, as a tertiary source like Wikipedia, to consider something "notable" when reputable secondary sources really don't bother to note the topic. And without more reputable sources, we can't adequately cover the topic. As you say, they are "racked by problems." If that's true, then the article should address that in some detail, but without reputable sources, there's no way to cover those "problems" without letting in any crackpot theory or opinion from someone with an axe to grind. On the blogs I read, it seems to be near universal that the awards are disrespected, but it seems less than neutral to be quoting profanity-laced blogs. Then again, it's hardly neutral to write a puff piece on an award "racked by problems" that doesn't delve into the problems. So, if we don't have enough reputable sources to cover the topic, we shouldn't cover the topic.

The other main problem is that some have floated the idea, similar to the failed "Alexa test" idea, that Wikipedia have a special standard where we should have an article on every Web Cartoonist's Choice Award nominee or winner. Again, it's hard to consider a comic winning an award a notable achievement if no independent secondary sources bother to note the comic winning it. It also doesn't help us write more than a brief sentence in an article -- it's not like an actual reputable text source that we can use for analysis and criticism.

Q: Well, they've been mentioned in The Beat and Journalista. These are the online arms of Publisher's Weekly and The Comics Journal, which have served their respective industries for decades. If those aren't independent and reputable sources, then what are?

A: Like I said, not all of the problems with these types of sources have to do with independence. Remember, we're also looking for non-trivial sources, or sources that are sufficient to write an encyclopedia article -- actual reputable text sources that we can use for analysis and criticism. If you have to use the word "mentioned" to describe the depth of the coverage, then it's probably trivial, and insufficient to help in writing an encyclopedia article. Neither the Publisher's Weekly nor The Comics Journal blogs you link to offer anything more than the most trivial blog coverage, just repeating some brief information from the Web Cartoonist's Choice Award site and then giving a link. There's no real content there to help write an article. I suppose Dirk Deppey does go into slightly more detail this year, [4] writing that those awards are "announced in the form of a mock ceremony where mediocre cartoonists draw a page for each award; after three pages, my eyeballs started bleeding and I ceased to care who won." I suppose that's starting to approach analysis that could be included in an article on the awards, but it's more like a throw-away sentence on a blog, and a throw-away sentence on a blog about not really caring about an award is not the best evidence for showing the importance of the topic. In contrast, Deppey's multi-page interviews with Joey Manley and Chris Onstad in The Comics Journal magazines have the depth to be good sources for webcomic encyclopedia articles. Another example: I love Heidi MacDonald (I've used her Publisher's Weekly articles as sources many times), and I love Lea Hernandez (I created and wrote quite a bit of her Wikipedia article); however, as long as Heidi MacDonald blogs on topics like "Someone has been pooping in Lea Hernandez’ yard!" [5] then it makes it pretty clear that simply being mentioned on Heidi MacDonald's blog is no sign of encyclopedic achievement or historical significance.

Q: What would you say to a cartoonist who finds herself deleted, underrepresented, or inaccurately represented? Wikipedia frowns on people editing their own information. So what's her recourse?

A: "Finds herself deleted"? We don't delete people. We might delete an article about a person, but as far as I know Wikipedia has never actually killed anyone. We also get "You deleted my webcomic" a lot. No, we didn't delete your webcomic. If your webcomic actually doesn't exist any more, then please contact your server administrator. We didn't touch your webcomic.

If a cartoonist finds their article being considered for deletion, and they're interested in keeping it, they can, like any editor, provide sources to improve the article so it no longer fits whatever reason editors think it ought to be deleted. If a cartoonist thinks they're "underrepresented," they should avoid starting an article on themselves. People generally aren't the best judges of their own importance and are rarely able to write a neutral article about themselves, and Wikipedia is not the place to advertise your website. And as far as people editing articles about themselves, we welcome anyone that wants to correct or remove inaccurate or libelous statements. Of course, the way to prevent inaccurate or libelous statements is to require that articles be well-sourced. There is more information on this at Wikipedia:Biographies of living persons.

This is probably a good time to point out that our sourcing requirements for showing the notability of topics are basically the same sourcing requirements for preventing the use of Wikipedia to libel and attack the subjects of articles. I think that gets lost sometimes with webcomics articles. It seems people want to use blogs and message boards to say good things about webcomics and put those in encyclopedia articles to show "notability", but they also want to selectively ignore any bad things said on blogs or message boards about a webcomic or its author. An encyclopedia really can't have it both ways and still write from a neutral point of view. If we're going to document blogger praise we'd also have to document blogger flame wars and name-calling. But they're both generally trivial, so it's best to include neither.

I'd also suggest that webcomic artists try to keep some perspective on things. Wikipedia can rightly delete a poorly sourced article on a blog with 2.4 million monthly readers that has The Washington Post as a source, yet somehow non-webcomics bloggers seem to be able to avoid being overly dramatic about it.

Q: Well, you know. Cartoonists.

But it is important. I don't know of any art form whose practitioners have had to fight for respectability quite as hard and as long as cartoonists have.

Wikipedia has given comics in general and webcomics in particular far more attention than the encyclopedias with which it competes, and that has been good for online cartoonists. Tertiary source or no, Wikipedia is one of the most popular sites on the Web, and "Wikipedian notability" has become a badge of prestige like high PageRank or high hit-count. Such notability has far less value if it's conferred freely on everyone who wants it; I understand that.

On the other hand, you can play a Cartesian game where you challenge everything that is disputed (The New York Times? Bah! How can we trust them after Jayson Blair? Webster's Dictionary? Ha! Haven't you read about Webster's political biases?) down to "Cogito ergo sum" (Fah! Many philosophers reject Descartes' proofs!), at which point "deleting everything in Wikipedia" becomes a real option.

Let's keep that perspective in mind and try to look at the long term. Do you think that this time next year, Wikipedia will have more articles about webcomics or fewer? More articles in general, or fewer?

The number of English Wikipedia articles grew exponentially from 2002 to 2006, with a doubling time of roughly 1 year.

A: As far as your question, that's an easy one. There will be far more articles on Wikipedia in a year than there are today. See the chart to the right showing the exponential growth in the number of Wikipedia articles over the last several years. Also read Wikipedia:Modelling Wikipedia's growth. Yes, articles will be deleted, but many more will be created. In 2006 I created webcomics-related articles on Amy Kim Ganter, Gene Yang, Svetlana Chmakova, and more. In 2007 I hope to have the sources to create that Dylan Meconis article and any other webcomics-related article I can find good sources for. It would be nice if I could spend less time mopping up spam articles and dealing with trolls saying my encyclopedia writing makes them feel like killing, and more time in the coming year researching and writing encyclopedia articles.

And of course, quantity is a less important goal than quality. Wikipedia already has well over a million and a half articles, but as Wikipedia founder Jimmy Wales has put it, "We can no longer feel satisfied and happy when we see these (article) numbers going up. ... We should continue to turn our attention away from growth and towards quality." [6] For example, in the past year we managed to get Megatokyo up to featured article quality; I suspect we could write featured articles on other webcomics like Penny Arcade (webcomic), Van Von Hunter, Get Your War On, Diesel Sweeties, Fetus-X, Mom's Cancer, PvP, Achewood, Narbonic, Sinfest, The Perry Bible Fellowship and many others.