Research talk:Wikipedia article creation/Work log/Monday, December 9th
Monday, December 9th
I'm trying to incorporate AfC drafts into my analysis of the survival proportion of new articles on English Wikipedia. I consider a page to be an AfC draft if it remains under Wikipedia_talk:Articles_for_creation/<title>. All drafts that were deleted or that haven't been moved to Main within 6 months of creation are considered to have "failed". To deal with the right truncation issue, I've limited my dataset to drafts created through April of this year.
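The rule above can be sketched in a few lines of Python. This is only an illustration of the stated definition, not the code used for the analysis: the function and argument names are hypothetical, "6 months" is approximated as 183 days, and the dump date in the usage example is a guess based on the `nov13_` table prefix.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: "6 months" is approximated as 183 days.
SURVIVAL_WINDOW = timedelta(days=183)


def draft_outcome(created: datetime,
                  moved_to_main: Optional[datetime],
                  deleted: bool) -> str:
    """Label a draft "failed" or "published" using the rule in the log.

    A draft fails if it was deleted, or if it was not moved to Main
    within the survival window after its creation.
    """
    if deleted:
        return "failed"
    if moved_to_main is None or moved_to_main - created > SURVIVAL_WINDOW:
        return "failed"
    return "published"


def fully_observed(created: datetime, dump_date: datetime) -> bool:
    """Right-truncation check: has the whole window elapsed by the dump date?"""
    return created + SURVIVAL_WINDOW <= dump_date


if __name__ == "__main__":
    dump = datetime(2013, 11, 1)  # hypothetical dump date
    print(draft_outcome(datetime(2013, 1, 1), datetime(2013, 2, 1), False))
    print(fully_observed(datetime(2013, 6, 1), dump))
```

Drafts for which `fully_observed` is false would be dropped rather than labeled, which is what limiting the dataset to drafts created through April accomplishes.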
Sadly, I don't know which pages started as drafts. That is, I wouldn't if I didn't have this fancy dataset that has a history of moves. Mwahahaha! OK. Time to get that into shape so that it will be useful. --Halfak (WMF) (talk) 21:28, 9 December 2013 (UTC)
While I'm waiting for my move extractor, I ran some numbers on how many drafts I'm arbitrarily labeling as "failed" in this analysis.
> select LEFT(first_revision, 4) AS year,
         page.archived OR should_be_archived OR declined_submission,
         count(*)
  FROM nov13_afc_draft_status
  INNER JOIN nov13_page page USING(page_id, page_namespace, page_title)
  WHERE first_revision < "20130501"
  GROUP BY 1,2;
+------+------------------------------------------------------------+----------+
| year | page.archived OR should_be_archived OR declined_submission | count(*) |
+------+------------------------------------------------------------+----------+
| 2004 |                                                          1 |        1 |
| 2005 |                                                          1 |        1 |
| 2006 |                                                          0 |        1 |
| 2006 |                                                          1 |       30 |
| 2007 |                                                          0 |        1 |
| 2007 |                                                          1 |       52 |
| 2008 |                                                          0 |       12 |
| 2008 |                                                          1 |     3300 |
| 2009 |                                                          0 |       47 |
| 2009 |                                                          1 |     8720 |
| 2010 |                                                          0 |       70 |
| 2010 |                                                          1 |    11654 |
| 2011 |                                                          0 |      305 |
| 2011 |                                                          1 |    28153 |
| 2012 |                                                          0 |     1746 |
| 2012 |                                                          1 |    71779 |
| 2013 |                                                          0 |      851 |
| 2013 |                                                          1 |    24935 |
+------+------------------------------------------------------------+----------+
18 rows in set (2.46 sec)
Wow! That's better than I thought! Only 851 ambiguous drafts in 2013! --Halfak (WMF) (talk) 23:04, 9 December 2013 (UTC)
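To put the 851 in context, here is a quick sketch that turns the counts from the query output into a per-year share of ambiguous drafts. It assumes that a flag value of 0 (not archived, not due for archiving, not declined) is what "ambiguous" means here; the variable and function names are made up for illustration.

```python
# (year, flag) -> count, copied from the query output above.
# flag 1: archived, should be archived, or declined; flag 0: none of these.
counts = {
    (2004, 1): 1, (2005, 1): 1,
    (2006, 0): 1, (2006, 1): 30,
    (2007, 0): 1, (2007, 1): 52,
    (2008, 0): 12, (2008, 1): 3300,
    (2009, 0): 47, (2009, 1): 8720,
    (2010, 0): 70, (2010, 1): 11654,
    (2011, 0): 305, (2011, 1): 28153,
    (2012, 0): 1746, (2012, 1): 71779,
    (2013, 0): 851, (2013, 1): 24935,
}


def ambiguous_share(counts):
    """Return {year: fraction of that year's drafts with flag 0}."""
    years = sorted({year for year, _ in counts})
    shares = {}
    for year in years:
        ambiguous = counts.get((year, 0), 0)
        total = ambiguous + counts.get((year, 1), 0)
        shares[year] = ambiguous / total
    return shares


shares = ambiguous_share(counts)

if __name__ == "__main__":
    for year, share in shares.items():
        print(f"{year}: {share:.1%}")
```

For 2013 this is 851 out of 25,786 drafts, or roughly 3.3%.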