Research talk:Daily unique page creators
Latest comment: 10 years ago by Halfak (WMF) in topic Tuesday, September 9th
Work log
Tuesday, September 9th
So, the only time you should ever see rev_parent_id = 0 is during page creation. I wanted to quickly check this hypothesis with a query against the revision table. The following query is based on a random sample of 30 pages.
> SELECT page_id, COUNT(*)
  FROM (SELECT page_id FROM page ORDER BY RAND() LIMIT 30) AS page_sample
  INNER JOIN revision ON page_id = rev_page
  WHERE rev_parent_id = 0
  GROUP BY 1;
+----------+----------+
| page_id  | COUNT(*) |
+----------+----------+
|  4092412 |        1 |
|  4745958 |        1 |
|  6663197 |        1 |
|  6918098 |        1 |
| 10545411 |        1 |
| 15126121 |        1 |
| 17933077 |        1 |
| 21001278 |        1 |
| 21261177 |        1 |
| 21884138 |        1 |
| 22841278 |        1 |
| 22973985 |        1 |
| 23918337 |        1 |
| 27643800 |        1 |
| 30124110 |        1 |
| 32762258 |        1 |
| 33503979 |        1 |
| 33766304 |        1 |
| 34316332 |        1 |
| 35341283 |        1 |
| 35649955 |        1 |
| 36075561 |        1 |
| 36546809 |        1 |
| 38088059 |        1 |
| 38752153 |        1 |
| 41010195 |        1 |
| 41538770 |        1 |
| 41703527 |        1 |
| 41707634 |        1 |
| 42775010 |        1 |
+----------+----------+
30 rows in set (40.75 sec)
All 30 sampled pages have exactly one revision with rev_parent_id = 0, so it looks like this is going to work. --Halfak (WMF) (talk) 21:54, 9 September 2014 (UTC)
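
The sample query above can only return pages that have at least one revision with rev_parent_id = 0, so a page with no such revision would silently drop out of the result. A stricter version of the check, and the daily count it is meant to support, might look like the sketch below. It is not the query actually used for the metric. It runs against a toy in-memory SQLite database rather than the production MySQL replicas, the rows are made up, the function names are hypothetical, and it assumes the rev_user and rev_timestamp columns of the revision table.

```python
import sqlite3


def build_toy_db():
    """Build a tiny in-memory stand-in for the page and revision tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY);
        CREATE TABLE revision (
            rev_id        INTEGER PRIMARY KEY,
            rev_page      INTEGER,
            rev_parent_id INTEGER,
            rev_user      INTEGER,
            rev_timestamp TEXT  -- YYYYMMDDHHMMSS
        );
    """)
    conn.executemany("INSERT INTO page VALUES (?)", [(1,), (2,), (3,), (4,)])
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?, ?, ?, ?)",
        [
            (10, 1, 0, 100, "20140909120000"),   # page 1 created by user 100
            (11, 1, 10, 101, "20140909130000"),  # ordinary follow-up edit
            (20, 2, 0, 100, "20140909140000"),   # page 2 created by user 100
            (30, 3, 0, 102, "20140910090000"),   # page 3 created by user 102
            (31, 3, 0, 103, "20140910100000"),   # made-up anomaly: 2nd parentless rev
            (40, 4, 39, 104, "20140910110000"),  # made-up anomaly: no parentless rev
        ],
    )
    return conn


def pages_violating_hypothesis(conn):
    """Pages whose number of rev_parent_id = 0 revisions is not exactly one."""
    return conn.execute("""
        SELECT page_id, COUNT(rev_id) AS parentless_revisions
        FROM page
        LEFT JOIN revision
               ON page_id = rev_page AND rev_parent_id = 0
        GROUP BY page_id
        HAVING COUNT(rev_id) <> 1
        ORDER BY page_id
    """).fetchall()


def daily_unique_creators(conn):
    """Distinct users per day who saved a rev_parent_id = 0 revision."""
    return conn.execute("""
        SELECT SUBSTR(rev_timestamp, 1, 8) AS day,
               COUNT(DISTINCT rev_user)    AS creators
        FROM revision
        WHERE rev_parent_id = 0
        GROUP BY SUBSTR(rev_timestamp, 1, 8)
        ORDER BY day
    """).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    print(pages_violating_hypothesis(conn))
    print(daily_unique_creators(conn))
```

The LEFT JOIN keeps pages with zero matching revisions in the result, which the INNER JOIN in the sample query cannot do. On the real tables the same idea would need a sampled or otherwise bounded page set, since the 30-page sample already took about 40 seconds.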