Research:Daily unique page creators
WMF Standard
n = 1 page creation
Measures: Editing population size
SQL
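SET @date = "20140101";  -- day to measure (UTC), as YYYYMMDD
SET @n = 1;              -- minimum number of page creations per user

SELECT
    COUNT(*) AS page_creators
FROM (
    SELECT
        rev_user,
        rev_user_text,
        COUNT(*) AS page_creations
    FROM (
        /* Parentless revisions of pages that still exist */
        SELECT
            rev_user,
            rev_user_text
        FROM
            revision
        WHERE
            /* Half-open interval: a revision saved at exactly midnight
               belongs to the following day only */
            rev_timestamp >= @date AND
            rev_timestamp < DATE_FORMAT(DATE_ADD(@date, INTERVAL 1 DAY), "%Y%m%d%H%i%S") AND
            rev_parent_id = 0
        UNION ALL
        /* Parentless revisions of pages that have since been deleted */
        SELECT
            ar_user AS rev_user,
            ar_user_text AS rev_user_text
        FROM
            archive
        WHERE
            ar_timestamp >= @date AND
            ar_timestamp < DATE_FORMAT(DATE_ADD(@date, INTERVAL 1 DAY), "%Y%m%d%H%i%S") AND
            ar_parent_id = 0
    ) page_creations
    GROUP BY
        rev_user,
        rev_user_text
) page_creator
WHERE page_creations >= @n;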
Daily unique page creators is a standardized metric that counts the users who create at least n new pages (by default, n = 1) on a wiki on a given day. It is used as a proxy for editing population size.
Discussion
Identifying page creations
Unfortunately, MediaWiki does not track a history of page creation events. However, the rev_parent_id field allows a close approximation. rev_parent_id usually points to the previous revision of the page; for the first revision of a page, rev_parent_id = 0. This metric approximates page creations by counting these parentless revisions, both in the revision table and, for pages that have since been deleted, in the archive table (ar_parent_id = 0).
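The parentless-revision approach can be tried out on a small in-memory database. The following is only a minimal sketch: the table is a toy stand-in for the revision table with made-up rows and user names, it keeps only the columns needed here, it omits the archive table, and the function name daily_unique_page_creators is hypothetical.

```python
import sqlite3

# Toy stand-in for MediaWiki's revision table (illustrative columns only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision ("
    " rev_id INTEGER PRIMARY KEY,"
    " rev_user_text TEXT,"
    " rev_timestamp TEXT,"
    " rev_parent_id INTEGER)"
)
# Made-up sample data, not real wiki revisions.
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", "20140101010000", 0),  # first revision of a page
        (2, "Bob",   "20140101020000", 1),  # ordinary edit (parent = rev 1)
        (3, "Alice", "20140101030000", 0),  # first revision of another page
        (4, "Carol", "20140101040000", 0),  # first revision of a third page
        (5, "Dave",  "20140102000000", 0),  # midnight: belongs to the next day
    ],
)


def daily_unique_page_creators(conn, day_start, next_day_start, n=1):
    """Count users with at least n parentless revisions in [day_start, next_day_start)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM ("
        "  SELECT rev_user_text FROM revision"
        "  WHERE rev_parent_id = 0"
        "    AND rev_timestamp >= ? AND rev_timestamp < ?"
        "  GROUP BY rev_user_text"
        "  HAVING COUNT(*) >= ?"
        ")",
        (day_start, next_day_start, n),
    ).fetchone()
    return row[0]


print(daily_unique_page_creators(conn, "20140101000000", "20140102000000"))       # 2
print(daily_unique_page_creators(conn, "20140101000000", "20140102000000", n=2))  # 1
```

Bob's edit has a parent revision, so it is not counted as a page creation, and Dave's page is created at exactly midnight of the following day, so it falls outside the measured day.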
Time lag
As this is a daily metric, a full 24 hours must elapse after the beginning of the date (UTC) before an uncensored value can be calculated.
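A job that computes this metric could guard against censored values with a check along these lines. This is a hedged sketch, not part of the metric definition; the helper name is_day_complete is hypothetical, and it assumes the caller passes a timezone-aware timestamp.

```python
from datetime import date, datetime, time, timedelta, timezone


def is_day_complete(day, now):
    """Return True once the whole UTC day `day` has elapsed at time `now`.

    `now` must be timezone-aware so it can be compared against a UTC boundary.
    """
    # The day is complete at 00:00:00 UTC of the following day.
    end_of_day = datetime.combine(
        day + timedelta(days=1), time.min, tzinfo=timezone.utc
    )
    return now >= end_of_day


# One second before midnight the value for 2014-01-01 is still censored.
print(is_day_complete(date(2014, 1, 1), datetime(2014, 1, 1, 23, 59, 59, tzinfo=timezone.utc)))  # False
print(is_day_complete(date(2014, 1, 1), datetime(2014, 1, 2, 0, 0, 0, tzinfo=timezone.utc)))     # True
```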