Research:Rolling re-activated editor

Rolling re-activated editor
Specification
A is a user who completed less than edits between and and completed edits (but was not a R:newly registered user) between and .
WMF Standard
  • = 5 edits
  • = 30 days
Status
SQL
SET @n = 5; /* edits threshold */
SET @u = 30; /* activity unit in days */
SET @T = "20140102"; /* February 1st, 2014 before midnight */
 
/* Create a temp table of the current month's active editors who are not new */
CREATE TEMPORARY TABLE staging.current_active_non_new
SELECT
    rev_user
FROM (
    SELECT
        rev_user
    FROM (
        SELECT
            rev_user
        FROM revision
        WHERE rev_timestamp BETWEEN
            DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S") AND @T
        UNION ALL
        SELECT
            ar_user AS rev_user
        FROM archive
        WHERE ar_timestamp BETWEEN
            DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S") AND @T
    ) AS split_revisions
    GROUP BY rev_user
    HAVING COUNT(*) >= @n
) AS active_editors
LEFT JOIN logging nru ON
    log_type = "newusers" AND
    log_action = "create" AND
    log_user = rev_user
WHERE 
    nru.log_id IS NULL OR 
    nru.log_timestamp <= DATE_FORMAT(
        DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S");

/* Create a temp table of the last month's active editors */
CREATE TEMPORARY TABLE staging.last_active
SELECT
    rev_user
FROM (
    SELECT
        rev_user
    FROM revision
    WHERE rev_timestamp BETWEEN
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u*2 DAY), "%Y%m%d%H%i%S") AND
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S")
    UNION ALL
    SELECT
        ar_user AS rev_user
    FROM archive
    WHERE ar_timestamp BETWEEN
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u*2 DAY), "%Y%m%d%H%i%S") AND
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S")
) AS split_revisions
GROUP BY rev_user
HAVING COUNT(*) >= @n;

SELECT
    user_id,
    user_name,
    user_registration
FROM staging.current_active_non_new
INNER JOIN user ON rev_user = user_id
WHERE 
    rev_user NOT IN (SELECT rev_user FROM staging.last_active) AND
    rev_user NOT IN (SELECT ug_user FROM user_groups WHERE ug_group = "bot");

Rolling re-activated editors are non-bot editors who were not recently active in the previous time period, but are active in the current time period.

Discussion

edit

The n edits threshold

edit

The edits necessary to cross this threshold, the fewer editors will meet the criteria. Historically, "active editors" were considered to be editors who made 5 edits to specific types of pages over the course of a calendar month. So for historical purposes, we recommend  .

The u activity unit

edit

The wider this unit is set, the more casual editors will be considered "active". Historically, "active editors" were considered during calendar months. So for historical purposes, we recommend  .

Time lag

edit

This metric has a built-in time lag of   for the time period in question.

Bot filtering

edit

Bots are filtered using the bot flag method.

Analysis

edit

Discussion

edit

Notes

edit