Research:Rolling recurring old active editor

Rolling recurring old active editor
Specification
A is a user registered before , completed edits between and and continued to complete edits between and .
WMF Standard
  • = 5 edits
  • = 30 days
Measures
Editor retention
Status
SQL
SET @n = 5; /* edits threshold */
SET @u = 30; /* activity unit in days */
SET @T = "20140102"; /* February 1st, 2014 before midnight */

/* Build a table of active editors from last month */
CREATE TEMPORARY TABLE staging.last_active
SELECT
    rev_user
FROM (
    SELECT
        rev_user
    FROM revision
    WHERE rev_timestamp BETWEEN
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u*2 DAY), "%Y%m%d%H%i%S") AND
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S")
    UNION ALL
    SELECT
        ar_user AS rev_user
    FROM archive
    WHERE ar_timestamp BETWEEN
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u*2 DAY), "%Y%m%d%H%i%S") AND
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S")
) AS split_revisions
GROUP BY rev_user
HAVING COUNT(*) >= @n;

/* Build a table of active editors from current month */
CREATE TEMPORARY TABLE staging.current_active
SELECT
    rev_user
FROM (
    SELECT
        rev_user
    FROM revision
    WHERE rev_timestamp BETWEEN
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S") AND @T
    UNION ALL
    SELECT
        ar_user AS rev_user
    FROM archive
    WHERE ar_timestamp BETWEEN
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u DAY), "%Y%m%d%H%i%S") AND @T
) AS split_revisions
GROUP BY rev_user
HAVING COUNT(*) >= @n;

/* Find the intersection and filter the newbies & bots out */
SELECT
    user_id,
    user_name,
    user_registration
FROM staging.current_active
LEFT JOIN staging.last_active USING (rev_user)
INNER JOIN user ON rev_user = user_id
WHERE 
    user_registration <
        DATE_FORMAT(DATE_SUB(@T, INTERVAL @u*2 DAY), "%Y%m%d%H%i%S") AND
    rev_user NOT IN (SELECT ug_user FROM user_groups WHERE ug_group = "bot");

Rolling recurring old active editors are non-bot editors who registered more than 2 time periods ago and completed a minimum number of edits in the last two time periods.

Discussion

edit

The n edits threshold

edit

The edits necessary to cross this threshold, the fewer editors will meet the criteria. Historically, "active editors" were considered to be editors who made 5 edits to specific types of pages over the course of a calendar month. So for historical purposes, we recommend  .

The u activity unit

edit

The wider this unit is set, the more casual editors will be considered "active". Historically, "active editors" were considered during calendar months. So for historical purposes, we recommend  .

Time lag

edit

This metric has a built-in time lag of   for the time period in question.

Bot filtering

edit

Bots are filtered using the bot flag method.

Discussion

edit

The n edits threshold

edit

The u activity unit

edit

Time lag

edit

Plain word definition

edit

In plain words (using the default values) a '

Analysis

edit

Discussion

edit

Notes

edit

In plain words (using the default values) a 'Rolling recurrent old active editor' is an editor that registered before 2 months ago and completed 5 edits in the last month and another 5 in the month prior to that one. Month is defined not as a calendar month but rather 30 calendar days.