Research talk:Onboarding new Wikipedians/Rollout/Work log/2014-03-27

Thursday, March 27th

Today, I'm working on measuring the revert rate of GettingStarted edits. I plan to sample one edit per newcomer on a GS wiki during the 30 days in question. To draw that sample, I'll need to gather all edits that newcomers made in their first 24h, labeled by wiki, user, rev_id, page_id and gs_tagged (boolean: is the edit tagged as "gettingstarted edit"?).
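A minimal sketch of the sampling step, assuming the gathered edits are available as dictionaries keyed by wiki, user_id, rev_id, page_id and gs_edit. The function name and the fixed seed are hypothetical and only serve the illustration; the log does not say how the sample was actually drawn.

```python
import random
from collections import defaultdict


def sample_one_edit_per_user(rows, seed=0):
    """Pick one edit uniformly at random for each (wiki, user_id) pair.

    `rows` is an iterable of dicts with at least "wiki" and "user_id" keys
    (an assumed shape, mirroring the columns listed in the log).
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for row in rows:
        by_user[(row["wiki"], row["user_id"])].append(row)
    # Iterate over sorted keys so a given seed always yields the same sample.
    return [rng.choice(by_user[key]) for key in sorted(by_user)]


if __name__ == "__main__":
    edits = [
        {"wiki": "enwiki", "user_id": 1, "rev_id": 10, "page_id": 5, "gs_edit": 1},
        {"wiki": "enwiki", "user_id": 1, "rev_id": 11, "page_id": 6, "gs_edit": 0},
        {"wiki": "dewiki", "user_id": 1, "rev_id": 20, "page_id": 7, "gs_edit": 0},
    ]
    for edit in sample_one_edit_per_user(edits):
        print(edit["wiki"], edit["user_id"], edit["rev_id"])
```

Grouping on the (wiki, user_id) pair rather than user_id alone keeps accounts on different wikis separate, since local user IDs are not unique across wikis.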

So first things first, I need those edits from each of the GS wikis.

SELECT
    DATABASE() AS wiki,
    rev_id,
    user_id,
    rev_page AS page_id,
    -- 1 if the revision carries the "gettingstarted edit" tag, else 0
    gs_edit.ct_rev_id IS NOT NULL AS gs_edit
FROM user
-- Accounts with a "newusers"/"create" log entry made by the user themselves
INNER JOIN logging ON
    log_user = user_id AND
    log_type = "newusers" AND
    log_action = "create"
-- Edits saved within 24 hours of registration
INNER JOIN revision ON
    rev_user = user_id AND
    rev_timestamp BETWEEN
        user_registration AND
        DATE_FORMAT(DATE_ADD(user_registration, INTERVAL 1 DAY), "%Y%m%d%H%i%S")
-- LEFT JOIN so that untagged edits are kept, with gs_edit = 0
LEFT JOIN change_tag gs_edit ON
    gs_edit.ct_rev_id = rev_id AND
    gs_edit.ct_tag = "gettingstarted edit"
INNER JOIN page ON
    rev_page = page_id
WHERE
    page_namespace = 0 AND
    user_registration BETWEEN "20140211183000" AND "20140313183000";

Time to run that cross-wiki. --Halfak (WMF) (talk) 19:43, 27 March 2014 (UTC)


Fixed the query, since I forgot to join to change_tag to pick up the "gettingstarted edit" tag. --Halfak (WMF) (talk) 19:49, 27 March 2014 (UTC)


So... there's no index on user_registration in most MediaWiki databases, so narrowing the query to users who registered in the relevant month will take quite a long time. In the meantime, I wrote a script to grab the first revision for each user and check whether it was reverted. Time to go work on some schemas while I wait. --Halfak (WMF) (talk) 22:02, 27 March 2014 (UTC)
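The log does not show that script, so the following is only a sketch of what such a check could look like. It assumes an identity-revert rule (a later revision restores the exact content of a revision that preceded the edit), a per-page history given as chronological (rev_id, sha1) pairs, and a hypothetical window of 15 revisions; none of these details come from the log.

```python
def first_revision_per_user(revisions):
    """Return {user_id: earliest revision} from dicts with "user_id" and "timestamp".

    Timestamps are assumed to be fixed-width MediaWiki strings (YYYYMMDDHHMMSS),
    which sort correctly as plain strings.
    """
    first = {}
    for rev in revisions:
        current = first.get(rev["user_id"])
        if current is None or rev["timestamp"] < current["timestamp"]:
            first[rev["user_id"]] = rev
    return first


def was_reverted(page_history, rev_id, radius=15):
    """Check whether `rev_id` was undone by a later identity revert.

    `page_history` is a chronological list of (rev_id, sha1) pairs for one page.
    The edit counts as reverted if one of the next `radius` revisions has the
    same checksum as one of the `radius` revisions before it, and that checksum
    differs from the edit's own.
    """
    ids = [rid for rid, _ in page_history]
    if rev_id not in ids:
        raise ValueError("rev_id not found in page history")
    idx = ids.index(rev_id)
    target_sha = page_history[idx][1]
    earlier = {sha for _, sha in page_history[max(0, idx - radius):idx]}
    for _, sha in page_history[idx + 1:idx + 1 + radius]:
        if sha in earlier and sha != target_sha:
            return True
    return False


if __name__ == "__main__":
    history = [(1, "aaa"), (2, "bbb"), (3, "aaa")]
    print(was_reverted(history, 2))
```

Comparing checksums rather than diffing text keeps the check cheap, but it only catches exact restorations; partial reverts would be missed under this assumption.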
