Research talk:Wikipedia article creation/Work log/Tuesday, January 21st


Tuesday, January 21st

Today I want to work out the data anomaly I have for dewiki. The number of newcomer-created drafts appears to fall off abruptly in the middle of 2011. So I'd like to find some move events from around that time (let's say, 2011/09-2011/10), see what their revision comments look like, and figure out whether my regex was matching them properly.

> select * from logging where log_type = "move" and log_action = "move" and log_timestamp BETWEEN "201109" AND "201110" limit 2;
+----------+----------+------------+----------------+----------+---------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------+-------------+----------------+----------+
| log_id   | log_type | log_action | log_timestamp  | log_user | log_namespace | log_title  | log_comment                                                                                                                                                              | log_params                     | log_deleted | log_user_text  | log_page |
+----------+----------+------------+----------------+----------+---------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------+-------------+----------------+----------+
| 37317605 | move     | move       | 20110901001127 |   995197 |             0 | Joseph_II. | BKL Modell II                                                                                                                                                            | Joseph II. (Begriffsklärung)   |           0 | MFleischhacker |  6430697 |
| 37317717 | move     | move       | 20110901002353 |   708213 |             0 | Fundament  | Eine Verschiebung wird erforderlich zur Aufspaltung der Artikelseite in eine allgemeine Begriffsklärungsseite und in eine Artikelseite über das Fundament im Bauwesen.   | Fundament (Bauwesen)           |           0 | A.Abdel-Rahim  |  6430701 |
+----------+----------+------------+----------------+----------+---------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------+-------------+----------------+----------+
2 rows in set (0.31 sec)
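
A note on the date filter: the query compares 14-character timestamps against 6-character prefixes. The sketch below mimics that comparison with plain Python strings, assuming the database compares these values as ordinary byte strings (the helper name `in_range` is made up for illustration). Under that assumption the range covers September 2011 only, since every October timestamp sorts after the bare prefix "201110". That is fine for pulling a couple of sample rows.

```python
# MediaWiki-style timestamps are fixed-width strings: YYYYMMDDHHMMSS.
# Hypothetical helper that mimics `ts BETWEEN lower AND upper`
# using plain lexicographic string comparison.
def in_range(ts, lower="201109", upper="201110"):
    return lower <= ts <= upper


print(in_range("20110901001127"))  # first row from the result above -> True
print(in_range("20110930235959"))  # last second of September -> True
print(in_range("20111001000000"))  # first second of October -> False
```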

OK. Time to look for the page.
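
The move target is stored in log_params with spaces, while page_title uses underscores. A minimal sketch of that conversion, assuming the target title is the first line of log_params and that anything after a newline can be ignored (the function name `target_page_title` is hypothetical):

```python
def target_page_title(log_params):
    """Return the move target in page_title form (underscores for spaces)."""
    # Assumption: the target title is the first line of log_params;
    # any trailing newline or further lines are dropped.
    target = log_params.split("\n", 1)[0].strip()
    return target.replace(" ", "_")


print(target_page_title("Joseph II. (Begriffsklärung)\n"))
# Joseph_II._(Begriffsklärung)
```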

> select * from page where page_title = "Joseph_II._(Begriffsklärung)" and page_namespace = 0;
+---------+----------------+-------------------------------+-------------------+--------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+-----------------------+
| page_id | page_namespace | page_title                    | page_restrictions | page_counter | page_is_redirect | page_is_new | page_random    | page_touched   | page_links_updated | page_latest | page_len | page_no_title_convert |
+---------+----------------+-------------------------------+-------------------+--------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+-----------------------+
|   55428 |              0 | Joseph_II._(Begriffsklärung)  |                   |            0 |                0 |           0 | 0.641909627543 | 20131024114813 | NULL               |   121808972 |      413 |                     0 |
+---------+----------------+-------------------------------+-------------------+--------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+-----------------------+
1 row in set (0.04 sec)
> select rev_comment from revision where rev_page = 55428 and rev_user = 995197;
+-------------------------------------------------------------------------------------------+
| rev_comment                                                                               |
+-------------------------------------------------------------------------------------------+
| verschob „[[Joseph II.]]“ nach „[[Joseph II. (Begriffsklärung)]]“: BKL Modell II          |
+-------------------------------------------------------------------------------------------+
1 row in set (0.13 sec)

OK. Now to check if my regex matches.

> select rev_comment RLIKE ".*(hat „|verschob die Seite )\\[\\[([^\]]+)\\]\\]“? nach „?\\[\\[([^\]]+)\\]\\]“?(.*)" from revision where rev_page = 55428 and rev_user = 995197;
+-------------------------------------------------------------------------------------------------------------------+
| rev_comment RLIKE ".*(hat „|verschob die Seite )\\[\\[([^\]]+)\\]\\]“? nach „?\\[\\[([^\]]+)\\]\\]“?(.*)"         |
+-------------------------------------------------------------------------------------------------------------------+
|                                                                                                                 0 |
+-------------------------------------------------------------------------------------------------------------------+
1 row in set (0.04 sec)

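It doesn't match. The comment starts with verschob „[[, but the regex only allows hat „ or verschob die Seite in front of the first link. So now to make the regex match. Adding verschob „ as another alternative should be pretty easy.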

> select rev_comment RLIKE ".*(hat „|verschob „|verschob die Seite )\\[\\[([^\]]+)\\]\\]“? nach „?\\[\\[([^\]]+)\\]\\]“?(.*)" from revision where rev_page = 55428 and rev_user = 995197;
+--------------------------------------------------------------------------------------------------------------------------------+
| rev_comment RLIKE ".*(hat „|verschob „|verschob die Seite )\\[\\[([^\]]+)\\]\\]“? nach „?\\[\\[([^\]]+)\\]\\]“?(.*)"           |
+--------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                              1 |
+--------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.03 sec)
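
The two checks above can also be reproduced outside the database. This is only a sketch using Python's `re` module as a stand-in for MySQL's RLIKE. The patterns are transcribed from the queries with the SQL string escaping removed, and the names `OLD`, `NEW`, and `parse_move` are made up for illustration. It is not necessarily how the move detector itself applies the regex.

```python
import re

# Patterns transcribed from the RLIKE queries above (SQL escaping removed).
OLD = r".*(hat „|verschob die Seite )\[\[([^\]]+)\]\]“? nach „?\[\[([^\]]+)\]\]“?(.*)"
NEW = r".*(hat „|verschob „|verschob die Seite )\[\[([^\]]+)\]\]“? nach „?\[\[([^\]]+)\]\]“?(.*)"

# The revision comment returned for rev_page = 55428.
COMMENT = "verschob „[[Joseph II.]]“ nach „[[Joseph II. (Begriffsklärung)]]“: BKL Modell II"


def parse_move(pattern, comment):
    """Return (from_title, to_title, rest) if the comment matches, else None."""
    # re.search is unanchored, like RLIKE.
    match = re.search(pattern, comment)
    if match is None:
        return None
    return match.group(2), match.group(3), match.group(4)


print(parse_move(OLD, COMMENT))  # None
print(parse_move(NEW, COMMENT))
# ('Joseph II.', 'Joseph II. (Begriffsklärung)', ': BKL Modell II')
```

Capturing the two titles rather than just testing for a match makes it easy to eyeball whether the source and target of the move were picked up correctly.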

There we go. Time to kick off the move detector again. With any luck, I can have new data by the end of the day. --Halfak (WMF) (talk) 17:52, 21 January 2014 (UTC)
