Grants:IEG/Revision scoring as a service/Midpoint

This project is funded by an Individual Engagement Grant

This Individual Engagement Grant is renewed

Welcome to this project's midpoint report! This report shares progress and learnings from the Individual Engagement Grantee's first 3 months.

Summary

In a few short sentences or bullet points, give the main highlights of what happened with your project so far.

We've built a generalized scoring system for revisions/pages
We've trained models that are comparable to the state of the art in precision and accuracy
We've documented our project and posted about it in three different major Wikipedian news sources
We've begun work on a revision coding service

Generally, we're on or ahead of schedule, but we have had to reorder some of our deliverables: the revision coding service was supposed to be done by now, but we stood up the scoring service on Labs instead.

Methods and activities

So far, the majority of our activities have focused on developing a generalized scoring system for revisions and pages in a MediaWiki installation, with a focus on Wikipedia. We have implemented state-of-the-art revert-predicting machine learning models for English, Portuguese, Turkish and Azerbaijani Wikipedia. These models and their scores are accessible via an instance on Wikimedia Labs. For example, see http://ores.wmflabs.org/scores/enwiki?models=reverted&revids=4567890|4567892, which returns:

{
  "4567890": {
    "reverted": {
      "prediction": false,
      "probability": {
        "false": 0.6967103285095867,
        "true": 0.3032896714904134
      }
    }
  },
  "4567892": {
    "reverted": {
      "prediction": false,
      "probability": {
        "false": 0.5479798477219674,
        "true": 0.45202015227803266
      }
    }
  }
}
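The request and response above can be handled with a few lines of Python. The following is a minimal sketch, not the project's own client code: the helper names `build_score_url` and `revert_probabilities` are hypothetical, the URL pattern is inferred from the single example URL above, and the response is parsed from a pasted copy of the JSON shown above rather than fetched live.

```python
import json
from urllib.parse import urlencode

# Base URL taken from the example request above.
BASE_URL = "http://ores.wmflabs.org/scores"


def build_score_url(wiki, model, rev_ids):
    """Build a scoring URL (hypothetical helper; pattern inferred from the example)."""
    query = urlencode(
        {"models": model, "revids": "|".join(str(r) for r in rev_ids)},
        safe="|",  # keep the pipe separator unescaped, as in the example URL
    )
    return f"{BASE_URL}/{wiki}?{query}"


def revert_probabilities(response_text, model="reverted"):
    """Map each revision ID to the model's probability for the 'true' class."""
    scores = json.loads(response_text)
    return {
        int(rev_id): result[model]["probability"]["true"]
        for rev_id, result in scores.items()
    }


# Copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "4567890": {"reverted": {"prediction": false,
    "probability": {"false": 0.6967103285095867, "true": 0.3032896714904134}}},
  "4567892": {"reverted": {"prediction": false,
    "probability": {"false": 0.5479798477219674, "true": 0.45202015227803266}}}
}
"""

url = build_score_url("enwiki", "reverted", [4567890, 4567892])
probs = revert_probabilities(SAMPLE_RESPONSE)

# List revisions from most to least likely to be reverted.
for rev_id, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
    print(f"{rev_id}: P(reverted) = {p:.3f}")
```

Sorting by the "true" probability is one way a patrolling tool might prioritize revisions for review; any cutoff for flagging an edit would be a choice of the consuming tool, not something the sample response defines.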

In parallel to this development work, we have been building documentation and specification for our next stage of work. For an overview of our documentation, see Research:Revision scoring as a service. For information about our sub-projects, see also:

Finally, we have engaged in public outreach about the project. We presented the project at the Wikimedia Foundation's metrics meeting in January. We also published pieces about the project in the English Wikipedia Signpost (en:Wikipedia:Wikipedia Signpost/2015-02-18/Special report), on the Portuguese Wikipedia technical Village Pump (pt:Wikipédia:Café dos programadores#Serviço de pontuação de edições) and on the Persian Wikipedia. We're currently producing a translation for the Turkish Wikipedia Signpost as well.

Midpoint outcomes

Mockup of revision coder interface

Our primary goal of constructing high-signal damage scorers has been achieved. We were also able to get the scoring service hosted via a web API on Wikimedia Labs, which was one of our end goals. However, we were not able to stand up a revision coder service as planned, so we pushed that goal back to our final deliverables. Despite this setback, we've made substantial progress iterating on designs and discussing the technical considerations.

Finances

We have spent our funds as planned and have not requested or received additional resources.

Learning

We have been following a lightweight Scrum process with weekly sprints, and it has been serving us very well. If anything isn't going well, it's that we sometimes let weekly reports fall behind by a week. Since we use Trello to manage our work, we haven't had any trouble "remembering" what was done in past weeks.

What are the challenges

Time zones are difficult, but we make do very well.
It took a little time to get everyone up to speed and submitting pull requests to the central repositories.

What is working well

Use of Trello for project management
Use of GitHub for version control and pull requests/code review for keeping track of changes.
Grants:Learning patterns/Git repository for software

Next steps and opportunities

We'll be constructing a revision coder service on Labs with which the revision handcoder gadget will communicate.
We'll be constructing new classifiers based on human assessment obtained by means of the revision coder.
We are investigating the possibility of including Farsi in our language library.

Grantee reflection

It's difficult to manage a full-time job and volunteer work on a project like this that has deadlines and documentation expectations. Luckily, it seems that my work on this project has been sanctioned during office hours. Regretfully, none of my other responsibilities have been reduced, so I still end up working on this in the evenings and on weekends. Otherwise, work on this project has been a true joy. :) --EpochFail (talk) 00:53, 12 March 2015 (UTC)
This has been something I had been looking for since 2012, when I first attempted developing AI tools for Wikipedia. Back then I had to buy a spare drive to download something like 550 GB worth of compressed dumps, and I hit my monthly internet limit twice in a row just to be able to download them. I also had data corruption issues, but I was able to recover with the aid of apergos (kudos to her again). This project intends to eliminate the difficulties researchers face when dealing with the massive size of Wikipedia, and it has been most pleasant to work on - not just at an individual level but as a group as well, which makes this a blast for me. -- とある白い猫 ^chi? 12:02, 20 March 2015 (UTC)
It is really cool to work with people with other experiences and backgrounds, and to collaborate on code development for a project like this. I'm having the chance to put into practice the machine learning theory I was learning last semester. Also, code review is providing a unique opportunity for me to become a better Python programmer. Helder 19:14, 20 March 2015 (UTC)