Research talk:Automated classification of article importance/Work log/2017-04-21

Friday, April 21, 2017

Today I will continue the work on identifying active WikiProjects, using the datasets I gathered yesterday as well as an additional dataset on article activity. I will also start developing a pipeline for handling global data.

WikiProjects

I wrote a Python script to gather edit statistics for all importance-rated articles in WikiProjects for which we have data on both their article categories and the edits to their WikiProject pages. There are 1,043 such projects, of which fewer than 50 appear to have a significant amount of activity, although that number might change once we look at article statistics as well. While looking at the statistics, I also noticed that many projects have a large number of articles without importance assessments. That is something we should be able to help with, as we already know that we can do well at the WikiProject level.
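The kind of per-project summary described above could be sketched roughly as follows. This is not the script mentioned in the log, only an illustration: the `ProjectStats` record, its field names, and the `min_edits` threshold of 100 are all assumptions made for the example, and the log does not say how "significant activity" was actually defined.

```python
from dataclasses import dataclass


@dataclass
class ProjectStats:
    """Hypothetical per-project record; the field names are assumptions."""
    name: str
    project_page_edits: int  # edits to the WikiProject's own pages
    rated_articles: int      # articles with an importance rating
    unrated_articles: int    # articles without an importance rating


def is_active(stats, min_edits=100):
    """Flag a project as active; the threshold is an arbitrary assumption."""
    return stats.project_page_edits >= min_edits


def unrated_share(stats):
    """Fraction of a project's articles that lack an importance rating."""
    total = stats.rated_articles + stats.unrated_articles
    return stats.unrated_articles / total if total else 0.0


def summarize(projects, min_edits=100):
    """Return the names of active projects and a backlog ranking.

    The backlog lists (name, unrated share) pairs, largest share first.
    """
    active = [p.name for p in projects if is_active(p, min_edits)]
    ranked = sorted(projects, key=unrated_share, reverse=True)
    backlog = [(p.name, round(unrated_share(p), 2)) for p in ranked]
    return active, backlog
```

Calling `summarize` on a list of such records gives both the set of projects that pass the activity threshold and a ranking of where unrated articles are concentrated, which is where automated importance predictions would plausibly help most.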
