Research talk:Automated classification of article importance/Work log/2017-02-22

Thursday, February 23, 2017

Did some debugging of the category assessment script and implemented a check for whether pages exist, so we're not submitting invalid revision IDs to ORES. WikiProject Medicine now has a page listing Start-class articles that are good candidates for reassessment; I'm looking forward to hearing about their experiences with it.
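A minimal sketch of what such an existence check could look like, not the actual script. It assumes the page data comes from a MediaWiki `action=query&prop=info` response in the default JSON format, where nonexistent pages carry a `missing` key and existing pages carry `lastrevid`. The function name `existing_revision_ids` and the sample data are hypothetical.

```python
def existing_revision_ids(response):
    """Map page title -> latest revision ID, skipping pages that do not exist.

    `response` is assumed to be the parsed JSON of a MediaWiki
    action=query&prop=info request (default format, pages keyed by ID).
    """
    pages = response.get("query", {}).get("pages", {})
    revids = {}
    for page in pages.values():
        # Nonexistent or invalid titles are flagged by the API; skip them
        # so no bogus revision ID is passed on for scoring.
        if "missing" in page or "invalid" in page:
            continue
        revid = page.get("lastrevid")
        if revid:
            revids[page["title"]] = revid
    return revids


# Hypothetical sample response: one existing page, one missing page.
sample = {
    "query": {
        "pages": {
            "1234": {"pageid": 1234, "title": "Aspirin", "lastrevid": 765432100},
            "-1": {"title": "No such article", "missing": ""},
        }
    }
}

safe_to_score = existing_revision_ids(sample)
```

Only the revision IDs in `safe_to_score` would then be sent on for scoring; the missing page is dropped before any request is made.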

Created a Quarry query for the number of articles (technically talk pages) in each of the importance categories. Here are the results. It looks like we'll be processing a few million pages, but since everything is keyed on page IDs, it should be straightforward to batch-process them much as we do for SuggestBot's link queries. Given the results of the query, I'm also cautiously optimistic that a dataset of pages with unanimous importance ratings from multiple WikiProjects will be sufficiently large.
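The batching idea can be sketched roughly as follows. This is an illustration only, not SuggestBot's code: the helper name `batches` and the batch size of 50 are assumptions chosen for the example.

```python
from itertools import islice


def batches(page_ids, size=50):
    """Yield successive lists of at most `size` page IDs.

    Works on any iterable, so a few million IDs can be streamed from a
    file or database cursor without loading them all into memory.
    """
    iterator = iter(page_ids)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Small demonstration with a batch size of 3 instead of the default.
demo = list(batches(range(1, 8), size=3))
# demo == [[1, 2, 3], [4, 5, 6], [7]]
```

Each batch could then be turned into a single query (for example an `IN (...)` clause or a pipe-separated ID list), which keeps the number of round trips manageable.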

I'm not satisfied with the definition of reputation/authoritativeness in the literature review, and have started digging through some information science literature to see if it provides any solid definitions. Haven't found anything conclusive yet. Might also have to pull in the Stanford Encyclopedia of Philosophy with regard to authority in the political sense, as that seems relevant here.
