Research talk:Automated classification of article importance/Work log/2017-06-08

Thursday, June 8, 2017

Today I'm looking into handling redirects in view data.

Redirects

It turns out that the view data does not account for redirects. We would therefore like an updated dataset that includes views through redirects, perhaps as a separate column.

The first question is to what extent redirects that originate and terminate in the main namespace point to pages that are themselves redirects. I wrote a couple of SQL queries for this. The queries are fairly slow because of the join needed to restrict them to redirects where both the source and the target are in the main namespace.
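The queries themselves are not recorded in this log. As a rough sketch of the kind of count involved, the Python snippet below runs a comparable query against a tiny in-memory SQLite database. It assumes simplified versions of MediaWiki's `page` and `redirect` tables with only the columns needed here. The toy rows and the helper names `build_toy_db` and `count_redirects` are made up for illustration, and the real queries against the production replica may well have looked different.

```python
import sqlite3

# Simplified stand-ins for MediaWiki's `page` and `redirect` tables
# (assumption: only the columns needed for this count are modelled).
SCHEMA = """
CREATE TABLE page (
    page_id INTEGER PRIMARY KEY,
    page_namespace INTEGER NOT NULL,
    page_title TEXT NOT NULL,
    page_is_redirect INTEGER NOT NULL
);
CREATE TABLE redirect (
    rd_from INTEGER PRIMARY KEY,
    rd_namespace INTEGER NOT NULL,
    rd_title TEXT NOT NULL
);
"""

# Count redirects whose source and target are both in namespace 0,
# and how many of those targets are themselves redirects.
COUNT_QUERY = """
SELECT COUNT(*) AS total,
       COALESCE(SUM(target.page_is_redirect), 0) AS to_redirect
FROM redirect
JOIN page AS source ON source.page_id = redirect.rd_from
JOIN page AS target ON target.page_namespace = redirect.rd_namespace
                   AND target.page_title = redirect.rd_title
WHERE source.page_namespace = 0
  AND redirect.rd_namespace = 0
"""


def build_toy_db():
    """Create an in-memory database with a handful of hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?, ?)",
        [
            (1, 0, "Article_A", 0),
            (2, 0, "Redirect_to_A", 1),
            (3, 0, "Redirect_to_redirect", 1),
            (4, 1, "Talk_redirect", 1),
            (5, 0, "Redirect_to_talk", 1),
            (6, 1, "Article_A", 0),
        ],
    )
    conn.executemany(
        "INSERT INTO redirect VALUES (?, ?, ?)",
        [
            (2, 0, "Article_A"),      # main -> main, target is an article
            (3, 0, "Redirect_to_A"),  # main -> main, target is a redirect
            (4, 0, "Article_A"),      # excluded: source is outside namespace 0
            (5, 1, "Article_A"),      # excluded: target is outside namespace 0
        ],
    )
    return conn


def count_redirects(conn):
    """Return (main-to-main redirects, those whose target is a redirect)."""
    total, to_redirect = conn.execute(COUNT_QUERY).fetchone()
    return total, to_redirect


if __name__ == "__main__":
    print(count_redirects(build_toy_db()))
```

The double join on `page` (once for the source, once for the target) is the part that would be expensive at the scale of millions of redirects, which fits the slowness noted above.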

As of today, there are 7,811,171 redirects that originate and terminate in the main namespace. Only 208 (or 0.00266%) of these lead to another redirect. Given how rare these multiple redirects are, it appears unnecessary to write any software to handle them.
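The share quoted above can be checked with a couple of lines of Python. The function name `share_percent` is just an illustrative helper, and the two counts are the ones reported in this entry.

```python
def share_percent(part, total):
    """Return `part` as a percentage of `total`."""
    return 100 * part / total


# Counts reported in this log entry.
total_redirects = 7_811_171
double_redirects = 208

print(f"{share_percent(double_redirects, total_redirects):.5f}%")
```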
