Research:One Link, Two Links, Red Links, Blue Links

Duration:  2011-07 – 2011-07
Open access project: no URL provided
Open data project: no URL provided
This page documents a completed research project.

This sprint is an analysis of blue and red links, that is, links to pages which do and do not exist, respectively. The main area of interest is links from articles in the main namespace to articles in the main namespace. It also asks what the ratio of red to blue links is, and whether this ratio has been changing over time.

Topic

Process

See Research:Query Library#Top links, red and blue. NOTE: This query is out of date and was not the one used to generate this dataset.
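As a rough illustration of the kind of classification involved (not the query that generated this dataset), the sketch below counts blue and red links between main-namespace pages. It assumes the 2011-era MediaWiki column names (`page_id`, `page_namespace`, `page_title`, `pl_from`, `pl_namespace`, `pl_title`) and runs against a tiny made-up SQLite database. The sample rows and the `count_links` helper are hypothetical.

```python
import sqlite3

# Toy stand-ins for MediaWiki's page and pagelinks tables.
# Column names are assumed from the 2011-era schema; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
CREATE TABLE pagelinks (pl_from INTEGER, pl_namespace INTEGER, pl_title TEXT);
""")
conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
    (1, 0, "Apple"),
    (2, 0, "Banana"),
    (3, 1, "Apple"),  # talk page, outside the main namespace
])
conn.executemany("INSERT INTO pagelinks VALUES (?, ?, ?)", [
    (1, 0, "Banana"),  # blue: target exists
    (1, 0, "Cherry"),  # red: no such page
    (2, 0, "Apple"),   # blue
    (2, 0, "Cherry"),  # red
    (3, 0, "Banana"),  # ignored: source is not in namespace 0
    (2, 1, "Apple"),   # ignored: target is not in namespace 0
])

# A link is blue when a page row matches its target namespace and title,
# and red when the LEFT JOIN finds no such row.
LINK_COUNTS_SQL = """
SELECT
    SUM(CASE WHEN target.page_id IS NOT NULL THEN 1 ELSE 0 END) AS blue,
    SUM(CASE WHEN target.page_id IS NULL THEN 1 ELSE 0 END) AS red
FROM pagelinks AS pl
JOIN page AS source ON source.page_id = pl.pl_from
LEFT JOIN page AS target
       ON target.page_namespace = pl.pl_namespace
      AND target.page_title = pl.pl_title
WHERE source.page_namespace = 0
  AND pl.pl_namespace = 0
"""

def count_links(connection):
    """Return (blue, red) link totals for main-namespace to main-namespace links."""
    blue, red = connection.execute(LINK_COUNTS_SQL).fetchone()
    return blue, red

blue_links, red_links = count_links(conn)
print(blue_links, red_links)
```

On the full enwiki tables a query of this shape would be expensive, so the real analysis may well have been structured differently.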

Results and discussion

| | Total blue links | Total red links | Percentage of red links | # of unique blue links | # of unique red links |
|---|---|---|---|---|---|
| Nov 2009 | 178,001,691 | 13,517,770 | 7.058% | 4,561,409 | 4,882,292 |
| Jul 2011 | 234,475,161 | 17,476,973 | 6.937% | 5,588,754 | 5,696,942 |
| Difference | 56,473,470 | 3,959,203 | -0.12% | 1,027,345 | 814,650 |
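The percentages in the table above appear to be red links as a share of all links (red plus blue), and the difference in that column is a change in percentage points. A minimal sketch that recomputes them from the reported totals follows. The helper name `red_share` is illustrative only.

```python
def red_share(blue, red):
    """Percentage of all links (blue + red) that are red."""
    return 100.0 * red / (blue + red)

# Totals copied from the table above.
nov_2009 = {"blue": 178_001_691, "red": 13_517_770}
jul_2011 = {"blue": 234_475_161, "red": 17_476_973}

share_2009 = red_share(**nov_2009)
share_2011 = red_share(**jul_2011)

# Absolute growth in link totals between the two snapshots.
blue_growth = jul_2011["blue"] - nov_2009["blue"]
red_growth = jul_2011["red"] - nov_2009["red"]

print(f"Nov 2009: {share_2009:.3f}%  Jul 2011: {share_2011:.3f}%")
print(f"Blue growth: {blue_growth:,}  Red growth: {red_growth:,}")
print(f"Change in red share: {share_2011 - share_2009:.2f} percentage points")
```

Although the number of red links grew by almost four million, the number of blue links grew faster, so the red share fell slightly.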
| Number of incoming links | Number of non-existent articles |
|---|---|
| at least 1 | 5,696,942 |
| > 1 | 1,434,126 |
| > 3 | 517,189 |
| > 5 | 354,727 |
| > 10 | 234,959 |
| > 20 | 141,958 |
| > 30 | 95,639 |
| > 50 | 46,906 |
| > 75 | 24,188 |
| > 100 | 14,927 |
| > 150 | 7,994 |
| > 200 | 4,856 |
| > 300 | 1,951 |
| > 400 | 1,171 |
| > 500 | 611 |
| > 1000 | 12 |
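A threshold table like the one above can be derived from a list of red-link targets by counting incoming links per title and then counting how many titles exceed each threshold. The sketch below uses a made-up list of targets, and the function name `titles_above_thresholds` is hypothetical. The "at least 1" row corresponds to a threshold of 0.

```python
from collections import Counter

def titles_above_thresholds(red_link_targets, thresholds):
    """Map each threshold t to the number of titles with more than t incoming links."""
    incoming = Counter(red_link_targets)  # title -> number of incoming links
    return {t: sum(1 for n in incoming.values() if n > t) for t in thresholds}

# Toy data: one entry per red link, naming the non-existent target title.
toy_targets = ["Cherry"] * 4 + ["Durian"] * 2 + ["Elderberry"]

result = titles_above_thresholds(toy_targets, [0, 1, 3, 5])
for threshold, count in result.items():
    print(f"> {threshold}: {count}")
```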
Plot of the number of blue links to enwiki articles.
Plot of the number of red links to non-existent enwiki articles.

Top Links

25 June 2011

The following tables are from an analysis of the current pagelinks table on 25 June 2011.

Top 100 Blue Links

Top 100 Red Links

03 November 2009

The following tables are from an analysis of the pagelinks and page tables dumped on 3 November 2009 and obtained from Wikipedia's database dumps.

Top 100 Blue Links

Top 100 Red Links

Links to countries and states

Note: These tables of incoming link totals provide a reasonable estimate of coverage in the encyclopedia. However, due to various conventions and formats, certain entities may rank higher in terms of incoming links than others in their category. For example, some demonyms redirect to the article about the country, while others link to an article about the people of that country. So the phrase "they met a South African man" in an article about a film that otherwise had nothing to do with the country would count as a link to the country, while the phrase "they met an Angolan man" would not. Furthermore, this assumes that countries are always linked by their proper names, and not by shorthand names, such as Congo for the Republic of the Congo and/or the Democratic Republic of the Congo; Macedonia for Republic of Macedonia; and the various names China, Taiwan, the People's Republic of China, and the Republic of China.
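To make the demonym effect concrete, the sketch below folds link counts for redirect titles into their targets, which is one way such counts could end up attributed to a country article. All titles, counts, and the redirect map are invented toy values, `fold_redirects` is a hypothetical helper, and only single-hop redirects are handled.

```python
from collections import Counter

def fold_redirects(link_counts, redirects):
    """Add each redirect title's incoming links to its target's total."""
    folded = Counter()
    for title, count in link_counts.items():
        # Titles that are not redirects map to themselves.
        folded[redirects.get(title, title)] += count
    return folded

# Toy incoming-link counts per title (not real data).
toy_counts = {
    "South Africa": 10,
    "South African": 3,   # assumed here to redirect to the country article
    "Angola": 8,
    "Angolan people": 2,  # assumed here to be a separate article about the people
}
toy_redirects = {"South African": "South Africa"}

folded = fold_redirects(toy_counts, toy_redirects)
print(folded.most_common())
```

In this toy case the demonym links raise the total for "South Africa" but leave "Angola" unchanged, which is the kind of skew the note describes.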

Countries

U.S. States

Future work