Grants:Simple/Applications/Wikimedia Espana/2018/National Geographic Institute


National Geographic Institute
(Program story)


Introduction

 
Headquarters of the NGI in Madrid

In August 2015 we contacted the National Geographic Institute of Spain because parts of its website indicated that the content was completely free and unlicensed. This opened up the possibility of releasing a series of materials on Commons, such as municipal boundary acts.

The institute responded with interest, and we set up a meeting to discuss it. At the meeting they explained that they held two kinds of material: unlicensed, freely usable material, and more modern material under a license that did not allow commercial use.

The first group included the boundary-line notebooks and the databases of geodetic vertices and nomenclature, all of which could be used in Wikimedia projects. The material licensed for non-commercial use, and therefore incompatible with Wikimedia, included planimetries, orthophotos, and old maps and plans. This second group was of particular interest to Wikipedia.

In November 2016 we contacted the institute again about the new license they had posted on their website. In a meeting we explained the problems and legal loopholes it posed; they replied that it was not yet definitive and that the final text of the license was still at the draft stage.

Our objective was the mass upload of a series of materials, such as topographic maps, aerial photographs and old plans. At the beginning of 2017 they sent us the draft of the final license text, and we verified that, as drafted, it was already compatible with the licenses accepted on Commons. We therefore waited for its official publication.

Development

 
One of the MTN25 maps uploaded

At the end of 2017 the new license was published, under CC BY 4.0, which allowed the materials to be uploaded to Commons. From that moment we began working on the upload process.

Of all the possible materials, we decided that the first packages would be topographic maps, provincial and regional maps, aerial photography and old maps. At the same time, the WMF proposed that this upload, given the characteristics of its content, serve as a Structured Data pilot project.

We spoke with both Alex Stinson and Sandra Fauconnier, and we accepted the proposal. The first upload would be the MTN25 topographic maps and would be done in the traditional style, while the rest of the packages, in 2019, would be done with Structured Data.

First of all we requested access to a virtual machine in Wikimedia Labs, from which we would work. Downloading and decompressing the files, and deleting those in formats we were not interested in, took several weeks; once that was done, we generated a CSV listing the files we had.
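The inventory step described above can be sketched roughly as follows. This is a minimal illustration, not the script we actually ran: the function name `build_inventory`, the list of kept extensions and the CSV column names are all assumptions made for the example.

```python
import csv
from pathlib import Path

# Assumption: the formats worth keeping; the real list is not stated in this report.
KEEP_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff"}


def build_inventory(root, csv_path, keep=KEEP_EXTENSIONS):
    """Walk `root`, keep files with a wanted extension, and write a CSV listing them."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        # Skip directories and files in formats we are not interested in.
        if path.is_file() and path.suffix.lower() in keep:
            rows.append(
                {
                    "filename": path.name,
                    "relative_path": path.relative_to(root).as_posix(),
                    "size_bytes": path.stat().st_size,
                }
            )
    # Hypothetical column names; one row per kept file.
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(
            fh, fieldnames=["filename", "relative_path", "size_bytes"]
        )
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A CSV of this shape can be opened directly as an OpenRefine project, so each row can later be enriched with further columns.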

Afterwards, we used OpenRefine to reconcile as many rows as we could with Wikidata and settled on the template we would use ({{map}}), in which we included all the data. We then uploaded the files with Pywikibot, a little more than 9000 in this first round.

Each file carries the following information: title, description, authorship, date, source, license and attribution, location of the map, scale, sheet number, set of maps to which it belongs, editor and language, among others. In addition, the way this information is presented followed a suggestion by Sandra Fauconnier, with a view to a future conversion into Structured Data.
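As a rough illustration of how one CSV row could be turned into the description text of a file page, the sketch below renders a row as a {{map}} template call. The function name `render_map_wikitext`, the parameter names and the sample values are hypothetical placeholders; they are not the actual parameters of the Commons template nor the data of any real sheet.

```python
def render_map_wikitext(row, template="map"):
    """Render a dict of field -> value as a wikitext template call.

    Empty fields are skipped so the template is not cluttered with blanks.
    Values are assumed not to contain '|' or '}}'; real data would need escaping.
    """
    lines = ["{{" + template]
    for key, value in row.items():
        value = (value or "").strip()
        if value:
            lines.append(f" |{key} = {value}")
    lines.append("}}")
    return "\n".join(lines)


# Placeholder row with made-up values, for illustration only.
sample_row = {
    "title": "Example sheet",
    "scale": "25000",
    "language": "",
}
print(render_map_wikitext(sample_row))
```

Keeping one template parameter per CSV column means each value stays individually addressable, which is what makes a later conversion into Structured Data easier.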