GLAM-WIKI 2009 report

(for the event itself see https://wikimedia.org.au/wiki/GLAM-WIKI and for the recommendations see GLAM-WIKI Recommendations)

Tim Starling on the GLAM-WIKI 2009 conference, from foundation-l:

I thought I'd better write up a report about the conference I went to last week, to justify the time I spent there. I'll give some general observations followed by some technical ones.

GLAM-WIKI was a two-day conference billed as a meeting between Australia's GLAM sector (galleries, libraries, archives and museums) and Wikimedians. GLAM representatives outnumbered Wikimedians, but we had enough people there to make sure our point of view was heard both inside and outside of the formal program. Many of the talks were from people in the GLAM sector who were already converted to our way of thinking, and who endeavoured to convert the rest of the GLAM audience by speaking in their language.

The GLAM representatives were generally very receptive. When dissenting questions came up, they were often answered in our favour by another GLAM representative. I asked one of the delegates about this favourable mood, and he said that the delegates were generally self-selected people who had a favourable opinion of Wikimedia and free content, and that the skeptics did not attend. However, the discussions had at the conference would provide valuable ammunition against those skeptics back in the office.

As far as I know, only one speaker expressed a completely contrary opinion to the general mood of the conference, and that was Ian MacDonald of the Australian Copyright Council. He said, in essence, that institutions need to prevent reuse or modification of the content they hold in order to preserve its purity, which risks being sullied by the cumulative distortions of the general public. This was passionately countered by Jessica Coates during question time, with some success judging by nearby whisperings. MacDonald also warned the audience about evil Wikimedians like the one who "hacked into" the NPG (UK) website and stole a million pounds worth of images. The factual errors in this statement were briefly addressed during question time.

I tried to get a feeling for what sort of hard drive capacity we would need if the institutions in the room decided they wanted to share large amounts of content with us. Many of them have tens or hundreds of terabytes of data storage, on tape and hard drives. However, the bulk of this is in restoration-quality images (e.g. TIFFs tens of thousands of pixels wide), which they would not be willing to share with us even if we wanted them. Liam Wyatt proposed, as a business model or a compromise with management, sharing images 1000-2000 pixels wide and charging a fee for access to the full-resolution images. That seems like the most likely arrangement, and if so, it wouldn't require a significant change to our current capacity planning for file storage.

A GLAM delegate expressed an opinion in question time that they would be reluctant to have us mirror their collection, since they've spent a large amount of money setting up their data storage, so mirroring would seem like a waste. Brianna Laugher was receptive to the idea of having Wikimedia projects hotlink or cache images from galleries. I kept quiet, and the significant technical challenges with that approach were not discussed.

There is a need for bulk upload tools to be better advertised and more readily accessible. One of the institutions reported paying students to upload hundreds of photos to Commons via the usual web-based UI, but found it too time-consuming and expensive to consider on a large scale.

Special:BookSources came up a couple of times. The libraries would love to see software improvements, such as geolocation giving the ability to present the nearest few libraries at the top of the page, without the user having to click on the world map. Liam mentioned the geolocation projects based on detecting nearby 802.11 access points. I think MaxMind's GeoIP City would be a better starting point for software development.
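To make the "nearest few libraries" idea concrete, here is a minimal sketch in Python. It assumes the user's latitude and longitude have already been obtained from some IP geolocation lookup (for example a GeoIP City style database); that lookup is not shown. The function names, the LIBRARIES list and its approximate coordinates are hypothetical and only for illustration; nothing here reflects how Special:BookSources is actually implemented.

```python
import math

# Hypothetical sample data: names with approximate coordinates, for illustration only.
LIBRARIES = [
    {"name": "National Library of Australia", "lat": -35.2965, "lon": 149.1295},
    {"name": "State Library of New South Wales", "lat": -33.8661, "lon": 151.2131},
    {"name": "State Library of Victoria", "lat": -37.8098, "lon": 144.9652},
    {"name": "State Library of Queensland", "lat": -27.4711, "lon": 153.0181},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_libraries(user_lat, user_lon, libraries, limit=3):
    """Return the `limit` libraries closest to the user's estimated position."""
    by_distance = sorted(
        libraries,
        key=lambda lib: haversine_km(user_lat, user_lon, lib["lat"], lib["lon"]),
    )
    return by_distance[:limit]


if __name__ == "__main__":
    # Assumed position for a user geolocated to central Sydney.
    for lib in nearest_libraries(-33.87, 151.21, LIBRARIES, limit=2):
        print(lib["name"])
```

IP-based geolocation is usually only accurate to the city level, so a list like this would be best used to reorder the page rather than to replace the full set of links or the world map.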

Delegates from the National Library of Australia reported that they have an ongoing project to collate collection metadata from all libraries in Australia. It may be possible to replicate this data to Wikimedia servers, or otherwise make it available. This would enable a feature whereby the user is told which libraries have the book being searched for, in the requested edition or a different edition. It may even be possible to report whether the book is on the shelf or not.