This community consultation aims to determine the community's stance on the recent surge of open data and what its relationship with our projects should be, and to gather proposals for handling these open data sets more efficiently, in order to benefit editors and readers and to enable potential partnerships.
Thanks to initiatives like the one led by the Open Knowledge Foundation, more and more governments are publishing their data on platforms like CKAN. Scientists in the Open Access movement are also releasing their data, some of which may be relevant to the mission of the Wikimedia movement. How should we adapt to this new situation? How does it affect our mission?
Wikidata was created as a collaboratively edited knowledge base, which means that its community has to be selective when integrating data into its structure, for several reasons:
Information overflow: on Wikidata, contributors and data coexist to create semantic structures that can be queried in several ways. Adding too much data might make data curation impossible (think of copy-pasting text dumps into Wikipedia).
Structure of the project: Wikidata is conceived so that each relevant piece of information is stored in the item it describes (a city item stores its population). This is great for adding contextualized data, but awkward when the data is only relevant to a specific graph.
Software limitations: on Wikidata each field is supposed to be editable. That requires formatters and widgets, which would render the pages unusable when the number of elements is in the range of hundreds or thousands. Datasets do not need to be edited, just versioned as a whole, like files in Commons.
Even if Wikidata is not the right place, the data structure of Wikibase (the software powering Wikidata) should be taken into account.
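The contrast drawn above, between values stored on the item they describe and a dataset that is only versioned as a whole, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `VersionedDataset` class, its methods, the item keys, and the figures are hypothetical and invented, and none of it corresponds to an existing Wikibase or Commons API.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Item-centric storage (Wikidata-like): each value lives on the item it
# describes and is edited individually. Keys and numbers are made up.
items = {
    "city_a": {"population": 1000},
    "city_b": {"population": 2500},
}


@dataclass
class VersionedDataset:
    """Hypothetical dataset stored and versioned as a whole, like a file."""

    name: str
    revisions: list = field(default_factory=list)

    def upload(self, rows):
        """Store a complete new revision and return its revision number."""
        # Serialize deterministically so identical content gives the same digest.
        blob = json.dumps(rows, sort_keys=True)
        digest = hashlib.sha256(blob.encode("utf-8")).hexdigest()
        self.revisions.append({"digest": digest, "rows": rows})
        return len(self.revisions)

    def latest(self):
        """Return the rows of the most recent revision."""
        return self.revisions[-1]["rows"]


# Usage: individual cells are never edited; a new upload replaces the
# whole dataset while earlier revisions remain available.
spending = VersionedDataset("town_hall_spending")
spending.upload([{"year": 2013, "amount": 10}])
spending.upload([{"year": 2013, "amount": 10}, {"year": 2014, "amount": 12}])
```

In this sketch no per-field editing widgets are needed, because the unit of change is the whole dataset rather than a single value.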
Together with the Open Knowledge Foundation Austria and the Cooperation Open Government Data Austria, Wikimedia Austria is currently establishing an open data portal that will host non-governmental open data in Austria. Open data is an important part of free knowledge and often serves as a valuable resource for Wikimedia projects. Some of this data will be incorporated into Wikidata, but in other cases that won’t be possible, for the reasons described above. However, it might still be interesting to use some datasets for visualization purposes (see the Wikimania '14 submission).
Recently Amical Wikimedia was approached by OCMunicipal, a citizen organization whose aim is to monitor town hall spending. To achieve this, they engage citizens and journalists to process the data. Once the data is processed and standardized, it becomes more useful and better aligned with our mission of disseminating knowledge. It would also help their mission if potential partners and contributors knew that their work can be reused in Wikipedia.
The Analytics team at the Wikimedia Foundation generates a great deal of valuable information that has to be formatted in cumbersome wikitext in order to be published. Their work would be much easier if a tool to manage datasets were available.
As outlined in a previous proposal (see DataNamespace), there is a great deal of information in Wikipedia pages that cannot be shared across language editions. It is also hard to update when new data becomes available and, as explained above, it does not always fall within Wikidata's scope.