Wikibase Ecosystem

Purpose

Wikibase makes it easier than ever before to create, connect, and grow a collaborative linked knowledge base. We are enabling the co-creation of the world’s largest knowledge graph of free and open data which will be used to create new knowledge for the world.

Guiding principles

Openness: The Wikibase Ecosystem is free and open first. We prioritize enabling the release of more free and open knowledge into the world (e.g., CC licensed data and publicly accessible instances) over optimizing Wikibase for closed-system use cases. We will therefore invest the most resources in supporting projects that are committed to allowing the broader public to access (i.e., read and query) their Wikibases, and/or those that contain data with a Creative Commons (or compatible) license. Although there are certainly interesting and inspiring use cases for Wikibase software that benefit the members of specific organizations, these projects should not be our focus when it comes to deciding the direction of Wikibase products. Supporting projects that do not directly result in the release of more Linked Open Data into the world is not contributing to the vision of the Wikibase Ecosystem, nor is it as impactful to the Movement.

Sustainability: The sustainability of the Wikibase Ecosystem depends on a variety of factors that must play well together: an engaged user base maintaining diverse and high-quality data, a healthy support network of service providers, a community of developers producing complementary software and tools, and ways to connect data across institutional boundaries. We prioritize the sustainable development of a holistic Wikibase Ecosystem over, for example, rapid scaling of any one aspect simply for the sake of growth.

Knowledge equity & Co-creation: Individuals, institutions, research teams, and niche projects around the world will be able to directly shape the decentralized knowledge graph by contributing to a diverse range of Wikibase projects. Wikidata is one very important hub in the Wikibase Ecosystem with its own thriving community of contributors that are building and maintaining a broad range of general-purpose knowledge. For the knowledge graph to expand to include many more well-maintained knowledge hubs, we must ensure that the caretakers of a wide variety of specialized, non-CC0 data have the core technology (e.g., products within the Wikibase product portfolio) and documentation that they need in order to contribute.

Utility: Within the Wikibase Ecosystem, interconnectedness is a necessary element of data utility. We must ensure different Wikibase instances (including Wikidata) can be linked with each other while also empowering others to form communities of practice around the interconnected hubs in the knowledge graph. This will ensure that Wikibase breaks down knowledge silos and results in the synthesis of new knowledge. Ultimately, there are two aspects to this effort. One is providing a technical platform that will transform isolated Wikibase instances into a web of linked knowledge hubs. The second is employing our international reach and the broad view of the Wikibase Ecosystem to ensure that others (e.g., institutions, researchers, and knowledge holders) are empowered to connect with one another so that they can form their own communities around Wikibase projects they care about -- for example, by introducing similar projects to one another so that they can support each other as they explore the software.

Background info

Wikibase is an open-source software suite for running collaborative knowledge bases, opening the door to the Linked Open Data web. As its central units, Wikibase has Items and Properties which are used to describe concepts and which can be extended to other domains like language and media files. Wikibase is well-suited to heterogeneous data rather than data that fits easily into a tabular dataset (e.g., describing a city with statements about its mayor, number of inhabitants, country, etc., rather than a collection of temperature measurements for a city in one-hour intervals across 100 years). Wikidata is the most prominent example of a Wikibase instance. More info on Wikibase can be found on its website.
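The Item and Property structure described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the actual Wikibase data model, which is richer (it also covers qualifiers, references, ranks, and multilingual labels). The class names, the helper methods, and all IDs and values are hypothetical and chosen only to mirror the city example.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass(frozen=True)
class Property:
    """A reusable descriptor such as 'mayor' or 'country' (IDs are made up)."""
    id: str
    label: str


@dataclass
class Item:
    """A concept described by a list of statements."""
    id: str
    label: str
    statements: list["Statement"] = field(default_factory=list)

    def add(self, prop: Property, value: "Value") -> None:
        # Attach one more statement to this Item.
        self.statements.append(Statement(prop, value))

    def values_of(self, prop: Property) -> list["Value"]:
        # Collect every value stated for the given Property.
        return [s.value for s in self.statements if s.prop == prop]


# A value may be another Item (a link), a number, or a string.
Value = Union[Item, int, str]


@dataclass
class Statement:
    """One Property/value pair attached to an Item."""
    prop: Property
    value: Value


# Hypothetical data mirroring the city example in the text.
mayor = Property("P1", "mayor")
population = Property("P2", "number of inhabitants")
country = Property("P3", "country")

germany = Item("Q1", "Germany")
jane = Item("Q2", "Jane Doe")
city = Item("Q3", "Example City")

city.add(mayor, jane)            # value is another Item
city.add(population, 120000)     # value is a plain number
city.add(country, germany)       # value is another Item
```

Each statement can carry a different kind of value, which is what makes this shape a better fit for heterogeneous descriptions than for a uniform series of measurements.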

The Wikibase Ecosystem describes the sum of Wikibase instances and the technical and social infrastructure surrounding them, from a thriving community of volunteers and end users to developers producing complementary software and tools to a network of service providers. The Wikibase Ecosystem is part of the Linked Open Data web. While some interlinking of Wikibases is already possible (e.g., federated querying with SPARQL, or using Wikidata’s properties on your own Wikibase instance), deeper connections that would allow data to be shared across multiple Wikibases are yet to come.
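As a rough illustration of the federated querying mentioned above, the sketch below builds a SPARQL 1.1 query that would run on a local Wikibase's query service and fetch English labels from Wikidata through a SERVICE clause. It rests on several assumptions: the local property IRI is hypothetical, the local Wikibase is assumed to store links to Wikidata entities as IRIs under that property, and the local query service is assumed to permit federation to the Wikidata endpoint. The function name is invented for this example.

```python
# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_federated_query(local_prop_iri: str, limit: int = 10) -> str:
    """Return a SPARQL query joining local items with labels from Wikidata.

    local_prop_iri is a hypothetical property on the local Wikibase whose
    values are assumed to be Wikidata entity IRIs.
    """
    return f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?localItem ?wikidataItem ?label WHERE {{
  ?localItem <{local_prop_iri}> ?wikidataItem .
  SERVICE <{WIKIDATA_ENDPOINT}> {{
    ?wikidataItem rdfs:label ?label .
    FILTER(LANG(?label) = "en")
  }}
}}
LIMIT {limit}
""".strip()


# Example with a made-up local property IRI.
query = build_federated_query("https://wikibase.example/prop/direct/P1", limit=5)
print(query)
```

The resulting string would be submitted to the local instance's own SPARQL endpoint. The part inside SERVICE is the only portion evaluated by Wikidata.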

In 2018, Wikimedia Deutschland began to actively foster the sustainable growth of the Wikibase Ecosystem in order to better support people and institutions in freeing and opening up their data through Wikibase instances. Since then, the WikiLibrary Manifesto was created in a collaboration between the German National Library and Wikimedia Deutschland. This manifesto has since been signed by more than 40 libraries, among them the Finnish National Library and the International Federation of Library Associations and Institutions (IFLA). Recent years have seen the Wikibase Ecosystem continue to grow to include new communities of practice such as the Wikibase Stakeholder Group and a diverse array of institutions that have implemented or piloted Wikibase.

The following table is a list of example Wikibase projects.

Wikibase projects

Project Name | Initiator | Description
Integrated Authority File/Gemeinsame Normdatei (GND) (see also the blogpost about this project) | German National Library/Deutsche Nationalbibliothek (DNB) | The objective of the GND is to form a modern, web-compatible authority file for the German-speaking world.
Artbase | Rhizome | Rhizome’s Artbase is an archive of born-digital art. It was the first Wikibase instance outside of Wikimedia projects.
French Authority File & NOEMI | National Library of France / Bibliothèque nationale de France (BnF) & Abes | The National Library of France has explored Wikibase for its authority file through the FNE and NOEMI cataloguing pilots. NOEMI seeks to replace a 20-year-old software with Wikibase for metadata production.
EU Knowledge Graph (Kohesio) | EU Commission | Contains structured data about different topics such as EU countries, institutions, projects and beneficiaries of EU funding. The EU Knowledge Graph constitutes a data repository for the Kohesio website where users can search for projects that were funded by the EU Commission. Kohesio was created by the EU in an effort to be more transparent about funded projects and to centralize scattered information about this topic.
Factgrid | Gotha Research Centre / University of Erfurt | Originally started as a project to document the history and connections between members of the Illuminati, Factgrid is now a collaborative database used by researchers in the digital humanities.
Enslaved.org (see also the blogpost on Wikibase and Enslaved.org) | Michigan State University & University of Maryland | Enslaved is a linked open data platform containing more than 750k records (People, Events, Places, and Sources) related to the historical slave trade. Using Wikibase, they can interlink several long-standing legacy databases and preserve the data for future generations.
Black Bibliography Project | Yale & Rutgers Universities | The Black Bibliography Project (BBP) aims to revive the practice of descriptive bibliography for African-American literary studies.
WikiFCD | Volunteer-run | WikiFCD is a database of food composition and nutrient information that anyone can edit. They hope to reduce the time that epidemiologists, nutritionists and other researchers spend searching for food composition data.

Strategies

  • Empower knowledge curators to share their data: Increase the number and diversity of Wikibases that can eventually be connected to the LOD web.
  • Ecosystem enablement: Enable an ecosystem of extensions, tools, and custom interfaces based on Wikibase APIs to emerge around Wikibase, extending the functionality of the software for more use cases.
  • Connect data across technological & institutional boundaries: Ensure Wikibases can connect more deeply with each other and Wikidata to form an LOD web.

Empower knowledge curators to share their data

Work along this strategic thread will increase the number and diversity of Wikibases that can eventually be connected to the Linked Open Data web. We must ensure that more projects are able to independently onboard themselves into the Wikibase Ecosystem by reducing the complexity of software setup, data importing, and maintenance. This will make it possible for institutions with fewer resources to engage with Wikibase.

Why focus here?

  • To support knowledge equity, we need to ensure that the Wikibase Ecosystem does not consist only of the wealthiest, most well-funded knowledge curators.
  • It takes significant time and effort to get a Wikibase project running smoothly. Projects need to import and model their data, develop workflows, build up communities, secure project funding, and so on. Today, the complexity of Wikibase software setup and maintenance is an added burden that can slow this process for those with less technical expertise.
  • We believe that improving this situation will result in new Wikibase instances that span a variety of topics and disciplines and add high-quality data to the overall ecosystem.

Ecosystem enablement

Work along this strategic thread will enable Wikibase users to extend the functionality of the software for their specific use cases. Wikibase has been designed as a generic product that can be used across a variety of disciplines rather than a specialized software for any one discipline. Because of this, extensions and customization are often needed to make Wikibase fit specific use cases, particularly among institutions with complex workflows.

Why focus here?

  • Within the Wikibase ecosystem, there are a number of projects that are building (or that intend to build) custom extensions to better suit their needs. For example, the Wikibase Stakeholder Group was recently founded with the specific goal of building new extensions.
  • Developers outside of the Wikimedia Deutschland team need access to clear documentation and guidance so they can build extensions and other custom solutions that will remain stable and reliable in the long term.
  • In addition, there should be a clear pathway for valuable open-source extensions to be made discoverable and usable by Wikibases in the ecosystem.

Connect data across technological & institutional boundaries

Work along this strategic thread will ensure Wikibases can connect with each other and with Wikidata to form a Linked Open Data web. Although each standalone Wikibase instance can bring valuable data out into the open, the true power of the Wikibase Ecosystem will be unlocked when many instances can connect to and communicate with one another.

Why focus here?

  • By connecting with other Wikibase projects, knowledge holders will be able to focus on their own areas of expertise rather than maintaining copies of the same data all over the ecosystem. For example, a specialized Wikibase project could be enriched with the wide array of well-maintained general knowledge from Wikidata.
  • Shared ontologies between Wikibases can empower people to combine data sources from different disciplines.
  • These connections will help people to generate new insights and synthesize linked data into new knowledge.

Target users

Flagship Institutions

Typically very influential within their sectors (and often with access to long-term funding), these Wikibase installations inspire other organizations to explore or adopt Wikibase. Due to the larger scale of their implementations, they have historically been among the first to uncover structural limitations and challenges in working with Wikibase. The German National Library is an example of this group.

Concerns and needs: Flagship institutions tend to have needs focused around installing, configuring, and maintaining multiple Wikibase instances as well as importing and updating large amounts of data. Extending Wikibase to better suit their specific use cases is a key concern. In addition, they may require custom integrations with other software or with specific metadata standards.

Connection to strategy: Flagship institutions will be among the stakeholders we consult with on the topic of “Ecosystem enablement”, as they generally are interested in developing specific extensions and tools for their projects and for the broader Wikibase Ecosystem.

Focused Research Teams

Although these research projects focus on different disciplines, they generally have common concerns: access to limited-term project funding along with a need for long-term data preservation. Wikibase has been used by these teams as a tool to develop custom ontologies for modeling linked data in new ways. Enslaved.org is an example of a project in this group.

Concerns and needs: Focused research teams often need to pilot Wikibase software easily, without requiring assistance (or pre-approval) from their institutional IT decision-makers. One of their key concerns is keeping their Wikibases updated and operational in the long term, after initial funding for their project concludes and their technical support resources are no longer available.

Connection to strategy: These projects are linked to the strategic thread “empower knowledge curators to share their data”. By adopting Wikibase they will contribute new, high-quality data to the broader Wikibase Ecosystem, increasing the breadth of knowledge which can eventually be linked to Wikidata and beyond.

Representatives of diverse knowledge

Representatives of diverse knowledge share a viewpoint that would otherwise be underrepresented within the Wikibase Ecosystem or in other areas of the open internet. This group includes holders of very specialized and/or language-specific data not currently well represented on Wikidata, likely originating from regions outside of Europe and North America.

Concerns and needs: Representatives of diverse knowledge want their otherwise underrepresented knowledge to be reflected in the wider open-knowledge ecosystem and for their voices to be heard in fundamental decisions that shape Wikibase software. They may lack access to the same level of funding or technical resources as large institutional projects; therefore setting up and running a Wikibase instance with more limited technical resources is also a concern.

Connection to strategy: These projects are linked to the strategic thread “empower knowledge curators to share their data”. By adopting Wikibase, they will contribute new, diverse data to the Wikibase Ecosystem, increasing the breadth of knowledge that can eventually be linked to Wikidata and beyond. They support the Ecosystem being more equitable by bringing in otherwise missing points of view.