Wikidata and research/Introduction


The conference Wikidata and Research aims to explore and foster synergies between academia and the Wikimedia projects, with a specific focus on open data, collaborative open research infrastructures and research assessment.

Wikidata and research, 5-6 June 2025

Wikidata and Research

Wikidata <https://www.wikidata.org> is a free, open, multidisciplinary and multilingual knowledge base that can be read and edited by both humans and machines. It is the central store of structured data for the Wikimedia projects (including Wikipedia, Wikimedia Commons and Wikisource). It also serves as a research infrastructure that provides and hosts open data and supports new forms of analysis and visualisation.

With more than 113 million items, Wikidata has become a hub of the Linked Open Data (LOD) Cloud, connecting open datasets shared by volunteers, researchers, and cultural and public institutions from all over the world. It also provides support to many other sites and services beyond the Wikimedia projects. The content of Wikidata is released under the Creative Commons Zero (CC0) public domain dedication, can be exported in standard formats and can be interlinked with other open datasets on the linked data web.

All these features make Wikidata a key infrastructure for scientific research and for the implementation of Open Science. Scientific research can contribute to the publication of open data, and it can also benefit from the open data published in Wikidata and from collaboration with its communities. Comparing datasets produced by research projects, archives, libraries and museums with the LOD Cloud can bring out recurring patterns and new relationships, and can make it possible to infer new knowledge (data mining and an evidence-based approach). It is also a way to improve data quality and to enrich datasets with links to other databases, from an interdisciplinary perspective. Scholars can collaborate with volunteers, join the Wikidata communities and receive support from other active contributors. This opens research to new collaborations and exchanges of views, to the development of new strategies and tools, and to the sharing of methodologies at an interdisciplinary and international level.

Wikibase is the open-source software used by Wikidata. Anyone can create their own Wikibase instance, either by downloading Wikibase directly or, since 2022, by using the Wikibase Cloud platform <https://www.wikibase.cloud/>, which allows users to create up to six Wikibase instances per account; these instances are hosted on the servers of Wikimedia Deutschland.

Key topics

The conference’s themes include but are not limited to:

  • Methods and data management. Research strategies and Data Management Plans that use Wikidata and/or other Wikibase instances as key platforms for research data.
  • Sharing datasets. Case studies on importing datasets (including datasets related to GLAMs and education) into Wikidata and/or other Wikibase instances, covering the methods and tools used, the challenges of the process and the potential for data analysis and reuse.
  • Reuse of data. Case studies in any discipline (environmental data, medical data, heritage, archival and bibliographic data, biographical datasets...), completed or still ongoing, involving the use of Wikidata and/or other Wikibase instances to collect, extract, relate, and discover data.
  • Data visualisations and tools. Visualisations and tools based on Wikidata and/or other Wikibase instances, created or used by institutions, researchers and research projects.
  • Research assessment. Altmetrics based on Wikidata to evaluate scientific research and research impact.
  • Projects and proposals. Projects and proposals focusing on research and Wikidata and/or other Wikibase instances.
  • Overviews. Analyses of the use of Wikidata in research within specific disciplinary areas or on specific topics.

Selected bibliography

  • Mora-Cantallops, Marçal, Salvador Sánchez-Alonso, and Elena García-Barriocanal. 2019. «A Systematic Literature Review on Wikidata». Data Technologies and Applications 53 (3): 250–68. https://doi.org/10.1108/DTA-12-2018-0110.
  • Shenoy, Kartik, Filip Ilievski, Daniel Garijo, Daniel Schwabe, and Pedro Szekely. 2022. «A Study of the Quality of Wikidata». Journal of Web Semantics 72:100679. https://doi.org/10.1016/j.websem.2021.100679.
  • Turki, Houcemeddine, Mohamed Ali Hadj Taieb, Mohamed Ben Aouicha, Lane Rasberry, and Daniel Mietchen. 2024. «Ten Years of Wikidata: A Bibliometric Study». In Proceedings of the Wikidata Workshop 2023 co-located with 22nd International Semantic Web Conference (ISWC 2023). Athens: CEUR. https://ceur-ws.org/Vol-3640/paper13.pdf.
  • Vrandečić, Denny, Lydia Pintscher, and Markus Krötzsch. 2023. «Wikidata: The Making Of». In Companion Proceedings of the ACM Web Conference 2023, 615–24. Austin TX USA: ACM. https://doi.org/10.1145/3543873.3585579.
  • Waagmeester, Andra, Gregory Stupp, Sebastian Burgstaller-Muehlbacher, Benjamin M. Good, Malachi Griffith, Obi L. Griffith, Kristina Hanspers, et al. 2020. «Wikidata as a Knowledge Graph for the Life Sciences». eLife 9 (March). https://doi.org/10.7554/elife.52614.
  • Zhao, Fudie. 2023. «A Systematic Review of Wikidata in Digital Humanities Projects». Digital Scholarship in the Humanities 38 (2): 852–74. https://doi.org/10.1093/llc/fqac083.