University of Virginia/Wikicite for education


This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.

Wikicite for education is a project to better curate the metadata for academic publications in the field of education. We estimate that academic journals and other scholarly repositories together hold 1-2 million research publications on education. Publications in academic journals are the easiest to analyze because they are the most uniform in format, but this project may also consider other documents, including white papers, impact research, practice recommendations, preprints, and research notes.

The primary objective of the project is to recommend appropriate academic papers when a user describes what kind of research they want. The project will do so by tagging papers with topics that identify their subjects. Secondary outcomes include curating the corpus of papers to analyze, describing the social and ethical challenges of creating a cataloging system, and documenting the general process well enough to serve as a model for curating scholarly literature in any field, not only education.


Curation at scale is an important research direction for accelerating the availability of professional and academic research in all fields, including education. While better-funded fields such as medicine have been able to pay for high-quality manual curation of their research for decades, other fields, including education, have research collections with less annotation. Funding manual curation in education as a standalone solution would be prohibitively expensive. However, technology has advanced to the point that a sample of well-curated education research could seed a machine learning process that automatically tags the rest of the papers at a significantly lower resource cost than was previously possible.

By better cataloging papers, we make them more accessible to the benefit of any researcher who wants to quickly determine whether and where they can find research on any given topic.


  1. Consider all collections of academic papers in education
  2. Gather or create sufficient library cataloging data to describe the subjects for a subset of these papers
  3. Use technology to produce topic tags to further describe the rest of the papers
  4. Publish these terms to Wikidata through the Wikicite project
  5. Develop functionality which supports exploration of this content within Wikicite tools
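
Steps 2 and 3 above can be illustrated with a very small sketch. It is not the project's actual method: it assumes a hand-made curated sample (`CURATED`), a toy stopword list, and a simple cosine-similarity comparison of word counts, all of which are hypothetical stand-ins for whatever cataloging data and machine learning technique the project eventually adopts.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical curated sample: (paper title, topic tags assigned by a cataloger).
CURATED = [
    ("Phonics instruction and early reading outcomes", ["literacy"]),
    ("Vocabulary growth in early reading programs", ["literacy"]),
    ("Fraction concepts in middle school mathematics", ["mathematics education"]),
    ("Algebra readiness and mathematics achievement", ["mathematics education"]),
]

# Toy stopword list, chosen only for this illustration.
STOPWORDS = {"and", "in", "of", "the", "a", "for", "on"}


def tokenize(text):
    """Lowercase a string and split it into word tokens, dropping stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def build_profiles(curated):
    """Sum the token counts of every curated title that carries each tag."""
    profiles = defaultdict(Counter)
    for title, tags in curated:
        for tag in tags:
            profiles[tag].update(tokenize(title))
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_tags(title, profiles, threshold=0.2):
    """Return the tags whose profile is similar enough to the title, best first."""
    vector = Counter(tokenize(title))
    scored = [(cosine(vector, profile), tag) for tag, profile in profiles.items()]
    return [tag for score, tag in sorted(scored, reverse=True) if score >= threshold]


profiles = build_profiles(CURATED)
print(suggest_tags("Early reading interventions", profiles))
```

A title that shares no vocabulary with the curated sample receives no tags, which is one reason the size and coverage of the curated subset in step 2 matter. The `threshold` value is arbitrary and would need tuning against real cataloging data.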



  1. A novel collection of education topic labels applied to academic papers in that field
  2. Publication of that collection in Wikidata
  3. Development of the Wikidata environment to increase access to and use of this data
    1. Queries for browsing academic literature in Wikidata
    2. Integration with other Wikidata scholarly cataloging efforts, including the Wikicite project, author disambiguation, and association of papers and research with the authors' institutions
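
As an illustration of what a browsing query might look like, the sketch below builds a SPARQL query string for the Wikidata Query Service. It relies on the existing Wikidata properties P31 (instance of) and P921 (main subject) and the item Q13442814 (scholarly article). The helper name `build_topic_query` is hypothetical, and the identifier `Q8434` used in the example is assumed to be the Wikidata item for education and should be verified before use.

```python
import re


def build_topic_query(topic_qid, limit=50):
    """Build a SPARQL query listing scholarly articles tagged with one topic.

    topic_qid: Wikidata item identifier of the topic, such as "Q8434".
    limit: maximum number of results, kept small to avoid query timeouts.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", topic_qid):
        raise ValueError(f"not a Wikidata item identifier: {topic_qid!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"""SELECT ?paper ?paperLabel WHERE {{
  ?paper wdt:P31 wd:Q13442814 ;    # instance of: scholarly article
         wdt:P921 wd:{topic_qid} .  # main subject: the chosen topic
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}"""


# Q8434 is assumed here to be the item for "education"; verify before use.
print(build_topic_query("Q8434", limit=10))
```

The resulting text can be pasted into the Wikidata Query Service. Validating the identifier before interpolating it keeps malformed input out of the query string.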

Research Team