WikiCite 2016/Proposals/Generation of referenced Wikidata statements with StrepHit
Data quality in Wikidata is crucial, and references to trustworthy third-party sources are a way to ensure it. Many Wikidata statements are either unsourced or sourced only to Wikimedia sister projects (typically Wikipedia, via bots). Adding references to such small units of information can be a cumbersome task for human editors.
StrepHit aims to ease this effort: it is a Natural Language Processing system that reads documents from reliable Web sources and produces referenced Wikidata statements.
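To make "referenced statement" concrete, here is a minimal sketch of a statement that carries its own source. It is not StrepHit's actual output format: the class, its field names, and the example.org URL are hypothetical. It assumes the QuickStatements-style convention in which a source property is written with an S prefix (S854 for P854, reference URL).

```python
from dataclasses import dataclass

# Assumption: QuickStatements-style source form of P854 (reference URL).
REFERENCE_URL = "S854"


@dataclass(frozen=True)
class ReferencedStatement:
    """Hypothetical container for one Wikidata statement plus its source."""
    subject: str     # item ID, e.g. "Q42" (Douglas Adams)
    prop: str        # property ID, e.g. "P569" (date of birth)
    value: str       # value in QuickStatements notation
    source_url: str  # third-party page the statement was extracted from

    def to_quickstatements(self) -> str:
        # One tab-separated line: subject, property, value, then the source.
        return "\t".join(
            [self.subject, self.prop, self.value,
             REFERENCE_URL, f'"{self.source_url}"']
        )


# Illustrative data only; the URL is a placeholder, not a real source.
example = ReferencedStatement(
    subject="Q42",
    prop="P569",
    value="+1952-03-11T00:00:00Z/11",
    source_url="https://example.org/biography/douglas-adams",
)

if __name__ == "__main__":
    print(example.to_quickstatements())
```

The point of the sketch is that the reference travels with the claim, so a reviewer can check the statement against the cited page before approving it.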
- Play with the current StrepHit dataset: biographies in English;
- create and fill a Request for Comments;
- encourage referenced data donations through the primary sources tool:
  - @Daniel Mietchen, Aubrey, and Thomas: follow up past discussions with ContentMine and Hypothes.is people.
Install the primary sources tool gadget to check out the StrepHit dataset: instructions at wikidata:Wikidata:Primary_sources_tool#How_to_use
- Basic understanding of how Wikidata works;
- communication strategies for community engagement, in order to:
  - raise awareness of StrepHit's potential impact;
  - attract new primary sources tool users.