Knowledge Integrity
Knowledge Integrity was a cross-departmental program at the Wikimedia Foundation. Approved as part of the 2018/2019 annual plan, it involved Wikimedia Foundation Programs and Research teams as well as Wikimedia Deutschland's Wikidata team.
Why a “knowledge integrity” program?
Increased global attention is directed at the problem of misinformation and how media consumers are struggling to distinguish fact from fiction. Meanwhile, thanks to the sources they cite, Wikimedia projects are uniquely positioned as a reliable gateway to accessing quality information in the broader knowledge ecosystem. How can we mobilize these citations as a resource and turn them into a broader, linked infrastructure of trust to serve the entire internet? Free knowledge grounds itself in verifiability and transparent attribution policies. Let’s look at 4 data points as motivating stories:
- Wikipedia sends tens of millions of people to external sources each year. We want to conduct research to understand why and how readers leave our site.
- The Internet Archive has fixed over 4 million dead links on Wikipedia via InternetArchiveBot. We want to enable instantaneous archiving of every link on all Wikimedia projects to ensure the long-term preservation of the sources Wikimedians cite.
- #1Lib1Ref reaches 6 million people on social media. We want to bring #1Lib1Ref to Wikidata and more languages, spreading the message that references improve quality.
- 33% of Wikidata items represent sources (journals, books, works). We want to strengthen community efforts to build a high-quality, collaborative database of all cited and citable sources.
Our 5-year vision for the Knowledge Integrity program is to establish Wikimedia as the hub of a federated, trusted knowledge ecosystem. We plan to get there by creating:
- A roadmap to a mature, technically and socially scalable, central repository of sources.
- A developed network of partners and technical collaborators who contribute to and reuse data about citations.
- Increased public awareness of Wikimedia’s vital role in information literacy and fact-checking.
Plans for 2018-2019
We have identified 5 levers of Knowledge Integrity: research, infrastructure and tooling, access and preservation, outreach, and awareness. Here’s what we will be doing with each:
- Research: Continue to conduct research to understand how readers access sources and how to help contributors improve citation quality.
- Tools: Improve tools for linking information to external sources, catalogs, and repositories.
- Accessibility: Ensure resources cited across Wikimedia projects are accessible in perpetuity.
- Outreach: Grow our outreach and partnerships to scale community and technical efforts to improve the structure and quality of citations.
- Awareness: Increase public awareness of the processes Wikimedians follow to verify information and articulate a collective vision for a trustable web.
Category | Output | Phabricator task |
---|---|---|
Editor research and workflows | Research study to map the “state of verifiability”: what content in Wikipedia and Wikidata is unsourced, and which existing cited sources are accessible to the public. | task T199187 |
Editor research and workflows | Research study to understand how readers use citations, with quantitative and qualitative analysis to identify information quality and sourcing gaps: to what degree are readers’ learning goals met by consuming Wikimedia content alone rather than by using references and external links? | task T199188 |
Infrastructure, tech, and tools | Real-time event stream tracking the creation and modification of external links and references across Wikimedia projects, to track and contribute to the citation work of editors. | task T199189 |
Infrastructure, tech, and tools | Improve tools to identify and fill unsourced statements with algorithmically generated recommendations, such as with Citation Hunt for #1Lib1Ref. | task T199190 |
Infrastructure, tech, and tools | Study how to integrate Citoid in Wikidata. | task T199191 |
Infrastructure, tech, and tools | Integrate Citoid in Wikidata. | task T199197 |
Access and preservation | Internet Archive will immediately cache all resources linked from Wikimedia projects and will also prioritize efforts to digitize sources cited in Wikimedia. | task T199193 |
Outreach and partners | Fundraise for sustainability of WikiCite and satellite events. | task T199192 |
Outreach and partners | Host WikiCite and extend promotion to include a set of satellite events for broader global reach of the event. | task T199194 |
Outreach and partners | Run #1Lib1Ref in January and May with expansion to include references for statements on Wikidata. Run #OAbot campaign during Open Access Week. | task T199195 |
Awareness and literacy | Create an audience map of our ecosystem and develop communications strategies for each audience including blogs, social media, and events. | task T199196 |
Who is involved?
Along with volunteers, this program involves the following teams:
- Wikimedia Foundation Technology’s Research Team.
- Wikimedia Foundation Community Engagement’s Programs team (primarily The Wikipedia Library).
- Wikimedia Deutschland Engineering’s Wikidata team.
- The C-level sponsor for Knowledge Integrity is Victoria Coleman, CTO of the Wikimedia Foundation.

Name | Team | Role | FTE | Contact |
---|---|---|---|---|
Melody Kramer | Communications | Messaging | 0 | mkramer wikimedia.org |
Ed Erhart | Communications | Blog | 0 | eerhart wikimedia.org |
Marielle Volz | Contributors | Citoid | 0.5 | mvolz wikimedia.org |
Marcella Florence | Contributors | Citoid manager | 0 | mflorence wikimedia.org |
James Forrester | Audiences | Citoid consultant | 0 | jforrester wikimedia.org |
Jake Orlowitz | Programs | Program Manager | 0.5 | jorlowitz wikimedia.org |
Sam Walton | Programs | Task manager | 0.25 | swalton wikimedia.org |
Ben Vershbow | Programs | Director | 0.1 | bvershbow wikimedia.org |
Aaron Vasanth | Programs | Coordinator | 0.1 | avasanth wikimedia.org |
TBD | Programs | 1lib1ref coordinator | 0.1 | TBD |
Dario Taraborelli | Research | Director | 0.5 | dtaraborelli wikimedia.org |
Miriam Redi | Research | Researcher | 1 | mredi wikimedia.org |
Bahodir Mansurov | Research | Software engineer | 0.5 | bmansurov wikimedia.org |
Sarah Rodlund | Technical Engagement | Wikicite | 0.25 | srodlund wikimedia.org |
Jonathan Morgan | Research | Design researcher | 0.25 | jmorgan wikimedia.org |
Leila Zia | Research | Team lead | 0.25 | leila wikimedia.org |
Rachel Farrand | Community Relations | Wikicite | 0.1 | rfarrand wikimedia.org |
Lydia Pintscher | WMDE | Wikidata Lead | 0.1 | lydia.pintscher wikimedia.de |
Léa Lacroix | WMDE | Wikicite | 0 | lea.lacroix wikimedia.de |
TBD | WMDE | Wikidata dev | 0.25 | TBD |
TBD | WMDE | Wikidata dev | 0.25 | TBD |
The initiative also spans an ecosystem of possible partners including the Internet Archive, ContentMine, Crossref, OCLC, OpenCitations, and Zotero. It is further made possible by funders including the Sloan, Gordon and Betty Moore, and Simons Foundations, which have been supporting the WikiCite initiative to date.
How you can participate
- You can read the fine details of our year-1 plan in the Annual Plan on mediawiki.org:
- Follow our progress via the Phabricator workboard.
- We’ve also created a brief introductory slide deck about our motivation and goals:
- WikiCite has laid the groundwork for many of these efforts. Read last year’s report:
- Recent initiatives like the just-released citation dataset foreshadow the work we want to do:
- In April we’re celebrating Open Citations Month; it’s right in the spirit of Knowledge Integrity: