Federated knowledge graphs

There are many collections of structured knowledge; linking and federating them in a discoverable way is an open challenge. Approaches include Linked Open Data (and extensions of this idea to specific libraries and authorities, such as LD4L), aggregated authorities like VIAF, aggregated datasets like Wikidata, and aggregated knowledge graphs like Satori, Wolfram Alpha, and GKG.

Naming and separating different layers of knowledge can help to sharpen areas of certainty and confusion, agreement and argument.

Wikibase provides a mechanism for naming and sharing properties, and implicit shapes/schemas.

Related projects

Decentralized networks

Designing for latency and federation of naming, resolving, storing, and computing. IPFS, et al.
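
One way to let naming and storing federate is content addressing: a name is derived from the bytes it names, so a reader can verify a copy no matter which node served it. The sketch below is a minimal Python illustration under stated assumptions: the hex SHA-256 naming scheme, the in-memory `ContentStore`, and the `fetch` helper are all hypothetical, and this is not IPFS's actual identifier format or API.

```python
import hashlib


def content_name(data: bytes) -> str:
    """Derive a name from the content itself (assumed scheme: hex SHA-256)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Hypothetical in-memory node that stores blobs keyed by content name."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        name = content_name(data)
        self._blobs[name] = data
        return name

    def get(self, name: str):
        return self._blobs.get(name)


def fetch(name: str, nodes):
    """Ask each node in turn; accept the first blob whose hash matches the name."""
    for node in nodes:
        data = node.get(name)
        if data is not None and content_name(data) == name:
            return data
    return None
```

Because the name commits to the content, `fetch` does not need to trust any particular node: a copy that fails the hash check is skipped and the next node is tried.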

Knowledge spaces

Related concepts: dimensions, layers, covers and meshes of knowledge; clusters of topics and schemas with similar internal mappings; category theory.

A cover + mesh might capture a space of concepts being explored, a parameter space being sampled, or a known whole such as the earth linked to geocoordinates.

Layers of interrelated concepts look different based on perspective: whether building out that type of data, using it to synthesize more abstract or aggregate knowledge, or tracing its elements to subsidiary components or sources. For instance:

an interlay: a federated data network, designed for linking, tracking and aggregating context: provenance, schemas, indicators and classifiers.
an overlay: a compilation of knowledge (catalog, document, visual, statement) used for some purpose.
an underlay: the granular data and details that make up an overlay.
a knowledge base or graph: a collection of connected data, often queried by traversing its links or schemas.
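
The relationship between these layers can be sketched as a small data model: an overlay is compiled from granular underlay statements, each carrying its own provenance, so any claim can be traced back to a source. The Python below is illustrative only; `Statement`, `Overlay`, `trace`, `missing_provenance`, and every `ex:` identifier and value are placeholders invented for this sketch, not taken from any project named on this page.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    """One granular data point in an underlay, with its provenance."""
    subject: str
    prop: str
    value: str
    source: str  # provenance: where this data point came from ("" if unknown)


@dataclass
class Overlay:
    """A compilation built for some purpose from underlay statements."""
    title: str
    statements: list = field(default_factory=list)

    def trace(self, subject: str):
        """Trace a subject back to its underlay statements and their sources."""
        return [(s.prop, s.value, s.source)
                for s in self.statements if s.subject == subject]

    def missing_provenance(self):
        """Statements with no recorded source: gaps that could be filled."""
        return [s for s in self.statements if not s.source]


# Placeholder data, for illustration only.
underlay = [
    Statement("ex:item-1", "ex:height", "12", "ex:survey-A"),
    Statement("ex:item-1", "ex:type", "ex:tower", ""),
]
catalog = Overlay("Example catalog", underlay)
```

Here `catalog.trace("ex:item-1")` lists each property with its source, and `catalog.missing_provenance()` surfaces the one statement whose origin is unrecorded.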

Constructing public layers as global information goods: An effort to develop shared public underlays used by prominent overlays, and to improve the detail and consistency of provenance for each data point. This can add clarity to those overlays, and identify gaps in knowledge (in coverage, precision, or approach) that can be filled.

Interlace

The Interlace ecology
The connective matrix of ideas, discoveries, and knowledge projects, how they relate and communicate, and their principal aspects and modes. The community of participating people, entities, and projects scaffolding and reconstituting this matrix over time.
Wikilinks / go-links everywhere
Federated + cascading authority files for link resolution. From DNS to editable wiki-resolvers.
Underlays & a global underlay
Grounding layers of observations and primary source data, context, and vocabularies, to use as building blocks or referents for other layers. Immutable namespaces for distributed ostension across language, time, and space.
Overlays & a global overlay catalog
Experiential lenses for understanding a subset of available information as contextual knowledge. In practice, the primary sources, process provenance, and curatorial stages that an overlay relies upon are often not visible; a transparent overlay offers visibility into these things, to arbitrary depth and resolution.
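
The federated, cascading authority files described above for link resolution can be sketched as an ordered chain of lookup tables, where a more local authority overrides a more general one. This is a minimal Python sketch under stated assumptions: the `CascadingResolver` class, the link names, and the `example.org` targets are all hypothetical.

```python
class CascadingResolver:
    """Resolve short link names through an ordered list of authority files.

    Each authority is a (label, mapping) pair; earlier entries take precedence,
    so a local or community file can override a global one.
    """

    def __init__(self, authorities):
        self.authorities = list(authorities)

    def resolve(self, name):
        """Return (target, authority_label) for the first match, else None."""
        for label, mapping in self.authorities:
            if name in mapping:
                return mapping[name], label
        return None


# Hypothetical authority files, most specific first.
team = {"kg": "https://example.org/team/knowledge-graphs"}
public = {
    "kg": "https://example.org/wiki/Knowledge_graph",
    "lod": "https://example.org/wiki/Linked_Open_Data",
}
resolver = CascadingResolver([("team", team), ("public", public)])
```

Returning the authority label alongside the target keeps the resolution itself traceable: a reader can see which file answered and edit that file if the link is wrong.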

Shared and semi-public knowledge bases

Wikipedia
Particularly from the point at which it added transclusion, bidirectional link tracking, templates, and universal language/category metadata.
Wikidata itself
A collection of layered data, drawing from Wikipedia, citation collections, and individual editors + bots. Somewhat rate-limited by the demands of curation and queries, and by the desire for a balanced overall corpus (the current WikiCite effort was producing more material than the community would comfortably accept, so its ingestion was slowed down over the summer).
Freebase
A great, early public knowledge graph, drawing from Wikipedia and many other sources. Currently browsable online (but for how long?), and 10% imported into Wikidata.
scholarship
structured documents & citation graph (e.g. the Annotated Hub of Science)
patents
structured prior art, patents, & citation graph
law
structured laws, regulations, & citation graph
science & mathematics
structured proofs, discoveries, & equation graph

Most of the above are centralized subsets of a larger implicit knowledge base: the set of all literature of the relevant type. While we can iterate over {published works} to build a complete reference, as search engines do when trying to traverse all documents online, the most common references to date are incomplete popular subsets, with long-tail research handled by search engines + the library network.

Specialized knowledge graphs

GKG, Satori
Knowledge graphs tuned to enhance search-engine queries, with moderate provenance. (Satori slides)
WikiCite, MS Academic Graph
Citation graphs of academic papers, with simple provenance.
OpenCyc, Concept Graph
Common-sense statements with simple provenance.


Visualization + Queries

Ordia -- for words, lexemes, text.

Scholia -- for entities: nouns, topics.

Wikibridge -- List of next steps to visualize/understand. (list of links to the above w/ prefilled queries?)

Other

Tools for capturing knowledge

Roam
Notetaking as collaborative knowledge-graph building on local machines. Identifying granular components, crosslinking them, offering transclusion and templating (Conor W-S)

Library alignments, term & concept resolvers

  • Linked Data for Libraries / Production / Implementation
  • Authority file alignment project (Stanford+, 2016-17)
  • Mix-n-match

Unstructured knowledge-sets + data sources

Regularly used as test-beds for tools extracting data + layers

  • The Wayback Machine & Common Crawl
  • Research dumps: ImageNet, YouTube videos
  • Sources: via Datavis.tools / Quartz

Other projects

  • ... add yours!

Applications

Reasoning + Argument + Methods of proof

Approaches to this:

  • Defeasible reasoning: non-monotonic logic and formal argumentation -- beyond simple logical schemes. (Uni. Bochum, Germany)
  • Canonical Debate (the case for it, and a whitepaper, by a group of democracy activists)
  • Request for proposals: improving debate for the future of an informed society (Knight, 11.2018)
  • What happens when methods for distributing reasoning break down: Syllabus

World Genealogy

Formal proof of the Kepler conjecture

  • 2003: Flyspeck launched; 2014: completed
  • Naming intermediate steps allowed many parties to claim proofs of them; and for these to be verified by others
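
The role of named intermediate steps can be illustrated with a small dependency graph: once each step has a stable name, different parties can claim it, and a step counts as established only when it and everything it depends on have been verified. This is a Python sketch only; the step names, the parties, and the `established` helper are hypothetical and do not reflect Flyspeck's actual structure or tooling.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    depends_on: tuple = ()
    claimed_by: str = ""   # party claiming a proof of this step
    verified_by: str = ""  # independent party that checked it ("" if none yet)


def established(name, steps, _path=frozenset()):
    """A step is established if it is verified and all its dependencies are."""
    if name in _path:  # a circular dependency can never be established
        return False
    step = steps[name]
    return bool(step.verified_by) and all(
        established(dep, steps, _path | {name}) for dep in step.depends_on
    )


# Placeholder steps: "main" relies on two named lemmas.
steps = {
    "lemma-a": Step("lemma-a", (), "party-1", "party-2"),
    "lemma-b": Step("lemma-b", ("lemma-a",), "party-3", ""),
    "main": Step("main", ("lemma-a", "lemma-b"), "party-1", "party-2"),
}
```

In this example `main` is not yet established, because `lemma-b` has been claimed but not verified; recording a verifier for `lemma-b` is enough to change that.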

Medical models

These all need to be privacy-conscious, while allowing models to be trained on broad populations of people + conditions. (The main alternative is less privacy-conscious approaches within large orgs, which build proprietary models + tools.)

  • Imaging: models based on anatomical imagery, static and dynamic.
  • Responses to treatment: models based on aggregate data of treatment + response.

Meta analysis

Quality assessment of knowledge bases