Wikibase Community User Group/Meetings/2023-02-23
Online meeting of the Wikibase Community User Group.
Schedule
- 16:00 UTC (17:00 Berlin), 1 hour, Thursday 23 February 2023
- Online meeting: https://meet.jit.si/WikibaseLiveSession
- Etherpad: https://etherpad.wikimedia.org/p/WBUG_2023.02.23
Participants (Who is here?)
- Laurence 'GreenReaper' Parry (WBUG/WikiFur.com)
- David Lindemann (UPV/EHU, https://www.wikidata.org/wiki/Q57694630)
- Andra
- Mairelys
- Wai-yin
- Alexander Pacha
- Jeff Goeke-Smith
- Andreas
- Evelien (WMDE)
- Jon Amar (WMDE)
- Giovanni Bergamin
- Peter
- Jose Emilio Labra Gayo
- Sandra Fauconnier
- Eduards Skvireckis
- James Hare
Agenda
- 17:00h - 17:05h Welcome everyone!
- 17:10h - 17:40h Presentations, 'Data Modelling of individual users or collectives using Wikibase'
- 17:40h - 18:00h Exchange and time for questions
Notes
- David Lindemann
- "Data Modelling of individual users or collectives using Wikibase"
- NOVALUE statements are used differently than on Wikidata (see https://www.wikidata.org/wiki/Q19798647: a value "attributed to a claim when we are sure that the property has no value for an element", e.g. as the value of Wikidata P40 when somebody has no children). Here they serve as placeholders for values later obtained through reconciliation with OpenRefine; compare https://lexbib.elex.is/wiki/Item:Q33388 (NOVALUE placeholders for person items) with https://lexbib.elex.is/wiki/Item:Q6273 (reconciled values). The example comes from a workflow in which literal values (person names) are uploaded first and then reconciled against the person items already existing in the wikibase using OpenRefine. This was also part of what we presented at https://lexbib.org/wikidatacon21/
- Existing statements are updated in place from novalue to the reconciled value, rather than deleting the whole statement and creating a new one (see the API sketch after this block). This is useful for dealing with literal values you want to reconcile. It is not the recommended use of novalue statements and would not be suitable on Wikidata, but it fits this use case.
- Don't "author" and "author name" lead to the same result? - Yes, but the literal value may not be the name of the entity; to avoid losing track of the original name form, it is kept as a qualifier on the reconciled author statement.
- Previously presented at WikidataCon (see link above)
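A minimal sketch of the novalue-to-value update described above, using only the standard Wikibase action API (wbgetclaims, then wbsetclaim on the same claim GUID). The endpoint, the "author" property P12 and the reconciled person item Q123 are hypothetical placeholders, and an authenticated session with edit rights is assumed:

```python
# Sketch only: flip a placeholder novalue statement to a reconciled item value.
# Endpoint, property P12 ("author") and target item Q123 are hypothetical.
import json
import requests

API = "https://example.wikibase.cloud/w/api.php"  # placeholder endpoint
session = requests.Session()                      # assumed to be logged in already

# 1. Fetch the item's claims for the property and pick the novalue placeholder.
claims = session.get(API, params={
    "action": "wbgetclaims",
    "entity": "Q33388",            # item with the placeholder statement
    "property": "P12",             # hypothetical "author" property
    "format": "json",
}).json()["claims"]["P12"]
placeholder = next(c for c in claims if c["mainsnak"]["snaktype"] == "novalue")

# 2. Rewrite the main snak in place: novalue -> the item found via OpenRefine.
placeholder["mainsnak"]["snaktype"] = "value"
placeholder["mainsnak"]["datavalue"] = {
    "type": "wikibase-entityid",
    "value": {"entity-type": "item", "id": "Q123"},   # hypothetical person item
}

# 3. Save the same claim (same GUID): the statement is updated, not recreated.
token = session.get(API, params={
    "action": "query", "meta": "tokens", "format": "json",
}).json()["query"]["tokens"]["csrftoken"]
print(session.post(API, data={
    "action": "wbsetclaim",
    "claim": json.dumps(placeholder),
    "token": token,
    "format": "json",
}).json())
```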
- Andra
- CIDOC-CRM "Modelling linked data in Wikibase"
Slides: https://docs.google.com/presentation/d/14jyMTRkuZwQ9RQ5pLCrxqedrZYCBSYZOeApk_wr52-M/edit?usp=sharing
- When I work with Wikibase it is in the context of the linked data cloud, not as a thing on its own, relying on linked data and RDF.
- Start out with an empty Wikibase. There is an option to clone (e.g. Wikidata's properties), but I avoid that as much as possible: both instances have a life of their own, so they diverge and fall out of sync. Cloning is only useful if you want to stick to their properties; otherwise copy them, or just the ones you need, into your own Wikibase. This means your instance won't follow the same property numbers, so you map them, e.g. P3 becomes skos:exactMatch, used to state e.g. that a local Px == P31.
- There are a couple of modelling approaches: either mint a dedicated property like "birth date" (how it is normally done), or use generic properties such as "is a feature" / "value" with "feature kind" = "date of birth". The issue with the first is that you need to know lots of properties; alternatively, use a catalog code.
- We wanted to store CIDOC-CRM data in Wikibase, which means you can rely on definitions provided by an external party, like ontologies in OLS. Can we reuse these in Wikibase and save time? There are constraints in the ontology, and getting them into Wikibase can be challenging, but it is possible to flatten the model - "boxology" (https://en.wikipedia.org/wiki/Boxology), defining the space of the model on a whiteboard first. A T-rex taxonomy was used as an example with an Entity Schema.
- We wrote a bot to import the data (not complex, compared with representing data that came from an Excel sheet and deciding between the two property-modelling approaches). The most important part is probably defining an Entity Schema. Either create the properties manually or, ideally, use a bot to import them from the entity schema (a minimal sketch of that step follows below).
- If we could build "boxology" on top of Wikibase and create a kind of OpenRefine-style minting of models, that would be great. Rushing is not good.
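Below is a rough sketch of the "create properties from the model" step mentioned above (an assumed workflow, not Andra's actual bot): each entry of a flattened model becomes a property created through the standard wbeditentity API call. The endpoint, session and property list are placeholders:

```python
# Sketch only: mint the properties an entity schema expects, via wbeditentity.
# Endpoint, session and the flattened model are placeholders.
import json
import requests

API = "https://example.wikibase.cloud/w/api.php"  # placeholder endpoint
session = requests.Session()                      # assumed authenticated bot session

# Flattened "boxology" model: one entry per property to mint locally.
MODEL = [
    {"label": "exact match",   "datatype": "url"},
    {"label": "instance of",   "datatype": "wikibase-item"},
    {"label": "date of birth", "datatype": "time"},
]

token = session.get(API, params={
    "action": "query", "meta": "tokens", "format": "json",
}).json()["query"]["tokens"]["csrftoken"]

for prop in MODEL:
    data = {
        "labels": {"en": {"language": "en", "value": prop["label"]}},
        "datatype": prop["datatype"],
    }
    resp = session.post(API, data={
        "action": "wbeditentity",
        "new": "property",        # local numbering, so it won't match Wikidata's
        "data": json.dumps(data),
        "token": token,
        "format": "json",
    }).json()
    print(prop["label"], "->", resp.get("entity", {}).get("id"))
```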
- Jeff: enslaved.org did much the same thing; it took a year of data modelling and then we tried to get it into Wikibase, so a model importer would have been convenient. "We solved that problem using grad students, I would not recommend that solution." The model: https://docs.enslaved.org/ontology/. A paper published about the process: https://daselab.cs.ksu.edu/sites/default/files/2020-Enslaved-dataorg.pdf. And a paper about how we connected our ontology to our Wikibase model: https://daselab.cs.ksu.edu/sites/default/files/2020-CIKM-enslaved-alignment.pdf
- Once people start reusing your data, and effectively your model, deleting properties can be a problem, because you have cemented yourself into the linked data: you can do it, but it has an impact. Editorial control is a feature/bug of having your own Wikibase.
- David: LexMeta OWL equivalent for external ontology: https://lexbib.elex.is/wiki/Property:P154#P42
- One of the key selling points of Wikibase is triple-level editing, but GraphDB or Jena might be a better triple store for purely local use; Wikibase is useful for reaching out to the community.
- Wikibase as a key-value store is enough for some uses, but if you need SPARQL then WDQS or another triple store is essentially required, rather than making API queries. See http://www.semantic-web-journal.net/system/files/swj3372.pdf. You could also use the JSON dumps to mint RDF yourself rather than going through WDQS, circumventing Blazegraph's constraints (a sketch follows after this discussion). Regardless, being able to model using boxology is great: you could draw models to create an entity schema, so you don't have to learn how to code or how to write entity schemas. The user interface both for that and for creating properties needs improvement; the Wikidata team is working on Entity Schemas v2.
- Anything to reduce the barrier to entry and the learning curve would be helpful; documentation helps, but visual tooling might be better.
- Evelien: Great presentations and extremely helpful, everyone; thank you, you've been heard.
- Entity schemas are really hard: there is no shared lexicon for visual layout (what is the symbol for X or Y?), and whenever you are halfway through, people say "oh, it's easier to just write an entity schema" (in the RDF area) - but only if you know how to write them. Visualization is not the end point; it just helps you write the schema.
- YASHE (ShEx editor) lets you search for properties by name: https://www.weso.es/YASHE/
- Jeff: Structured data is hard
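As an illustration of the "mint RDF from the JSON dumps instead of WDQS" idea raised above, here is a rough, assumption-laden sketch: it reads a Wikibase JSON dump and writes one N-Triples line per item-valued main snak, ignoring ranks, qualifiers and non-item values. The dump path and concept-URI bases are placeholders:

```python
# Sketch only: mint simple RDF (N-Triples) from a Wikibase JSON dump instead of
# relying on WDQS.  Dump path and concept-URI bases are placeholders; ranks,
# qualifiers and non-item values are ignored for brevity.
import gzip
import json

ENTITY = "https://example.wikibase.cloud/entity/"      # placeholder concept URIs
DIRECT = "https://example.wikibase.cloud/prop/direct/"

with gzip.open("wikibase-dump.json.gz", "rt", encoding="utf-8") as dump, \
     open("out.nt", "w", encoding="utf-8") as out:
    for line in dump:                       # the dump is one large JSON array,
        line = line.strip().rstrip(",")     # one entity per line
        if not line or line in ("[", "]"):
            continue
        entity = json.loads(line)
        subject = ENTITY + entity["id"]
        for prop_id, statements in entity.get("claims", {}).items():
            for st in statements:
                snak = st["mainsnak"]
                if snak["snaktype"] != "value":
                    continue
                if snak["datavalue"]["type"] != "wikibase-entityid":
                    continue                # keep the sketch to item-to-item links
                obj = ENTITY + snak["datavalue"]["value"]["id"]
                out.write(f"<{subject}> <{DIRECT}{prop_id}> <{obj}> .\n")
```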
- GreenReaper: Modelling (furry) conventions, including cancellations, using deprecated statements and EDTF
- https://furry.wikibase.cloud/wiki/Item:Q4 (use of EDTF period, with specific start/end as qualifiers since limited to day resolution)
- https://furry.wikibase.cloud/wiki/Item:Q3 (deprecated statements for postponements; end time used as a qualifier on the start time to permit linkage of start/end pairs - see the sketch after this list)
- https://en.wikifur.com/FurryConventionMap.html (eventual desired target output)
- https://pool.wikifur.com/w/index.php?title=Convention_map_script&action=edit#:~:text=addLocation (current hack to replace)
- https://furry.wikibase.cloud/wiki/EntitySchema:E1 (provisional schema, E2 is Cradle-compatible using wdt:)
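A small sketch of how a map script might consume this model (not the actual WikiFur script): fetch the item JSON from Special:EntityData, skip deprecated-rank statements (postponed dates) and pair each start time with its end-time qualifier. P10/P11 are hypothetical stand-ins for the instance's start-time and end-time properties:

```python
# Sketch only: read convention dates from the model above, skipping deprecated
# (postponed) statements and pairing start times with their end-time qualifiers.
# P10/P11 are hypothetical stand-ins for the start-time / end-time properties.
import requests

ITEM = "Q3"
URL = f"https://furry.wikibase.cloud/wiki/Special:EntityData/{ITEM}.json"
P_START, P_END = "P10", "P11"

entity = requests.get(URL).json()["entities"][ITEM]
for st in entity.get("claims", {}).get(P_START, []):
    if st.get("rank") == "deprecated":
        continue                            # postponed/cancelled date pair
    snak = st["mainsnak"]
    if snak["snaktype"] != "value":
        continue
    start = snak["datavalue"]["value"]["time"]
    end = None
    for q in st.get("qualifiers", {}).get(P_END, []):
        if q["snaktype"] == "value":
            end = q["datavalue"]["value"]["time"]
    print(start, "->", end)
```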
- Jeff: I follow what you have done and your reasoning
- z. blace: Informative as a newbie, but why is the session not recorded? - GreenReaper: Recording wasn't arranged with participants beforehand, and it raises technical and GDPR concerns; it has been done for WBSG meetings with Zoom's recording feature, and a lot of this session was typed up in the Etherpad anyway (which will ultimately be archived on Meta).
- Maybe of interest, a presentation on how we modelled CIDOC-CRM in Wikibase (Maarten Zeinstra / IP Squared)
- Also maybe of interest: a presentation on how we model metadata of language resources such as dictionaries in Wikibase and in OWL in parallel; see https://lexbib.elex.is/wiki/LexMeta and https://lexbib.elex.is/wiki/LexMeta_OWL. About the model, there is a set of slides at https://www.researchgate.net/publication/364780055_LexMeta_A_Metadata_Model_for_Lexical_Resources and the full paper at https://euralex.org/publications/introducing-lexmeta-a-metadata-model-for-lexical-resources/