Wikibase Community User Group/Meetings/2022-03-31

Online meeting of the Wikibase Community User Group.

Schedule

Agenda

  • What are you working on around Wikibase? You're welcome to come and share your project with the community.
  • LexBib (http://lexbib.elex.is, on wikibase.cloud, see also https://lexbib.org/wikidatacon21/) is a digital bibliography and knowledge graph for the domain of Lexicography and Dictionary Research. Problems we are currently trying to solve are:
    1. ontology-based indexation of research paper full texts with terms (from a SKOS vocabulary in the wikibase)
    2. exposing Wikibase content via OAI-PMH.
  • PhiloBiblon is "a free internet-based bio-bibliographical database of texts written in the various Romance vernaculars of the Iberian Peninsula during the Middle Ages and the early Renaissance." We are moving to wikibase and we have some interesting requirements from our legacy data model.

Participants

  1. Mohammed (WMDE)
  2. Alan Ang (WMDE)
  3. Donald "Max" Ziff - U of California Berkeley, PhiloBiblon
  4. Will Hanley, Florida State University. ottgaz.org (Ottoman gazetteer)
  5. Nikolay Komarov, Wikipedia enthusiast from Munich
  6. Tiffany L (affiliated with University of California, Berkeley. I work on the Ancient World Citation Analysis Project with Dr. Adam Anderson and I was invited to the session.)
  7. David Fichtmueller (I work in the Botanic Garden and Botanical Museum in Berlin, Germany. I use Wikibase to model data standards.)
  8. Myst (I'm working on a python library named WikibaseIntegrator)
  9. Zoe Dobbs (zidmgmt) (special collections metadata librarian who's worked on some Wikidata projects and interested in Wikibase especially with cloud and federation possibilities)
  10. Laurence Parry aka GreenReaper (active in the user group and I run WikiFur.com, a furry fan encyclopedia)

Notes

  • Presentation by David: Longer presentation at https://lexbib.org/wikidatacon21
  • Looking for help with OAI-PMH mapping, and with determining distances between graph nodes for senses. Not using Wikidata because the data is too noisy and Wikidata has scaling issues with the addition of bibliographic data.
  • PhiloBiblon: The project has existed since 1975, and the database has been in development since 1981. 135,000 entries so far. It is currently a Windows database that is being mapped to Wikibase.
  • Data Model and Migration
    • Four different bibliographies and more than 22,000 records
    • Contributor interface: Windows app; multi-dimensional data model. It has been around for a really long time.
    • Web Publication (read-only)
    • Full and faithful archival CSV Dump.
  • Good model fits:
    • most assertions have an authority
    • dates model uncertainty
  • Problem areas:
    • controlled vocabulary: "dataclips"
    • markup and special characters
    • reciprocal relationships
  • Migration strategy
    • CSV dump > OpenRefine > QuickStatements
      • perhaps 60-80% of PB facts
      • Limitation: OpenRefine/QS require a P-item (property) at the top of each column. We have columns where some of the cells identify a P-item and some might not.
    • Pilot/test in a throwaway sandbox
    • Docker-based private instances
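
The per-column property limitation noted under the migration strategy can be illustrated with a small script that bypasses the spreadsheet step and writes QuickStatements V1 lines directly. This is only a sketch under stated assumptions: the column names (`item`, `author`, `language`, `property`, `value`), the property IDs and the function names are hypothetical, not part of the PhiloBiblon data model. Only the tab-separated item/property/value layout, with strings in double quotes, is taken from QuickStatements V1.

```python
import csv
import io
import re

# Hypothetical mapping from legacy column names to Wikibase property IDs.
COLUMN_TO_PROPERTY = {"author": "P1", "language": "P2"}

ITEM_ID = re.compile(r"Q[1-9]\d*")
PROPERTY_ID = re.compile(r"P[1-9]\d*")


def format_value(value):
    """Leave item IDs bare; wrap anything else in double quotes as a string.

    Escaping of quotes inside strings is not handled in this sketch.
    """
    return value if ITEM_ID.fullmatch(value) else '"' + value + '"'


def rows_to_quickstatements(csv_text):
    """Turn CSV rows into tab-separated QuickStatements V1 lines.

    Assumes each row has an 'item' column holding a Q-id. Fixed columns are
    mapped through COLUMN_TO_PROPERTY. The optional 'property'/'value' pair
    lets a single row name its own property.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = row["item"]
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = (row.get(column) or "").strip()
            if value:
                lines.append("\t".join([item, prop, format_value(value)]))
        # Per-row property: skipped unless the cell really is a property ID.
        prop = (row.get("property") or "").strip()
        value = (row.get("value") or "").strip()
        if PROPERTY_ID.fullmatch(prop) and value:
            lines.append("\t".join([item, prop, format_value(value)]))
    return lines
```

Rows whose `property` cell does not hold a property ID are silently skipped here; a real migration would more likely log them for manual review.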
  • Enumerated types 'dataclips' (see slides). We want to guide the contributors to choose the right terminology.
  • Sandbox > FactGrid
    • federated properties are slow
    • wish: items referenced by name (not number)
    • The problem is hardest for portions of the PB model that fit QSv1 better. OpenRefine helps but it does not reconcile P-items.
  • Union Types
    • PB authorities may be represented as strings or as item refs
    • wish: union type (string/ item)
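
Until such a union type exists, the string-or-item distinction could be tracked on the migration side. The following is a minimal sketch, assuming a cell counts as an item reference only when it looks like a Q-id; the names `ItemRef`, `Authority` and `parse_authority` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ItemRef:
    """Reference to a Wikibase item by its Q-id."""
    qid: str


# An authority is either free text or a reference to an item.
Authority = Union[str, ItemRef]


def parse_authority(raw: str) -> Authority:
    """Return an ItemRef if the cell looks like a Q-id, else the plain string."""
    raw = raw.strip()
    if re.fullmatch(r"Q[1-9]\d*", raw):
        return ItemRef(raw)
    return raw
```

A migration script could then route `ItemRef` values to an item-typed property and plain strings to a string-typed one, since a single Wikibase property cannot hold both.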
  • Special Characters/ Markup
    • superscripts are standard in MS analysis
    • notes fields contain authorial italics etc.
    • workarounds:
      • special Unicode characters
      • custom browser
      • include markup
  • Reciprocal relationships
    • son of > father of. How to say 'x is sponsor of y and y is sponsored by x' fairly automatically?
    • PB uses proper, gendered Spanish
  • GreenReaper says: I have it working, although there are issues that you need to be aware of.
  • See for example https://furry.wikibase.cloud/tools/cradle/?#/shex/E2
  • (It does not have an example of a multiselect; you might need to use the Project:Cradle syntax for that.)
  • In particular you need to use wdt: as https://furry.wikibase.cloud/wiki/EntitySchema:E2 does (compare E1).
  • Zoe Dobbs says: I wonder if you can define a datatype akin to Mathematical expression except with MS markup
  • GreenReaper says: Have you looked into the lexicographical data extension for the gendered language stuff? Not sure it is quite what you are looking for
  • It's got stuff like forms and senses in it, but it is arguably more for recording data about words than using the words...
    • No, not yet. Will look into it.
  • mw:Extension:WikibaseLexeme/Data Model
  • On reciprocal relationships
    • GreenReaper says: You can get that out with a query, that would be the expected way I guess.
    • Rather than having it explicit, like to find children you would find all those who have a father or mother in the reverse direction.
    • Another alternative could be a hierarchy of properties.
    • GreenReaper says: Perhaps that would work, or you could have "subclass of" and the son property is a subclass of a connection class.
    • Zoe Dobbs says: Yes there are many bots that do that on WD
    • GreenReaper says: wikibooks:SPARQL/Property paths might be useful with respect to determining relationships implicitly from explicit statements.
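
Both approaches discussed above, reading a relationship in the reverse direction and having a bot add the missing inverse statement, can be sketched over plain (subject, property, object) triples. The property IDs and function names below are hypothetical. In SPARQL the reverse lookup would typically be written with an inverse property path (`^`) instead.

```python
# Hypothetical property IDs, for illustration only.
FATHER = "P10"        # child -> father
SPONSORED_BY = "P11"  # work -> sponsor
SPONSOR_OF = "P12"    # sponsor -> work

# Pairs of properties that are inverses of each other.
INVERSE_OF = {SPONSORED_BY: SPONSOR_OF, SPONSOR_OF: SPONSORED_BY}


def children_of(statements, parent):
    """Find children implicitly: everyone whose 'father' statement points at parent."""
    return sorted(s for (s, p, o) in statements if p == FATHER and o == parent)


def missing_reciprocals(statements):
    """List the inverse statements a bot would need to add."""
    existing = set(statements)
    missing = []
    for s, p, o in statements:
        inverse = INVERSE_OF.get(p)
        if inverse and (o, inverse, s) not in existing:
            missing.append((o, inverse, s))
    return missing


statements = [
    ("Q2", FATHER, "Q1"),
    ("Q3", FATHER, "Q1"),
    ("Q5", SPONSORED_BY, "Q4"),
]
print(children_of(statements, "Q1"))
print(missing_reciprocals(statements))
```

The query-style lookup needs no extra statements, while the bot-style approach stores both directions at the cost of keeping them in sync.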
  • Will Hanley says: I have to leave. Thank you for the glimpses of your rich projects. I am an unsophisticated user, grateful to learn from all of you. If someone is interested in doing a bit of technical support for an hour or two for a bit of pay, drop me a line at whanley@pm.me. I hope to present in future. Thanks!
  • Zoe Dobbs says: I was just reading w:Help:Displaying a formula. For math, but can you mark up MSS in LaTeX? Also, a note item might not be keyword searchable.
  • GreenReaper says: That link got turned into having a smiley in it. Permanent link is https://en.wikipedia.org/w/index.php?title=Help:Displaying_a_formula&oldid=1079708572
  • https://github.com/wbstack/mediawiki/issues/57