Gathering a lasting, recognized community of lexicologists around a relational lexicological data bank and related consultation services.
created on 04:05, 26 October 2018 (UTC)

The Wikitrace project aims to gather a lasting, recognized community of lexicologists around a relational lexicological data bank and related consultation services.


This project has its roots in members of the TWUG who wish to provide services around Wiktionaries that can hardly be provided directly by the Wiktionary instances themselves, especially when it comes to cross-data relations within and between Wiktionary instances.


The project aims to offer Wiktionary communities technological facilities to:

  • coordinate actions across language barriers;
  • ease the crossing and reuse of lexicological data coming from Wiktionaries;
  • make more accessible, and thus more visible:
    • the miscellaneous lexicological classifications in use,
    • the respective coverage of the description of each lexical item in each language;
  • integrate thoroughly, yet flexibly, the essential role of the attestations of each lexical item being documented;
  • ease the creation and querying of relationships.

Non-goals:

  • hardwire a particular grammar or other linguistic model into Wikitrace's own model: the ontology of Wikitrace must be flexible enough to model them, but must not assume them as a requirement of the model.


Being useful to the Wiktionary community is at the heart of this project, including helping the community grow beyond its current members. The project also aims to make the data stored in Wiktionaries more useful to the rest of the world. Compatibility with the existing work, requirements and standards that the community has developed is essential for Wikitrace to succeed, as is staying open to relevant evolutions.

Analysis of needs

Data Model

Here is a first draft of constraints to guide the model specification:

  • the main entities modeled are (1) attestations, (2) descriptions and (3) lexical items
  • synonymy, antonymy and similar relations link descriptions, not lexical items

<uml>
class attestation {
  String transcription
}

class attribution {
  Person attributor
  Person attributed
  Date reporting
  Date performance
  URL source
}

class description {
  String utterance
}

enum description_type {
  definition
  etymology
  other
}

class lexical_item {
  Vector<Statement>
}
</uml>
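To make the draft constraints more concrete, here is a minimal Python sketch of the same entities. It is only an illustration under stated assumptions, not a specification: the `Relation` class, the `label` field, and the links from descriptions to attestations and from lexical items to descriptions are hypothetical additions, and the draft's undefined `Statement` type is approximated by `Description`. The `attribution` class is left out for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DescriptionType(Enum):
    DEFINITION = "definition"
    ETYMOLOGY = "etymology"
    OTHER = "other"


@dataclass
class Attestation:
    transcription: str


@dataclass
class Description:
    utterance: str
    type: DescriptionType = DescriptionType.OTHER
    # Assumption: a description may be backed by zero or more attestations.
    attestations: List[Attestation] = field(default_factory=list)


@dataclass
class LexicalItem:
    # Assumption: `label` is a hypothetical field, and the draft's
    # undefined `Statement` is approximated here by Description.
    label: str
    descriptions: List[Description] = field(default_factory=list)


@dataclass
class Relation:
    # Hypothetical class: relations such as synonymy link two
    # descriptions, never two lexical items directly.
    kind: str
    source: Description
    target: Description


big = LexicalItem("big", [Description("of great size", DescriptionType.DEFINITION)])
large = LexicalItem("large", [Description("of considerable size", DescriptionType.DEFINITION)])
synonymy = Relation("synonymy", big.descriptions[0], large.descriptions[0])
```

Attaching the relation to descriptions rather than to lexical items means that a word with several senses can be a synonym of another word in one sense only.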


  • communicate about the project to potentially interested people
  • improve this page
    • Done: use project template
    • add roles which should be filled
    • create a dedicated logo; a picture has already been proposed as a temporary choice, but we should come up with something more transcultural than a stylized letter T.
  • look for possible help in technical advice meetups
  • find people to fill each defined role
  • specify the data model ("ontology"), that is, how the entries from Wiktionaries should be integrated into a collection of predicate triples
  • create a Wikibase instance to host the data bank:
    • determine where and how to install it
  • integrate Wiktionary dumps in the instance
  • build querying external services for
  • make the database queryable from Wiktionaries
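As a rough picture of what a collection of triples could look like, the sketch below stores a few (subject, predicate, object) statements in plain Python and filters them by pattern. All identifiers (`item:big`, `has_description`, `synonym_of` and so on) and the `match` helper are hypothetical, chosen only for illustration; the actual ontology and storage (for example a Wikibase instance) remain to be specified.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical identifiers and predicates, for illustration only.
triples: List[Triple] = [
    ("item:big", "has_description", "desc:big-1"),
    ("desc:big-1", "utterance", "of great size"),
    ("item:large", "has_description", "desc:large-1"),
    ("desc:large-1", "utterance", "of considerable size"),
    ("desc:big-1", "synonym_of", "desc:large-1"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# All synonymy links, which relate descriptions rather than lexical items.
synonyms = match(triples, p="synonym_of")
```

A real deployment would more likely expose this kind of pattern matching through a SPARQL endpoint than through an in-memory list.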


Open roles

Anyone interested is welcome to join; participation is especially encouraged in the following ways:

  • Analyzing needs, matching them with a relevant initial data model, and handling its evolution
  • Development, to tweak MediaWiki and Wikibase to fit our needs
  • System administration, to install and maintain our MediaWiki instance, as well as a SPARQL engine
  • Data analysis coupled with user experience design, with a focus on data output, visualization and the query engine.

Please add yourself to the participants list below if you are interested in playing at least one of these roles.

Note: special thanks to Laura Hale, who came up with the initial list of roles.



  • Other suggested names for the project:
    • WikibaseLex
    • WikiLex

Related documents