Community Wishlist Survey 2019/Citations/Improve document type prediction from citoid

Improve document type prediction from citoid

  • Problem: Making it easier to add sources is very important for the user experience of the Wikipedias.
  • Who would benefit: Citoid users -- some of the most important Wikimedians, measured by improvements to the encyclopedia
  • Proposed solution: The ORES team has made a generalized predictions-as-a-service interface called JADE; this is a machine learning task that would likely benefit from it.
  • More comments: by Wednesday November 14th
  • Phabricator tickets: to be created upon wish approval
  • Proposer: James Salsman (talk) 02:55, 4 November 2018 (UTC)

Please see also wikitech:Citoid. The task is to correctly predict the document type (e.g., Journal, Book, News item, Web page) from the URL.
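To make the task concrete, here is a minimal rule-based baseline in Python. It is only a sketch: the function name `predict_document_type` and the hint tables are hypothetical, it does not use Citoid, Zotero, ORES, or JADE, and the hand-written rules are illustrative guesses rather than reliable signals. A learned model would presumably replace them with associations trained on labelled citations.

```python
from urllib.parse import urlparse

# Hypothetical hint tables, for illustration only. The labels reuse the
# document types named in the proposal.
HOST_HINTS = {
    "doi.org": "Journal",          # rough guess: DOIs also exist for books
    "books.google.com": "Book",
    "nytimes.com": "News item",
    "bbc.co.uk": "News item",
}
PATH_HINTS = [
    ("/books/", "Book"),
    ("/news/", "News item"),
]


def predict_document_type(url: str) -> str:
    """Guess a document type from the URL alone, defaulting to 'Web page'."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Match the host itself or any subdomain of a known domain.
    for domain, doc_type in HOST_HINTS.items():
        if host == domain or host.endswith("." + domain):
            return doc_type

    # Fall back to path fragments that often indicate a document type.
    path = parsed.path.lower()
    for fragment, doc_type in PATH_HINTS:
        if fragment in path:
            return doc_type

    return "Web page"


if __name__ == "__main__":
    for example in (
        "https://www.nytimes.com/2018/11/04/example.html",
        "https://books.google.com/books?id=abc",
        "https://example.org/about",
    ):
        print(example, "->", predict_document_type(example))
```

A baseline like this mainly gives a point of comparison: a machine learning approach would only be worth the effort if it clearly beats a simple lookup on the URLs where the type is currently wrong.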


Please ask me questions about this proposal. James Salsman (talk) 18:06, 13 November 2018 (UTC)

My general impression is that this would be quite a lot of work (you'd need to build a type-predicting API), and it is not totally clear to me how we'd even use the API. Zotero determines the type manually and there's no easy way to "change" the type of an item in the middle of the request. And at the end of all of this, I'm sceptical that the type matters all that much: maybe for subtle differences in citation style, but, for instance, Template:Citation can format any citation type. Even if an item is mis-typed, the wrong citation template will still correctly display information about the right item, which is the most important part. I get that the mis-typing is annoying, though. It might make a cool GSoC or Outreachy project. Mvolz (WMF) (talk) 14:03, 16 November 2018 (UTC)