Community Wishlist Survey 2021/Wiktionary

Wiktionary
4 proposals, 140 contributors, 218 support votes
The survey has closed. Thanks for your participation :)



Tool for recording voiced pronunciation of words

  • Problem: Wiktionary entries have a "Pronunciation" section. At present, adding audio means (1) recording an audio file, (2) uploading the file to Wikimedia Commons, (3) filling in the file description template on Wikimedia Commons, and (4) inserting a link to the file into the Wiktionary entry. This takes too long. A more fun and engaging way would be to click a "record" button in the Wiktionary entry and record (1) the pronunciation of the word or phrase, or (2) the pronunciation of a quoted example sentence. It would be a great step forward in Wiktionary development.
  • Who would benefit: Everyone who can listen.
  • Proposed solution: ? I think the developers and programmers will search for the solution. I hope.
  • More comments:
  • Phabricator tickets:
  • Proposer: Andrew Krizhanovsky (talk) 07:55, 17 November 2020 (UTC)
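
Step (4) of the current workflow, inserting the file link into the entry, is the part a "record" button would most obviously automate. Below is a minimal Python sketch of that step only. It assumes an English-Wiktionary-style ===Pronunciation=== heading and an {{audio|lang|file|caption}} template; the function name add_audio_line and the sample file name are hypothetical, and other Wiktionaries use different layouts and templates.

```python
import re

def add_audio_line(wikitext: str, lang: str, filename: str, caption: str = "Audio") -> str:
    """Insert an audio template line at the end of the first Pronunciation section.

    Assumption: the section heading reads "Pronunciation" and the wiki has an
    {{audio|lang|file|caption}} template. Both are illustrative, not universal.
    """
    audio_line = f"* {{{{audio|{lang}|{filename}|{caption}}}}}"
    lines = wikitext.split("\n")
    pron_heading = re.compile(r"^=+\s*Pronunciation\s*=+\s*$")
    any_heading = re.compile(r"^=+.*=+\s*$")

    for i, line in enumerate(lines):
        if pron_heading.match(line):
            # The section ends at the next heading or at the end of the text.
            end = i + 1
            while end < len(lines) and not any_heading.match(lines[end]):
                end += 1
            # Step back over trailing blank lines so the new line joins the list.
            insert_at = end
            while insert_at > i + 1 and lines[insert_at - 1].strip() == "":
                insert_at -= 1
            lines.insert(insert_at, audio_line)
            return "\n".join(lines)

    raise ValueError("no Pronunciation section found")
```

Recording and uploading (steps 1 and 2) are not covered here; they would need browser audio capture and an authenticated upload to Commons.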

Discussion

Integrating Lingua Libre could be a way to make it happen. I agree it is needed and could have great results in data collection. Audio is a way to capture diversity. Noé (talk) 08:54, 17 November 2020 (UTC)
  • @AKA MBG: I think this would be a good improvement for Wiktionary. It all makes sense. MemeGod27 8:49, 19 November 2020 (EST)
  • Lingua Libre is a growing project hosted by Wikimédia France, which aims to record pronunciations and signs online. It allows easy mass recording and uploads the files to Wikimedia Commons, and a bot integrates the files into Wiktionaries (FR and OC for the moment, hopefully more soon). More than 330,000 audio files have been recorded with Lingua Libre, in almost 100 different languages, thanks to 390 speakers. For more information, you can visit LinguaLibre:About. — WikiLucas (🖋️) 22:07, 20 November 2020 (UTC)

Comparison:

  • The Spell4Wiki app helps record and upload audio for Wiktionary words to Wikimedia Commons. The app is also a multilingual dictionary based on Wiktionary. It is a F/LOSS tool being developed by the Kaniyam Foundation and VGLUG with a few self-financed Tamil F/LOSS volunteers and Wikipedians in Tamil Nadu, India. The app lets you record and upload .ogg audio (sample file) easily. Uploaded audio files are then automatically linked to the appropriate Wiktionary word (sample word). All uploaded files are categorized under Files uploaded by Spell4Wiki and the relevant language categories. You can download the app from this link. We are working on linking audio files with Wikidata and on app improvements. For more details, you can visit Spell4Wiki. --Manimaran96 (talk) 20:41, 29 November 2020 (UTC)

Isn't there some kind of app that will sound out the International Phonetic Alphabet? Then you wouldn't have to upload anything. VaneWimsey (talk) 02:21, 12 December 2020 (UTC)

Voting

Adopt Lingua Libre Bot service as a WMF tool

  • Problem: Lingua Libre is a tool that makes it easy to quickly record large numbers of words, drawn from local lists, Wiktionary/Wikipedia categories, PetScan requests, and Wikidata Lexemes. The files are automatically uploaded to Commons with metadata about the speaker, and Lingua Libre Bot (code here) adds them to the corresponding entries on the FR and OC Wiktionaries as well as to Wikidata lexicographical data. But this bot is maintained by a volunteer and is sometimes stopped for weeks.
This service should be adopted as a WMF tool in order to make it more stable and adaptable to every Wiktionary.
  • Who would benefit: More than 300,000 audio files have been created via Lingua Libre, in ~100 languages, thanks to ~400 speakers. The project is growing and being adopted by communities around the world, and it makes it possible to illustrate entries on Wiktionaries. The adoption of Lingua Libre Bot as a WMF tool would be beneficial to all speakers, readers and users of Wiktionaries and Lingua Libre.
  • Proposed solution: Make the bot adaptable to all Wiktionaries, add it to Toolserver, and run it independently.
  • More comments: This request was formulated last year by Theklan and adapted by WikiLucas00.
  • Phabricator tickets:
  • Proposer: — WikiLucas (🖋️) 17:24, 29 November 2020 (UTC)

Discussion

  • Lingua Libre has, among others, 5000 recordings in the Gascon language (ISO 639-3 gsc, also called Occitan Gascon by Occitanists); the ISO 639-3 gsc code has been merged (why?) into oc, but it is not deprecated and is still active. However, Wikimedia does not take it into account. The use of gsc, or at least oc-gsc or oci-gsc, would be welcome for Lingua Libre as well as for Wiktionary.
  • It should also upload Lexeme audio to Wikidata, not only to Wiktionary. -Theklan (talk) 18:23, 11 December 2020 (UTC)

Voting


Something like Extension:Variables to simplify template calls

  • Who would benefit: The syntax of entries could be cleaner and more similar to the end result.
  • Proposed solution: This could be done with mw:Extension:Variables, but it is unavailable on Wikimedia. So… make it available or maybe develop a better extension of this kind?
  • More comments:
  • Phabricator tickets:
  • Proposer: PiotrekD (talk) 21:02, 17 November 2020 (UTC)

Discussion

  • Extension:Variables unfortunately will not be deployed to WMF wikis. Do any of the alternatives listed at mw:Extension:Variables#Alternatives work for you? If I understand you correctly, it sounds like you need the variable to persist across multiple template calls, which Extension:Variables can't do anyway. MusikAnimal (WMF) (talk) 23:27, 17 November 2020 (UTC)
  • Does the solution that en:Module:Citation/CS1 uses for automatic date formatting work for you? (Basically, read the page itself to extract a particular bit of wikitext that is structured reasonably.) See particularly reformat_dates in en:Module:Citation/CS1/Date validation. --Izno (talk) 05:41, 18 November 2020 (UTC)
    • @Izno: technically, yes. To be clear, I assume you are referring to mw.title:getContent(). We already exploit this "feature" as a means to categorize pages according to their part of speech, which is encoded as plain text since we've never fully adopted the automatic categorization via headword templates as enwiktionary does in wikt:en:Template:en-noun, for instance. However, beyond making the transcluding page record itself in WhatLinksHere, this feels like a hack as it depends on successful wikitext-based page parsing. Note this is not a one-pass action: given the specific configuration of plwiktionary's entry layout, we also need to perform this once per language section (take, for example, wikt:pl:Angola: 40 sections means invoking our hackish Lua parser 40 times). I just wouldn't like to keep adding more layers on top of that. Peter Bowman (talk) 11:16, 18 November 2020 (UTC)
  • Expanding on PiotrekD's problem description, entry-based projects (such as Wiktionaries) may expect significant gains from enabling this feature, especially regarding stuff that can perform semantic categorization of entries but currently doesn't, or at least not in the way categories are meant to work, relying instead on periodically inspecting page contents and maintaining large lists such as wikt:pl:Indeks:Francuski - Medycyna. This list collects all French entries related to medicine based on their transclusion of wikt:pl:Template:med, which doesn't accept a language parameter (precisely this would be nice for categorization purposes) and probably never will: we have tons of such templates used across the entire site, potentially making it quite tedious to update hundreds of thousands of transclusions, also accounting for the process of making our veteran editors aware of this change. In contrast, we could easily upgrade {{med}} and similar templates to fetch the corresponding language code, conveniently exposed in a variable that relates to the language section the template is placed in, and use it to categorize the page, with no need to alter the page contents at all. Peter Bowman (talk) 11:16, 18 November 2020 (UTC)
  • A variant of this wish is already in phabricator as T331906, which proposed climbing the heading tree to extract the language information, with an alternative proposed in T122934#9196348. See also Extension:ArrayFunctions for another take on this. Cscott (talk) 17:10, 3 May 2024 (UTC)
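
The idea discussed above, deriving a template's language from the section it sits in rather than from a parameter, can be illustrated with a small Python sketch. It is only an illustration of the "climb to the enclosing heading" logic, assuming that each language section starts with a level-2 heading whose text is the language name. plwiktionary's real section headings are structured differently, and an on-wiki solution would live in Lua or in the parser, not in Python. The function name languages_for_template is hypothetical.

```python
import re
from typing import List, Optional

# Matches wikitext headings such as "== French ==" or "=== Noun ===".
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")

def languages_for_template(wikitext: str, template: str) -> List[Optional[str]]:
    """For each line using {{template}}, return the enclosing level-2 heading text.

    Assumption: level-2 headings name the language section. None is returned
    for a template call that appears before any level-2 heading.
    """
    current_language: Optional[str] = None
    found: List[Optional[str]] = []
    marker = "{{" + template + "}}"

    for line in wikitext.split("\n"):
        match = HEADING.match(line)
        if match:
            # Only level-2 headings change the current language section.
            if len(match.group(1)) == 2:
                current_language = match.group(2)
            continue
        if marker in line:
            found.append(current_language)
    return found
```

A single pass over the page resolves every call of the template, which contrasts with the per-section parsing described in the discussion (one parser invocation per language section).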

Voting

Display definitions from Wikisource dictionaries

  • Problem: Wiktionaries aim to offer one definition for each meaning, but there are many ways to describe a meaning and many words, including local uses and very technical terms. Definitions from other dictionaries may be mentioned as references, but they are not accessible in Wiktionary even though some of them are in Wikisource.
  • Who would benefit: Readers wanting more than one definition.
  • Proposed solution: Many dictionaries are already in Wikisource and we can use them to offer more definitions. A dedicated transclusion of paragraphs from Wikisource into Wiktionaries could be a solution, done by hand or by bot, or through automatic harvesting of entries that carry specific tagging in the dictionaries hosted in Wikisources. They could come from several Wikisources, to be displayed in several Wiktionaries. It could be a new tab next to "Article" and "Talk", named "Dictionaries", with definitions for the same sequence of letters from dictionaries published in Wikisource. For French, I can imagine at least a dozen definitions from as many dictionaries. For an under-described language with at least one source in Wikisource, it could be an interesting way to compare the source with how the description evolves after its inclusion in Wiktionary.
  • More comments: Some dictionaries are already properly tagged; for the others, it could be a good opportunity to tag them according to the TEI Lex0 guidelines, so that they can more easily be reused in open source projects. Also, to pre-empt a common reaction whenever Wiktionary is discussed: no, Wikidata Lexemes would not be of any help here. This is pure content, not data, and it falls under CC BY-SA 3.0 both in Wiktionary and in the Wikisource dictionaries. This proposal is similar to a proposal posted last year by DaraDaraDara.
  • Phabricator tickets: T240191
  • Proposer: Noé (talk) 11:43, 29 November 2020 (UTC)
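
The "automatic harvesting" option in the proposed solution could look roughly like the Python sketch below, which collects definitions per headword from several tagged dictionaries. It assumes a simplified, namespace-free entry/form/orth/sense/def structure loosely modelled on TEI Lex0; real TEI files declare the TEI namespace and are richer, and Wikisource dictionaries are not necessarily tagged this way. The function name index_definitions and the dictionary names used with it are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List, Tuple

def index_definitions(sources: Dict[str, str]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each headword to a list of (dictionary name, definition) pairs.

    `sources` maps a dictionary name to an XML string. Assumption: entries use
    <entry><form><orth>headword</orth></form><sense><def>...</def></sense></entry>.
    """
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for name, xml_text in sources.items():
        root = ET.fromstring(xml_text)
        for entry in root.iter("entry"):
            orth = entry.find("./form/orth")
            if orth is None or not orth.text:
                continue  # skip entries without a usable headword
            headword = orth.text.strip()
            for definition in entry.iter("def"):
                if definition.text:
                    index[headword].append((name, definition.text.strip()))
    return dict(index)
```

A "Dictionaries" tab for a given page title would then only need to look up that title in the index and list the pairs it finds, each attributed to its source dictionary.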

Discussion

Voting