Talk:Community Wishlist Survey 2022/Generate Audio for IPA


Project Announcement and Feedback

Contributors who engaged with this Wish's proposal

Rollo Rosewood Akathelollipopman Eptalon Noé Xavier Dengra Akathelollipopman Noé Pigsonthewing Ainali Modest Genius Pigsonthewing 1234qwer1234qwer4 Nachtbold Xaosflux Femkemilene Wskent Bischnu Akathelollipopman Vis M Yodin Matě MrMeAndMrMe UV Daud I.F. Argana Huji Sdkb Ottawajin Lectrician1 Tmv Tranhaian130809 Celerias Meiræ Spiros71 NguoiDungKhongDinhDanh Javiermes Aca Dexxor Ed6767 Lollipoplollipoplollipop Omnilaika02 ToBeFree


Thank you for all of your feedback and for engaging with the original proposal for this wish. I wanted to make you aware that we have begun our work on this wish and, if your capacity allows, we would love any input you have on our Open Questions as well as our initial investigations into the engines.

Here's a corpus of IPA audio we have tested. Please let us know if you have any words you would like to test in this testing corpus. We will work on adding those words to our corpus!
Here's a technical investigation of the IPA options and the languages supported by each option.


Thanks again for engaging with this impactful wish and for participating on the wishlist.
Best, NRodriguez (WMF) (talk) 18:01, 20 May 2022 (UTC)

Contributors who engaged with this Wish's proposal

Nw520 Pelagic Wostr Gusfriend Ali Imran Awan TheInternetGnome Minorax Man77 NightWolf1223 HynekJanac L235 Libcub Teratix Penalba2000 JAn Dudí Lrkrol Sadads Bencemac Mbkv717 Stwalkerster Dave Braunschweig Trey314159 Labdajiwa Thingofme Pppery Hià Paradise Chronicle Serg! Camillu87 Geertivp Amorymeltzer Aimwin66166 Rotavdrag Paucabot WikiAviator Daniel Case Wutsje Ninepointturn Bilorv Pi.1415926535 DarwIn Feoffer Tomastvivlaren Kpjas SD0001 Lambsbridge Paul2520 Waldyrious Bestoernesto Michael Barera Vulphere Ericliu1912 Emaus KnowledgeablePersona Beta16 Bodhisattwa Pbsouthwood DaxServer Cybularny Quiddity Sunpriat Gaurav Jl sg Evrifaessa Valerio Bozzolan Brainulator9

NRodriguez (WMF) (talk) 18:08, 20 May 2022 (UTC)

Open Questions

Can you help us build out the corpus of IPA words we will use to test the different libraries?

  • Have any tonal languages been included? I don’t think I see Swedish or any Chinese language, for example, but maybe there are some tonal languages in the corpus that I don’t recognize. Also, does the current corpus include unusual consonants or vowels? I have tested eSpeak myself and know that it cannot handle Cantonese (it cannot pronounce the syllabic m; I tried to figure out how to fix it but there’s really no documentation). Al12si (talk) 14:44, 12 November 2022 (UTC)

Do you know of any open source libraries that we should consider while we investigate our options?

Do you see any risks to introducing the video files inside the reader experiences?

  • "Video"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:21, 27 May 2022 (UTC)
    I believe this is regarding the software extension used to play media files. There's a specific task for making the player display in a desirable way, at phab:T122901 (versus the full audio-player as currently used at d:shibboleth, or the icon+"listen" links as used at w:Shibboleth).
    The only risk I see is making sure the design is good: i.e. everyone (incl. screenreaders?) can access the audio clip without leaving the page, but can also still reach the file/license info if desired. (@TheDJ: FYI) HTH. Quiddity (talk) 17:27, 27 May 2022 (UTC)
  • I think the main issue with this feature is that it could present a false standard accent, making English projects sound more USA-centred, French projects more France-centred, Spanish projects more Madrid-centred, and so on. A written transcription can be prototypical, with approximate sounds for each consonant and vowel. An audio clip can't: it fixes one version, with subtle traits such as the length, height and openness of vowels, pitch and others. There is no generic or neutral pronunciation. One way to deal with this issue may be to offer several audio clips for each IPA transcription, with regional distinctions. Combined with a preference that lets users hear their own local variant first, this could be interesting and less oppressive. Anyway, I am interested in this feature and I really hope you will make your UX tests public -- Noé (talk) 15:55, 7 November 2022 (UTC)

Let us know any other thoughts you may have on the initial problem statement...

The Wikivoyages have phrasebooks. They don't use IPA – see voy:en:Wikivoyage:Phrasebook article template#Pronunciation guide for the English version; the other languages are similar – but they might be a useful source of words, and it's possible that getting IPA-based audio would encourage people to add IPA there. In the past, we've talked about both the value of IPA to some readers and the need for audio (specifically, being able to hear the IPA without loading another page or covering up the text you're reading). Whatamidoing (WMF) (talk) 18:15, 30 May 2022 (UTC)

Google Cloud dependency?

Is it the case that this feature is dependent on closed-source software in the Google Cloud, or is it independent and self-hosted? HLHJ (talk) 16:56, 15 October 2022 (UTC)

Currently, yes, it depends on Google Cloud. The open source solutions we found only supported a handful of languages, and didn't sound remotely as accurate as Google's TTS service. Rest assured this is all done through the backend, and even then through a proxy, so no user data ever gets to Google. Longer-term we hope to switch back to open source once language support and quality are good enough. That is being tracked at phab:T317274. MusikAnimal (WMF) (talk) 03:13, 17 November 2022 (UTC)

Schedule

@MusikAnimal (WMF) and @Whatamidoing (WMF) and @NRodriguez (WMF), can you please fill in/update Community Wishlist Survey 2022/Generate Audio for IPA#Release timeline ? —TheDJ (talkcontribs) 12:35, 23 November 2022 (UTC)

@TheDJ: I've made a start and will do some poking   ~TheresNoTime-WMF (talk) 20:43, 23 November 2022 (UTC)