Community Wishlist Survey 2022/Generate Audio for IPA/ha

Audio for IPA notation
	Generate the audio for IPA notation for readers of the WMF projects
Group:	Wikimedia Language Engineering
Team members:	Pau Giner, Amir Aharoni, Santhosh Thottingal, Niklas Laxstrom, Kartik Mistry, Mary Munyoki, Nik Gkountas
Lead:	Pau Giner (product owner), Niklas Laxstrom (engineering manager)
Updates:	Updates

This page is a translated version of the page Community Wishlist Survey 2022/Generate Audio for IPA and the translation is 9% complete.

Wish Objective Summary: Generate the audio for International Phonetic Alphabet (IPA) notation for readers of the WMF projects.

Hello all, and thanks for coming to read more details about Audio for IPA Notation. This was the #9 wish in the Community Wishlist Survey 2022. This article outlines our approach to building a solution for this wish. We are asking for your feedback and insight so that we can make the best possible improvement.

Original Wish

Background & Problem Space

What is IPA?

The English Wikipedia page for International Phonetic Alphabet (IPA) describes it as an alphabetic system of phonetic notation based primarily on the Latin script. IPA is a standardized written representation of the documented speech sounds that humans can produce with their vocal organs.

This notation allows contributors to add IPA to our projects so that readers who do not know how to pronounce a word can learn how. However, because very few people can read this notation, most readers cannot work out a pronunciation from IPA alone. Strong IPA coverage across the projects in the Wikiverse therefore does not necessarily mean that readers will know how to pronounce a word.

Where does IPA Exist in our Projects?

IPA exists in multiple templates inside the WMF-supported projects. As of today, the Wikidata item associated with Template:IPA lists the following:

Wikipedia: 177 entries
Wikibooks: 21 entries
Wikinews: 1 entry
Wikiquote: 3 entries
Wikisource: 5 entries
Wikiversity: 4 entries
Wikivoyage: 5 entries
Wiktionary: 85 entries
Multilingual sites: 3 entries

Here are two illustrative instances of IPA in our projects:

Where is the hidden complexity of the work ahead of us?

Given that there are many instances of IPA coverage in the WMF products, a large portion of the complexity will be in understanding which of the libraries available to us have the best coverage across languages.

There is also the pre-existing reality that IPA lacks dedicated keys for some of the speech sounds that certain languages can make.

A large portion of the complexity will include:

Investigating, comparing, and testing which library gives the projects the most language coverage while still remaining a stable solution
- Here is a demo of the different Text to Speech engines and how they sound with a sample corpus of IPA input
- Here is the public documentation of the investigations into the strengths and weaknesses of each of the engine options
Ensuring that the phonetic library we decide to use benefits the greatest number of IPA instances across the projects

Scope and Constraints

The focus of this project is around readers’ inability to read IPA markup.

The project will NOT address the pre-existing equity concerns associated with IPA. While we will do our best to mitigate the equity shortcomings of IPA, tackling them is not in scope.
The project will NOT generate a new phonetic library; we will use a pre-existing audio library.

The project will NOT address the content generation of IPA notation on the wikis, i.e. the editor experience. This is mitigated by offering rendered audio for IPA notation that has no pre-existing audio associated with it.

Data Investigations and Unknowns

How many users attempt to listen to the pronunciation of IPA notation?
How many instances of all IPA templates exist across the projects?
What’s the coverage of human generated pronunciation of IPA on our projects?
What content is missing IPA notation that needs it?
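As a rough illustration of the second question, the sketch below counts IPA template calls in a single piece of wikitext. It is a toy under stated assumptions, not the method used for the investigation: the name pattern (a template name starting with "IPA", optionally followed by a suffix such as "-fr" or "c-en") and the sample text are assumptions, and a real count across projects would need database or dump access rather than a regular expression.

```python
import re
from collections import Counter

# Assumption: IPA templates are named "IPA" plus an optional suffix
# (for example "IPA-fr" or "IPAc-en"). Names containing spaces are not matched.
IPA_TEMPLATE = re.compile(r"\{\{\s*(IPA[\w-]*)\s*[|}]")


def count_ipa_templates(wikitext: str) -> Counter:
    """Return how often each IPA-style template name appears in the wikitext."""
    return Counter(IPA_TEMPLATE.findall(wikitext))


# Hypothetical sample wikitext, for illustration only.
sample = (
    "'''Paris''' ({{IPA-fr|paʁi}}) is also written "
    "{{IPAc-en|ˈ|p|ær|ɪ|s}} or {{IPA|/ˈpærɪs/}}."
)

counts = count_ipa_templates(sample)
print(sum(counts.values()), dict(counts))
```

Grouping by template name keeps the per-template breakdown visible, which matters because different wikis use different IPA templates.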

Why and how did we accept this wish?

This wish scored high in our prioritization process for 2022. It was very popular in terms of number of votes, impactful in terms of the benefit for the community, and had a relatively low complexity estimate. Please read about our full process.

Release timeline

Item	Owner	Status	Actual Date	Target Date	Notes
Confirm all tasks merged and ready
Legal review
Security review of Phonos extension (signoff)	KSiebert (WMF)	Done	2022-09-01	N/A	This allowed us to deploy to the beta cluster only.
Deploy Phonos to beta cluster	TheresNoTime-WMF	Done	2022-10-11	N/A	Beta en.wiki, en.wikt and en-rtl.wiki only. Unexpected delays due to T317195, T317417.
Security review of Phonos extension (approval)		Done	2022-11-25	2022-11-25
Determine pilot projects for Phonos rollout	STei (WMF)	Done	2022-11-15		See phab:T314294
Announcement in Tech/News
Deploy to testwiki	MusikAnimal	Done	2022-11-29	2022-11-30	Including a simulation of a larger-scale template deployment to test job queuing
Deploy to pilot wikis	HMonroy (WMF)	Done	2023-01-17	2023-01-09
Deploy to Group 0					The remaining wikis (outside of test and pilots) may not all be done by Group if we end up slow-rolling this.
Deploy to Group 1
Deploy to Group 2
Announcement on project page & any tool-specific pages
Announcement in #release-announcements Slack channel
Bugs identified and cut
Bugs triaged

Status Updates

Update: Inline audio player mode of Phonos enabled on some wikis

Hello everyone,

Some wikis can now use the inline audio player implemented by the Phonos extension. Your audio links can play on click on wikis where the inline audio player has been enabled.

With the inline audio player, you can add text-to-speech audio snippets to wiki pages by simply using a tag:

<phonos file="audio file">Listen</phonos>

The tag above displays the text next to a speaker icon, and clicking it plays the audio immediately without taking you to another page. A common case where you can use this feature is adding pronunciation to words, as illustrated on the English Wiktionary below.

{{audio|en|en-uk-English.oga|Audio (UK)}}

Could become:

<phonos file="en-uk-English.oga">Audio (UK)</phonos>

You can read about the feature and give us feedback or ask questions about it on this talk page.
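The template-to-tag rewrite shown above can be sketched as a small script. This is a minimal illustration, not part of the Phonos extension: the function name `audio_to_phonos` is hypothetical, and the regular expression assumes the three-parameter form of the audio template shown in the example (language code, file name, label).

```python
import re

# Matches {{audio|<lang>|<file>|<label>}} with exactly three positional
# parameters, as in the example above. Other forms are left untouched.
AUDIO_TEMPLATE = re.compile(
    r"\{\{\s*audio\s*\|\s*([^|{}]+?)\s*\|\s*([^|{}]+?)\s*\|\s*([^|{}]+?)\s*\}\}",
    re.IGNORECASE,
)


def audio_to_phonos(wikitext: str) -> str:
    """Rewrite three-parameter audio templates as <phonos file="..."> tags."""

    def replace(match: re.Match) -> str:
        # The language code is not used by the file-based phonos tag shown above.
        _lang, filename, label = match.groups()
        return f'<phonos file="{filename}">{label}</phonos>'

    return AUDIO_TEMPLATE.sub(replace, wikitext)


print(audio_to_phonos("{{audio|en|en-uk-English.oga|Audio (UK)}}"))
# prints: <phonos file="en-uk-English.oga">Audio (UK)</phonos>
```

Any change of this kind should be reviewed by hand, since templates on a given wiki may take extra parameters that this sketch does not handle.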

June 2023 Update: IPA transition to Language Team

Hello all,

We have news about IPA Audio Renderer, the 9th most popular wish of the 2022 Community Wishlist Survey.

Community Tech will hand over the project to the Language Team this June. This decision is due to the Language Team's expertise in localization and their focus on creating a suite of open language-supporting services such as the MinT machine translation service.

In addition to machine translation, the Language team plans to expand the offering of open language services with Text-to-Speech, creating a stable technological foundation for projects such as the IPA Audio Renderer.

Also, given the need to optimize Wikimedia's 2023-24 annual budget, the above makes them the right team to expand and maintain the IPA Audio Renderer, since they have resources for this kind of work.

Over the past few months, Community Tech has collaborated with some of you to build and deploy the feature on some pilot wikis. Thank you for working with us.

If you have any questions regarding the deployment of the IPA Audio Renderer, please ask on our talk page.

We hope you will collaborate kindly with the Language Team in the subsequent phases of this project.

Update: Open Questions: We want to hear from you!

Can you help us build out the corpus of IPA words we will use to test the different libraries?
Do you know of any open source libraries that we should consider while we investigate our options?
Do you see any risks to introducing the audio files inside the reader experiences?
Let us know any other thoughts you may have on the initial problem statement on the talk page!