WikiProject Language samples

Wikipedia exists in many languages, and in each of them we have articles about languages. Many of these are already high-quality articles in which you can find information about the number of speakers, the vocabulary, the language family and the grammar of a language. But one very natural question about a language is often left unanswered: "How does this language sound?"

This project seeks to change that. Its goal is to add a short sample of text spoken by a native speaker to every article about a language. The first article of the Universal Declaration of Human Rights (UDHR) is an appropriate choice for this, since it has been translated into "all" languages of the world, is in the public domain, and is the right length to be included in a Wikipedia article.

This is an example of how this could look and sound for the Japanese language:

 
すべての人間(にんげん)は、生(う)まれながらにして自由(じゆう)であり、かつ、尊厳(そんげん)と権利(けんり)とについて平等(びょうどう)である。人間(にんげん)は、理性(りせい)と良心(りょうしん)とを授(さず)けられており、互(たが)いに同胞(どうほう)の精神(せいしん)をもって行動(こうどう)しなければならない。
subete no ningen wa, umarenagara ni shite jiyū de ari, katsu, songen to kenri to ni tsuite byōdō de aru. ningen wa, risei to ryōshin to o sazukerarete ori, tagai ni dōhō no seishin o motte kōdō shinakereba naranai.
All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

Project history

The project benefited from a LibriVox project that also aimed to create recordings of the Universal Declaration of Human Rights in over 50 languages. MichaelSchoenitzer imported those recordings to Commons, edited the files (see below) and extracted the first article. Through a project in the German Wikipedia they were included in the articles there.

Now it is time to turn this into an international community project: Wikipedians all around the globe can record the first article (or even the whole document) in their mother tongue, and the Wikipedia communities can add the recordings to their language articles.

How to create a recording

 

Would you like to read the first article of the UDHR in your mother tongue, or can you convince someone else to do so? Awesome.

First, get a microphone. Cheap microphones naturally have lower sound quality, but if you follow the instructions below, even a cheap microphone will give reasonable results. If you live in a country with a local chapter, you can ask whether they can help you get a microphone; in Germany, for example, you can borrow one from Wikimedia Deutschland.

You can get the translation of the UDHR in your language at OHCHR.org. Find the first article, copy it into an editor or word processor, and format it so that you can read it comfortably. Before you start recording, read it aloud two or three times and drink some water. If you misread a word, simply read the word or group of words again and cut out the wrong version later.

Very important: make sure you also record at least 5 seconds of silence at the beginning or end of the recording. This is needed for editing. If you have never made a recording before, we recommend the free software Audacity. If you have a passive microphone (without a power supply), activate the microphone boost and set the volume control to maximum. For an active microphone, make sure the input level is not so high that the signal reaches the maximum level while recording.

After recording, select the silent part you recorded, go to Effect -> Noise Reduction and click the Get Noise Profile button. Then select the whole recording (Edit -> Select -> All or the hotkey CTRL + A), go to Effect -> Noise Reduction again and click the OK button. After that you can remove the silence and any misread sections by selecting them and pressing Del.

Next, apply the filters Compressor, Leveller and Normalize from the Effect menu, in this order. The default settings should be fine. When you are done, go to File -> Export Audio, choose Ogg Vorbis as the format and save the file.
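The level and silence checks described above can also be expressed in a few lines of code. The following is only a minimal sketch in Python, not part of the Audacity workflow: it assumes the recording is already available as a list of floating-point samples between -1.0 and 1.0, and all function names and thresholds (`peak_dbfs`, `is_clipping`, `leading_silence_seconds`, `peak_normalize`, the 0.01 silence threshold) are hypothetical choices made for illustration. The normalization shown is plain peak normalization and is not claimed to match Audacity's implementation.

```python
import math


def peak_dbfs(samples):
    """Return the peak level in dBFS for float samples in [-1.0, 1.0]."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")
    return 20.0 * math.log10(peak)


def is_clipping(samples, threshold=0.999):
    """True if any sample reaches (almost) full scale, i.e. the gain was too high."""
    return any(abs(s) >= threshold for s in samples)


def leading_silence_seconds(samples, sample_rate, silence_threshold=0.01):
    """Length of the quiet stretch at the start of the recording, in seconds."""
    for index, s in enumerate(samples):
        if abs(s) > silence_threshold:
            return index / sample_rate
    return len(samples) / sample_rate


def peak_normalize(samples, target_dbfs=-1.0):
    """Scale all samples so that the loudest one sits at target_dbfs."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = 10.0 ** (target_dbfs / 20.0) / peak
    return [s * gain for s in samples]


# Synthetic stand-in for a recording: 5 s of silence, then 1 s of a 440 Hz tone.
rate = 8000
recording = [0.0] * (5 * rate) + [
    0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)
]

print("leading silence (s):", leading_silence_seconds(recording, rate))
print("peak level (dBFS):", round(peak_dbfs(recording), 2))
print("clipping:", is_clipping(recording))
```

A real recording would first have to be decoded into samples; the synthetic tone is used here only so that the sketch runs without any audio file.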

Upload your recording to Wikimedia Commons, put it in the category Audiorecordings of Article 1 of the Universal Declaration of Human Rights and add it to the list below.
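When uploading, Commons asks for a file description page. As a rough illustration, the sketch below assembles such a page as wikitext using the common {{Information}} template and the category named above. The function name `build_description`, the example user name, the date and the default license tag are assumptions made for this example only; choose the license you actually want to release your recording under.

```python
# Category name as given on this project page.
CATEGORY = "Audiorecordings of Article 1 of the Universal Declaration of Human Rights"


def build_description(language, author, date, license_tag="cc-by-sa-4.0"):
    """Return wikitext for a Commons file description page (illustrative only)."""
    description = (
        "Article 1 of the Universal Declaration of Human Rights, "
        "read in " + language + " by a native speaker."
    )
    lines = [
        "== {{int:filedesc}} ==",
        "{{Information",
        "|description={{en|1=" + description + "}}",
        "|date=" + date,
        "|source={{own}}",
        "|author=" + author,
        "}}",
        "",
        "== {{int:license-header}} ==",
        "{{self|" + license_tag + "}}",  # assumed license, adjust as needed
        "",
        "[[Category:" + CATEGORY + "]]",
    ]
    return "\n".join(lines)


# Hypothetical values for illustration.
print(build_description("Japanese", "[[User:Example|Example]]", "2015-01-01"))
```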

More tips for high-quality audio samples can be found in: A short guide to the recording of high-quality audio samples for Wiktionary

Project Status

So far we have recordings in the following languages:

| Language | Full recording | Recording of Article 1 | German Wikipedia | your Wikipedia… |
|---|---|---|---|---|
| Afrikaans | Done | Done (link 1) | no | |
| Arabic | Done | Done (link 1) | Done | |
| Acehnese | Done | Done (link 1) | Done | |
| Balinese | Done | Done (link 1) | ?? | |
| Basque | Not done | Done (link 1) | Done | |
| Brazilian Portuguese | Done | Done (link 1) | Done | |
| Buginese | Done | Done (link 1) | Done | |
| Bulgarian | Done | Done (link 1) | Done | |
| Catalan | Done | Done (link 1) | Done | |
| Chinese (Mandarin) | Done, 2 versions | Done (link 1) | | |
| Czech | Done | Done (link 1) | Done | |
| Danish | Done | Done (link 1) | Done | |
| Dutch | Done, 2 versions | Done (link 1, link 2) | Done | |
| English | Done, 2 versions | Done (link 1) | | |
| Esperanto | Done | Done (link 1) | Done | |
| Faroese | Done | Done (link 1) | no | |
| Finnish | Done | Done (link 1) | no | |
| French | Done, 3 versions | Done (link 1, link 2, link 3) | Done | |
| German | Done | Done (link 1) | no | |
| Modern Greek | Done | Done (link 1) | | |
| Hebrew | Done, 2 versions | Done (link 1) | Done | |
| Hindi | Done | Done (link 1) | no | |
| Hungarian | Done | Done (link 1) | Done | |
| Indonesian | Done, 2 versions | Done (link 1, link 2) | no | |
| Italian | Done, 2 versions | Done (link 1) | Done | |
| Japanese | Done | Done (link 1, link 2) | Done | |
| Javanese | Done | Done (link 1) | Done | |
| Javanese (Semarang) | Done | To do | no article | |
| Kapampangan | Done | Done (link 1) | Done | |
| Korean | Done | Done (link 1) | Done | |
| Latin | Done, 2 versions | Done (link 1) | no | |
| Latvian | Done | Done (link 1) | Done | |
| Luxembourgish | Done | Done (link 1) | Done | |
| Malay | Done | Done (link 1, link 2) | | |
| Minangkabau | Done | Done (link 1) | Done | |
| Nynorsk | Done | Done (link 1) | To do | |
| "plain" ??? | Done | To do | ??? | |
| Occitan (Languedocien) | Done | Done (link 1) | no | |
| Oriya | Done | Done (link 1) | Done | |
| Persian | Not done | Done (link 1) | | |
| Polish | Done, 2 versions | Done (link 1) | Done | |
| Portuguese | Done, 2 versions | Done (link 1, link 2) | Done | |
| Romanian | Done (very bad quality) | To do | | |
| Russian | Done | Done (link 1) | Done | |
| Swedish | Done, 2 versions | Done (link 1) | Done | |
| Slovak | Done | Done (link 1) | Done | |
| Serbian | Not done | Done (link 1) | To do | |
| Sesotho / South Sotho | Not done | Done (link 1) | Done | |
| Spanish | Done, 2 versions | Done (link 1, link 2) | Done | |
| Sundanese | Done | Done (link 1) | Done | |
| Tagalog | Done | Done (link 1) | Done | |
| Tamil | Done, 2 versions | Done (link 1) | no | |
| Turkish | Not done | Done (link 1, bad quality) | To do | |
| Ukrainian | Done | Done (link 1) | Done | |
| Urdu | Done | Done (link 1) | Done | |
| Walloon | Done | Done (link 1) | Done | |
| West Frisian | Done | Done (link 1) | Done | |
| Yiddish | Done | Done (link 1) | no | |



Open Questions and Tasks

  • Should we also make recordings in different dialects?
  • How do we link the audio files on Wikidata?
  • How do we reach native speakers of small languages?
  • Design a logo for this project