Language policies

For the policy on new language editions of existing projects, see Meta:Language proposal policy.
This is a discussion page for presenting current informal language policies, and potential ideas for the future.

There are currently few language policies on individual Wikipedia projects, on Meta, or for the Wikimedia Foundation in general. The lingua franca in use on Meta and for WMF documents is English; there is broad support for the general idea that content should be made available in as many languages as possible.

The status quo

Project policies

Most Wikipedias, and most other projects, have a policy that no entries should be in languages other than the official project language. Sometimes this takes an extreme form, such as English Wikibooks administrators suggesting that a book should not contain even a single page that is not in English.

Meta and Commons policies

Meta, the Wikimedia Commons, and multilingual Wikisource are explicitly multilingual; people can post content in whatever language they please, and are encouraged to create language-specific portals. On Wikisource, however, editors are encouraged to put content into a language-specific subdomain if one exists.


New project creation

See Requests for new languages for details.

The general policy is that new projects should be created in any living written language and any other language that can effectively make a case for itself.


Announcement dissemination

When Foundation announcements come out, they are broadcast for translation to a wide assortment of translators. No specific priority is currently given to one language over another, though languages with more active participants, and with more active participants on Meta or in the Foundation, have a greater chance of being translated promptly.

The largest regular translations are announcements about fundraisers and about Wikimedia-wide elections.

Press release dissemination

When press releases come out, they are usually written in a single language, with translation solicited and provided as an afterthought. Exceptions are made for all-project milestones intended for global release and announcement.

Ideas for the future

Language data and priorities

Things we may want to track:

  1. the set of all languages that might interest Wikimedia
  2. the size of the potential audience in each language
  3. the size of the current contributor base in each language
  4. the availability of Wikimedia content [online, snapshots, print] to speakers of those languages
  5. the extent and quality of content in those languages; the extent of localization of MediaWiki, and the various Wikimedia project customized interfaces, in those languages
  6. the importance of that language to Wikimedia (some agreed-upon prioritization, a combination of the above points, a desire for balance, and other priorities)
  7. our bridge-communicators for that language (multilingual editors, non-editing translators, other)
  8. the extent and quality of other reference works and educational content in those languages

PR and languages

From a PR (public relations) standpoint, it would be very useful to know:

  • in which languages Wikipedia is the only serious reference work available
  • in which languages Wikipedia is the largest reference work
  • how many hits / distinct visits Wikimedia receives, by language
    • in which languages Wikipedia is the most popular site in the world

Tracking translations

Translations between articles should be explicitly noted. When you translate a section from a Hindi article into a Dutch one, you shouldn't have to remember to add a link to the source article's history in the edit summary; the edit history of the target article should automatically note that some of its content derives from work by others, with a deep link to the relevant revision of the source page.
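As a rough illustration of what such an automatic note could carry, here is a minimal Python sketch. The class and function names are hypothetical, and the host name, page title and revision number in the usage line are placeholders. The only thing taken from existing practice is the `index.php?title=...&oldid=...` form of a MediaWiki link to a fixed revision; the `/w/` script path is an assumption that matches the usual Wikimedia setup.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class TranslationSource:
    """Hypothetical provenance record for one translated passage."""
    wiki_host: str     # host of the source wiki, e.g. "hi.wikipedia.org"
    title: str         # title of the source page
    revision_id: int   # revision the translation was made from

    def permalink(self) -> str:
        # Link to one fixed revision of the source page (assumes the /w/ script path).
        query = urlencode({"title": self.title, "oldid": self.revision_id})
        return f"https://{self.wiki_host}/w/index.php?{query}"


def attribution_note(source: TranslationSource, target_lang: str) -> str:
    """Build the note that software could attach to the target article's history."""
    return (
        f"Translated into '{target_lang}' from {source.wiki_host}, "
        f"page '{source.title}', revision {source.revision_id}: {source.permalink()}"
    )


# Placeholder values, for illustration only.
example = TranslationSource("hi.wikipedia.org", "Example", 12345)
print(attribution_note(example, "nl"))
```

The point of the sketch is that the translator supplies nothing by hand: everything in the note is data the software already has when the translation is saved.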

Phrase and term translations should also be tracked, ideally chunked into strings and matched up source to target in a way that can be stored as 'translation memory' and potentially reused later. One way to do this might be to integrate an open source translation-workflow package with MediaWiki.

Some of the terms in such a package could come directly from interlanguage links on various projects. Others could come from Wiktionary content.
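To make the idea of a reusable 'translation memory' concrete, here is a small Python sketch. It is not how any existing translation package works: the class name, the similarity threshold and the use of `difflib` for fuzzy matching are all assumptions chosen for brevity. The seeded entry stands in for a term pair that might be harvested from an interlanguage link.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy in-memory store of source-segment -> target-segment pairs (hypothetical)."""

    def __init__(self, source_lang, target_lang):
        self.source_lang = source_lang
        self.target_lang = target_lang
        self._pairs = {}

    def add(self, source_segment, target_segment):
        # A later translation of the same source segment replaces the earlier one.
        self._pairs[source_segment] = target_segment

    def lookup(self, segment, threshold=0.75):
        """Return (score, source, target) for the closest stored segment,
        or None if nothing reaches the threshold."""
        best = None
        for source, target in self._pairs.items():
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, source, target)
        return best


# Seed with a term pair of the kind an interlanguage link could provide.
tm = TranslationMemory("en", "nl")
tm.add("Education", "Onderwijs")
print(tm.lookup("education"))   # case-insensitive match on the stored term
print(tm.lookup("Chemistry"))   # too dissimilar to anything stored
```

A real workflow would persist the pairs and segment text more carefully; the sketch only shows the store-then-reuse cycle described above.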

Improved interlanguage links

Right now, interlanguage links a) have to be updated dozens of times when a single new language is added to the list of languages for a topic, b) may not be consistent across different languages, and c) only support a single link into any particular language.

To demonstrate c): if en:wp has an article on Education and a redirect to it from Pedagogy, and nl:wp splits the relevant content into distinct articles on both, there's no way to link to both articles from the English article. Each Dutch article could link to en:Education, but the link back to nl could only go to one of them.

A better implementation would have a separate interlanguage links table in the database (rather than storing that content as wikitext), and would display the same set of links on all language-versions of a given topic.
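One possible shape for such a table, sketched in Python with an in-memory SQLite database. The table name, column names and the `topic_id` grouping key are assumptions for illustration, not an existing MediaWiki schema, and the two Dutch titles are illustrative stand-ins for the split articles in the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE interlanguage_link (
        topic_id INTEGER NOT NULL,   -- hypothetical key grouping pages on one topic
        lang     TEXT    NOT NULL,
        title    TEXT    NOT NULL,
        PRIMARY KEY (topic_id, lang, title)
    )
""")


def add_page(topic_id, lang, title):
    """Register a page under a topic; adding a language is a single insert."""
    conn.execute(
        "INSERT OR IGNORE INTO interlanguage_link VALUES (?, ?, ?)",
        (topic_id, lang, title),
    )


def links_for(lang, title):
    """Return every other page that shares a topic with the given page."""
    return conn.execute(
        """
        SELECT DISTINCT other.lang, other.title
        FROM interlanguage_link AS here
        JOIN interlanguage_link AS other ON other.topic_id = here.topic_id
        WHERE here.lang = ? AND here.title = ?
          AND NOT (other.lang = here.lang AND other.title = here.title)
        ORDER BY other.lang, other.title
        """,
        (lang, title),
    ).fetchall()


# One English article, two Dutch articles on the same topic (illustrative titles).
add_page(1, "en", "Education")
add_page(1, "nl", "Onderwijs")
add_page(1, "nl", "Pedagogiek")
print(links_for("en", "Education"))
```

Because the primary key includes the title, a topic can hold several pages in one language, which addresses c); because every page reads from the same rows, a) and b) go away as well.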


Urgently needed

Repository for Translation Memories (TMs)

For translations of wiki content (currently IT-NAP, but also other combinations) I use OmegaT. This is a tool that can and should be used for any translation, since it allows easy context search, among other things. It therefore makes sense to exchange translation memories created by OmegaT. I already have some, and I would like to upload them somewhere so that others can use them when they translate. This would help any translation project for the MediaWiki projects. I propose sharing these translation memories as files with the .tmx extension (TMX, Translation Memory eXchange, is the standard format) and uploading them to Meta, creating an appropriate category per language combination.
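For readers who have not seen a .tmx file, the following Python sketch writes and reads a minimal one using only the standard library. It is an illustration of the general TMX 1.4 layout (a header, then translation units holding one segment per language), not of OmegaT's exact output; the function names, the tool name in the header and the it/en sample pair are assumptions.

```python
import xml.etree.ElementTree as ET

# ElementTree's spelling of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, source_lang, target_lang):
    """Build a minimal TMX 1.4 tree from (source, target) segment pairs."""
    root = ET.Element("tmx", version="1.4")
    ET.SubElement(root, "header", {
        "creationtool": "example-script",   # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "unknown",                 # original format, not tracked here
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(root, "body")
    for source_text, target_text in pairs:
        tu = ET.SubElement(body, "tu")      # one translation unit per pair
        for lang, text in ((source_lang, source_text), (target_lang, target_text)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.ElementTree(root)


def read_tmx(path):
    """Yield one {language: segment} dict per translation unit in a TMX file."""
    for tu in ET.parse(path).getroot().iter("tu"):
        yield {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")}


# Sample pair for illustration; any language combination works the same way.
tree = build_tmx([("Pagina principale", "Main page")], "it", "en")
tree.write("it-en.tmx", encoding="utf-8", xml_declaration=True)
print(list(read_tmx("it-en.tmx")))
```

Naming each file after its language combination, as in the sample, would line up with the per-combination categories proposed above.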