Special language codes
The language of a Wikimedia wiki can be found in the lang="..." and xml:lang="..." attributes of the <html> element of each page (or other elements for specific subcontents in multilingual pages); they are also used for styling in CSS language selectors. These language codes should generally be canonical language tags as defined by BCP 47.
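As an illustration, the following is a minimal sketch of reading those attributes with Python's standard html.parser module. The class name HtmlLangParser, the function page_language, and the sample markup are hypothetical and only serve the example; real pages carry many more attributes.

```python
from html.parser import HTMLParser


class HtmlLangParser(HTMLParser):
    """Record the lang and xml:lang attributes of the first <html> tag."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.xml_lang = None
        self._seen_html = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "html" and not self._seen_html:
            self._seen_html = True
            attributes = dict(attrs)
            self.lang = attributes.get("lang")
            self.xml_lang = attributes.get("xml:lang")


def page_language(html_text):
    """Return the page language tag, preferring lang over xml:lang."""
    parser = HtmlLangParser()
    parser.feed(html_text)
    return parser.lang or parser.xml_lang


# Hypothetical markup, shaped like the root element described above.
sample = '<html lang="gsw" xml:lang="gsw" dir="ltr"><head></head><body></body></html>'
print(page_language(sample))
```

The function returns None when no <html> element with a language attribute is found, so callers can tell "no language declared" apart from an actual tag.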
In most cases, the subdomain names that we use for projects correspond to language codes, but some exceptions remain. These are usually historical: a valid ISO 639 code (or a registered, non-deprecated BCP 47 variant code) was not yet available when the project was created. Other exceptions arose because some former ISO 639 codes were deprecated or removed, as they encompassed a group of languages that are now considered distinct.
Deprecated or removed ISO 639 codes are still considered valid in BCP 47 (where existing codes are not removed), most often as possible fallbacks for missing translations or to allow upward compatibility, even though they are no longer recommended for modern use or for newly created content. Using these codes can create unsolvable disputes in Wikimedia unless they are distinguished by distinct translations using newer codes. In some cases, early distinctions in ISO 639 have also been removed for one of several reasons: they were introduced artificially and temporarily (sometimes for non-neutral political reasons) and were not well supported by users; they unnecessarily complicated the task of translators; they too frequently required language fallbacks or automatic transliterators (once a reliable standard and orthographic conventions had been adopted by most users of different script variants); or education had improved mutual understanding and acceptance of multiple variants in vernacular use.
Subdomains that do not match their lang attribute
Subdomain | Language | Project(s) | Notes |
---|---|---|---|
als | Local name: Alemannisch; English name: Alemannic; Language family: Germanic | Wikipedia, Wiktionary, Wikibooks, Wikiquote | Uses gsw which matches the language's ISO 639-3 code. |
bh | Local name: भोजपुरी; English name: Bihari; Language family: Indo-Aryan | Wikipedia | Ambiguous legacy code. Uses |
roa-rup | Local name: armãneashti; English name: Aromanian; Language family: Italic | Wikipedia, Wiktionary | Uses rup which matches the language's ISO 639-3 code. |
simple | Local name: Simple English; English name: Simple English; Language family: Germanic | Wikipedia, Wiktionary | Uses en of ordinary English. |
zh-classical | Local name: 文言; English name: Classical Chinese; Language family: Sinitic | Wikipedia | Classical Chinese has ISO 639-3 code lzh. |
zh-min-nan | Local name: 閩南語 / Bân-lâm-gú; English name: Minnan; Language family: Sinitic | Wikipedia, Wiktionary, Wikibooks, Wikiquote, Wikisource | Min Nan has ISO 639-3 code nan. |
zh-yue | Local name: 粵語; English name: Cantonese; Language family: Sinitic | Wikipedia | Cantonese has ISO 639-3 code yue. |
Miscellaneous:
- All subdomains of wikimedia.org
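
The mismatches listed in this section can be captured as a small lookup table. The sketch below is only an illustration: the names SUBDOMAIN_TO_LANG and expected_lang are hypothetical, the entry for bh is left out because its note in the table is incomplete, and falling back to the subdomain itself is an assumption based on the statement that most subdomains correspond to language codes.

```python
# Subdomain -> language code, taken from the table in this section.
# "bh" is omitted because its note in the table is incomplete.
SUBDOMAIN_TO_LANG = {
    "als": "gsw",
    "roa-rup": "rup",
    "simple": "en",
    "zh-classical": "lzh",
    "zh-min-nan": "nan",
    "zh-yue": "yue",
}


def expected_lang(subdomain):
    """Return the language code for a subdomain.

    Assumption: subdomains not listed here already match their lang attribute.
    """
    return SUBDOMAIN_TO_LANG.get(subdomain, subdomain)


print(expected_lang("zh-yue"))
print(expected_lang("fr"))
```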
Subdomains that do not conform to a valid ISO 639 language code
Subdomain | Language | Project(s) | Notes |
---|---|---|---|
als | Local name: Alemannisch; English name: Alemannic; Language family: Germanic | Wikipedia, Wiktionary, Wikibooks, Wikiquote | Alemannic has ISO 639-3 code |
bat-smg | Local name: žemaitėška; English name: Samogitian; Language family: Baltic | Wikipedia | Samogitian has the ISO 639 code |
cbk-zam | Local name: Chavacano de Zamboanga; English name: Chavacano de Zamboanga; Language family: Pidgin and Creole | Wikipedia | Chavacano de Zamboanga has no ISO 639 code as an individual language. ISO 639-3 code |
eml | Local name: emiliàn e rumagnòl; English name: Emilian-Romagnol; Language family: Italic | Wikipedia | ISO 639-3 code |
fiu-vro | Local name: võro; English name: Võro; Language family: Finno-Permic | Wikipedia | Võro has ISO 639-3 code |
iu | Local name: ᐃᓄᒃᑎᑐᑦ / inuktitut; English name: Inuktitut; Language family: Eskimo-Aleut | Wikipedia | ISO 639 considers iu/iku not a single language, but a macrolanguage comprising ike and ikt. MediaWiki agrees (see phabricator), but: falls back to ike, called ike-cans; adds ike-latn; has no ikt support. CLDR considers Cans an aspirational script. |
ksh | Local name: Ripoarisch; English name: Ripuarian; Language family: Germanic | Wikipedia | ISO 639-3 code ksh is assigned to Kölsch, a subset of Ripuarian. |
map-bms | Local name: Basa Banyumasan; English name: Banyumasan; Language family: Sunda-Sulawesi | Wikipedia | Banyumasan has no ISO 639 code as an individual language. ISO 639-1 code jv/jav is assigned to Javanese, a superset of Banyumasan. |
nds-nl | Local name: Nedersaksies; English name: Dutch Low Saxon; Language family: Germanic | Wikipedia | Duplicated with Low German's nds. |
nrm | Local name: Nouormand; English name: Norman; Language family: Italic | Wikipedia | Norman has no ISO 639 code as an individual language (However, two dialects of Norman, Guernésiais and Jèrriais, are sharing ISO 639-3 code |
roa-rup | Local name: armãneashti; English name: Aromanian; Language family: Italic | Wikipedia, Wiktionary | Aromanian has ISO 639-3 code |
roa-tara | Local name: tarandíne; English name: Tarantino; Language family: Italic | Wikipedia | Tarantino has no ISO 639 code as an individual language. ISO 639-3 lumps it with Italian, as with most varieties of northern Italy. |
sh | Local name: srpskohrvatski / српскохрватски; English name: Serbo-Croatian; Language family: Slavic | Wikipedia, Wiktionary | |
simple | Local name: Simple English; English name: Simple English; Language family: Germanic | Wikipedia, Wiktionary | Simple English has no ISO 639 code but has a registered IETF variant subtag |
zh-classical | Local name: 文言; English name: Classical Chinese; Language family: Sinitic | Wikipedia | Classical Chinese has ISO 639-3 code |
zh-min-nan | Local name: 閩南語 / Bân-lâm-gú; English name: Minnan; Language family: Sinitic | Wikipedia, Wiktionary, Wikibooks, Wikiquote, Wikisource | Min Nan has ISO 639-3 code |
zh-yue | Local name: 粵語; English name: Cantonese; Language family: Sinitic | Wikipedia | Cantonese has ISO 639-3 code |
Miscellaneous:
- tokipona – defunct Wikipedia subdomain
- ru-sib – defunct Wikipedia subdomain, hoax in fictional “Siberian” language
- be-x-old – fixed and redirected to be-tarask Wikipedia subdomain (see phab:T11823)
Other distinctions
Subdomain | Language | Project(s) | Notes |
---|---|---|---|
ms | Local name: Bahasa Melayu; English name: Malay; Language family: Sunda-Sulawesi | Wikipedia, Wikibooks, Wiktionary | The Malay language used to be coded "ms", just as Indonesian is "id", but since the inception of the Malay Wikipedia, "ms" has become the code for a macrolanguage rather than an individual language. Many individual languages fall under "ms"/"msa", including Indonesian ("id"/"ind"), Banjar ("bjn"), and Minang ("min"), three living languages with their own Wikimedia projects, as well as Malay as an individual language ("mly", deprecated in 2008; "zlm", Malay; or "zsm", Standard Malay / Malaysian Malay / Malaysian language). The Malay Wikipedia, Wikibooks, and Wiktionary all predate the change in the language code on 18 February 2008; the latest of them, Malay Wikibooks, was created on 24 August 2004. See also: |
ak | Local name: ak; English name: Akan; Language family: Niger-Congo | Closed: Wikipedia, Wikibooks, Wiktionary | Akan (ak/aka in ISO 639-3) is a macrolanguage consisting of two separate languages, Twi (tw) and Fante (fat). The Akan Wikipedia was closed in April 2023 as redundant to the Twi and Fante Wikipedias. Akan Wikibooks and Wiktionary also existed but were closed in 2007/2008 because they never had any content. |
de-formal | Local name: Deutsch; English name: German; Language family: Germanic | (none) | Not used as host names but included as pseudo-variant subtags (unregistered) for some translations in translatewiki.net (used in Meta-Wiki for pages like policies when referring directly to wiki users according to their preferences): we should have used a private-use extension |
nl-informal | Local name: Nederlands; English name: Dutch; Language family: Germanic | (none) | |
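
The macrolanguage relationships mentioned in the tables on this page can be sketched as a reverse lookup. This is only an illustration: MACROLANGUAGE_MEMBERS and macrolanguage_of are hypothetical names, and the member lists contain only the codes named on this page, not the full ISO 639-3 membership.

```python
# Macrolanguage code -> member codes, limited to those named on this page.
MACROLANGUAGE_MEMBERS = {
    "iu": ["ike", "ikt"],
    "ms": ["id", "bjn", "min", "zlm", "zsm"],
    "ak": ["tw", "fat"],
}


def macrolanguage_of(code):
    """Return the macrolanguage code that lists `code` as a member, or None."""
    for macro, members in MACROLANGUAGE_MEMBERS.items():
        if code in members:
            return macro
    return None


print(macrolanguage_of("fat"))
print(macrolanguage_of("bjn"))
```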
Technical language code
The special language code qqx can be used to display the ids of all system messages used on a page.
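
One way to request a page in this code is through the uselang URL parameter of MediaWiki. The sketch below assumes that parameter and uses a hypothetical helper name, with_qqx; it only rewrites the URL and does not fetch anything.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_qqx(url):
    """Return `url` with its uselang query parameter set to qqx."""
    parts = urlsplit(url)
    # Drop any existing uselang value so the parameter appears only once.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    query.append(("uselang", "qqx"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_qqx("https://meta.wikimedia.org/wiki/Special:Version"))
```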
See also
- Language code (in MediaWiki)
- Language codes (on this Meta-Wiki)
- Chapters per Wikimedia language
- Table of Wikimedia projects
- Phabricator task T21986: “Wikis waiting to be renamed (tracking)”
- Phabricator task T44396: “Duplicate/invalid language codes in Wikidata”
- CLDR language code aliases and Wikimedia codes missing