User:OrenBochman/I18n
This is an assignment for Simone's adoption program. You are welcome to edit this page if you notice any errors or have any additional information to add, but as a courtesy, please notify OrenBochman if you make any major changes to avoid any possible confusion between him and his adoptee(s). Thanks!
I18n
A lead paragraph: motivation, outline and basis in policy. Remember:
- what is internationalization
- where is the code
- [1]
2 concepts of language:
- user's $wgLang
- content - $wgContLang
these globals will be removed as part of the removal of global variables
Language::factory('en')
languages/ folder
General use (for developers)
Language objects
There are two ways to get a language object. You can use the globals $wgLang and $wgContLang for user interface and content language respectively. For an arbitrary language you can construct an object by using Language::factory( 'en' ), replacing en with the code of the language. The list of codes is in languages/Names.php.
Language objects are needed for doing language specific functions, most often to do number, time and date formatting, but also to construct lists and other things. There are multiple layers of caching and merging with fallback languages, but the details are irrelevant in normal use.
Using messages
MediaWiki uses a central repository of messages which are referenced by keys in the code. This is different from, for example, Gettext, which just extracts the translatable strings from the source files. The key-based system makes some things easier, like refining the original texts and tracking changes to messages. The drawback is that the list of used messages and the list of source texts for those keys can get out of sync. In practice this isn't a big problem: at worst, some messages that are no longer used stay up for translation.
To make message keys more manageable and easy to find, always write them completely and don't rely too much on creating them dynamically. You may concatenate key parts if you feel that it gives your code better structure, but put a comment nearby with a list of the possible resulting keys. For example:
// Messages that can be used here:
// * myextension-connection-success
// * myextension-connection-warning
// * myextension-connection-error
$text = wfMessage( 'myextension-connection-' . $status )->parse();
Message processing introduction
The message system in MediaWiki is quite complex, arguably too complex. One of the reasons for this is that MediaWiki is a web application. Messages can go through all kinds of processing. The three major ones covering almost all cases are:
- as-is, no processing at all
- light wiki-parsing, where parser function references starting with {{ are replaced with their results
- full wiki-parsing
The first case is for internal processing, not really for user-visible messages. Light wiki-parsing should always be combined with HTML-escaping.
Recommended ways
Longer messages that are not used hundreds of times on a page:
- OutputPage::addWikiMsg
- OutputPage::wrapWikiMsg
- wfMessage()
OutputPage methods parse messages and add them directly to the output buffer. wfMessage can be used when a message should not be added to the output buffer. ->parse() removes the enclosing HTML tags from the parsed result, usually <p>..</p>, but can generate invalid code if the parsed result has no single root tag, for example <p>..</p><p>..</p>. Usage examples:
$out->addWikiMsg( 'foobar', $language->formatNum( count( $items ) ) );
// Double quotes are needed so that \n is an actual newline; $1 stays literal.
$out->wrapWikiMsg( "<div class='baz'>\n$1\n</div>", array( 'foobar', $user->getName() ) );
$text = wfMessage( 'foobar', $language->date( $ts ) )->parse();
Other messages with light wiki-parsing can use wfMsg, or wfMessage with ->text(). wfMessage should always be used if the message has parts that depend on linguistic information, like {{PLURAL:$1}}. Do not use wfMsg or wfMsgHtml for those kinds of messages! They seem to work but are broken.
$out = Xml::submitButton( wfMsg( 'foobar' ) ); # no linguistic information
$out = Xml::label( wfMessage( 'foobar', $wgLang->formatNum( $count ) )->text() ); # uses plural on $count
Some messages have mixed escaping and parsing, most commonly when raw links that should not be escaped are used in messages. The preferred way is to use wfMessage with ->rawParams() for the affected parameters. Be especially wary of wfMsgHtml: it only escapes the message, not the parameters. This has caused at least one XSS in MediaWiki.
Short list of functions to avoid:
- wfMsgHtml (don't use unless you really want unescaped parameters)
- wfMsgWikiHtml (breaks up linguistic functions, as does wfMsg)
- OutputPage::parse and parseInline, addWikiText (if you know the message, use addWikiMsg or wrapWikiMsg)
Remember that almost all Xml:: and Html:: functions escape everything fed into them, so avoid passing already-escaped or parsed text to them.
Using messages in JavaScript
To use messages on the client side, ResourceLoader must first make them available there. To do this, list the messages to be exported to the client side in your ResourceLoader module definition.
Example:
$wgResourceModules['ext.foobar.core'] = array(
    'scripts' => array( 'resources/ext.foobar.js' ),
    'styles' => 'resources/ext.extension.css',
    'localBasePath' => $dir,
    'remoteExtPath' => 'FooBar',
    'messages' => array(
        'message-key-foo',
        'message-key-bar',
    ),
);
The messages defined in the above example, message-key-foo and message-key-bar, will be available on the client side and can be accessed with mw.msg( 'message-key-foo' ). See the example below:
$( '<a>' ).prop( 'href', '#' ).text( mw.msg( 'message-key-foo') );
We can also pass dynamic parameters to the message (i.e. the values for $1, $2, etc.) as shown below.
$( '<a>' ).prop( 'href', '#' ).text( mw.msg( 'message-key-foo', value1, value2 ) );
In the above examples, note that the message should be defined in an i18n.php file. If the message key is not found in any i18n.php file, the result of mw.msg will be the message key in angle brackets, like <message-key-foo>.
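The lookup behaviour described above can be illustrated with a small stand-in. This is only a sketch, not MediaWiki's implementation: the `messages` map, the `msg` function and the message texts are hypothetical, and it only covers $1-style parameter substitution and the angle-bracket fallback for unknown keys.

```javascript
// Hypothetical stand-in for the client-side message store; not MediaWiki code.
const messages = new Map([
  ['message-key-foo', 'Hello, $1! You have $2 new notifications.'],
  ['message-key-bar', 'Save']
]);

// Look up a key, replace $1, $2, ... with the given parameters,
// and fall back to the key in angle brackets when the key is not defined.
function msg(key, ...params) {
  if (!messages.has(key)) {
    return '<' + key + '>';
  }
  return messages.get(key).replace(/\$(\d+)/g, (placeholder, n) => {
    const index = Number(n) - 1;
    // Leave the placeholder untouched if no matching parameter was passed.
    return index >= 0 && index < params.length ? String(params[index]) : placeholder;
  });
}

console.log(msg('message-key-foo', 'Alice', 3)); // Hello, Alice! You have 3 new notifications.
console.log(msg('message-key-missing'));         // <message-key-missing>
```

In a real extension you would call mw.msg directly; the stand-in is only meant to make the fallback visible when a key was not exported to the client.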
When using localization messages, always make sure they are properly escaped, both to prevent potential HTML injection and to avoid malformed markup caused by special characters.
- If using jQuery's .html, use .text( mw.msg( ... ) ) instead of .html( mw.msg( ... ) ). jQuery will then set the element's inner text value instead of the raw HTML. This is the best option and also the fastest, because .text() goes almost straight into the browser and removes the need for escaping altogether.
- If using jQuery's .append, escape manually: .append( '<li>' + mw.message( 'example' ).escaped() + '</li>' );
- If manually building an HTML string, escape manually by creating a message object and calling .escaped() (instead of the mw.msg shortcut, which does mw.message(key).plain()): '<foo>' + mw.message( 'example' ).escaped() + '</foo>';
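To see why manual escaping matters when concatenating HTML strings, here is a minimal sketch. The `escapeHtml` helper and the sample translation are hypothetical and are not part of MediaWiki; in real code, .escaped() on a message object does this job for you.

```javascript
// Hypothetical helper, not part of MediaWiki: escape the characters
// that are significant in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// A translation that happens to contain markup-significant characters.
const translated = 'Tom & Jerry <b>rule</b>';

const unsafeItem = '<li>' + translated + '</li>';           // markup leaks into the page
const safeItem = '<li>' + escapeHtml(translated) + '</li>'; // rendered as plain text

console.log(unsafeItem); // <li>Tom & Jerry <b>rule</b></li>
console.log(safeItem);   // <li>Tom &amp; Jerry &lt;b&gt;rule&lt;/b&gt;</li>
```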
PLURAL and GENDER support in JavaScript
From MediaWiki 1.19 onwards, messages for JavaScript can contain PLURAL and GENDER directives. This feature is optional; extensions that require it should declare an additional dependency on mediawiki.jqueryMsg in the ResourceLoader module definition.
If you have a message, say 'message-key-plural-foo' => 'There {{PLURAL:$1|is|are}} $1 {{PLURAL:$1|item|items}}', you can use it in JavaScript as follows:
mw.msg( 'message-key-plural-foo', count ) ;
// returns 'There is 1 item' if count = 1
// returns 'There are 6 items' if count = 6
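As a rough illustration of what the PLURAL directive does with this message, the following sketch expands it by hand. It is not the mediawiki.jqueryMsg parser: `renderPlural` is a hypothetical function, it handles only two-form directives on $1, and it assumes the English rule (1 is singular, everything else is plural), whereas real plural rules differ per language.

```javascript
// Simplified sketch: English-only rule, two-form PLURAL directives on $1 only.
function renderPlural(template, count) {
  // First pick the right form for every {{PLURAL:$1|singular|plural}} directive...
  const expanded = template.replace(
    /\{\{PLURAL:\$1\|([^|}]*)\|([^|}]*)\}\}/gi,
    (match, singular, plural) => (count === 1 ? singular : plural)
  );
  // ...then substitute the remaining $1 placeholders with the number itself.
  return expanded.replace(/\$1/g, String(count));
}

const template = 'There {{PLURAL:$1|is|are}} $1 {{PLURAL:$1|item|items}}';

console.log(renderPlural(template, 1)); // There is 1 item
console.log(renderPlural(template, 6)); // There are 6 items
```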
If you have a message, say 'message-key-gender-foo' => '{{GENDER:$1|he|she}} created an article', you can use it in JavaScript as follows:
mw.msg( 'message-key-gender-foo', 'male' ) ; // returns 'he created an article'
mw.msg( 'message-key-gender-foo', 'female' ) ; // returns 'she created an article'
Instead of passing the gender directly, we can pass a user object, i.e. an mw.User object with a gender attribute, to mw.msg. For example, the current user object:
var user = mw.user; //current user
mw.msg( 'message-key-gender-foo', user ) ; // The message returned will be based on the gender of the current user.
If the gender passed to mw.msg is invalid or unknown, the gender-neutral form will be used, as defined for each language.
The keywords GENDER and PLURAL are case-insensitive.
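The gender selection and the neutral fallback can be sketched in the same spirit. This is an illustration under stated assumptions, not the MediaWiki parser: `renderGender` is hypothetical, and the sample message spells out a third, neutral form explicitly, which the two-form example message above does not.

```javascript
// Simplified sketch, not the MediaWiki parser. Assumes a three-form
// directive: {{GENDER:$1|male form|female form|neutral form}}.
function renderGender(template, gender) {
  return template.replace(
    /\{\{GENDER:\$1\|([^|}]*)\|([^|}]*)\|([^|}]*)\}\}/gi,
    (match, male, female, neutral) => {
      if (gender === 'male') {
        return male;
      }
      if (gender === 'female') {
        return female;
      }
      return neutral; // invalid or unknown gender
    }
  );
}

// The keyword is written in lower case here to show case-insensitive matching.
const genderTemplate = '{{gender:$1|He|She|They}} created an article';

console.log(renderGender(genderTemplate, 'male'));    // He created an article
console.log(renderGender(genderTemplate, 'female'));  // She created an article
console.log(renderGender(genderTemplate, 'unknown')); // They created an article
```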
GRAMMAR in JavaScript
From MediaWiki 1.20 onwards, messages for JavaScript can contain the GRAMMAR directive. This feature is optional; extensions that require it should declare an additional dependency on mediawiki.language.data in the ResourceLoader module definition.
The static grammar form rules can be defined in the $wgGrammarForms global. The dynamic, language-specific grammar rules in PHP have been ported to JavaScript. Once the dependency on mediawiki.language.data is added, the mw.msg method can be used as usual to parse messages containing {{GRAMMAR:N|word}}, where N is the name of the grammatical form needed and word is the word being operated on. More information about GRAMMAR is available here.
Adding new messages
- Decide a name (key) for the message. Try to follow global or local conventions for naming. For extensions, use a standard prefix, preferably the extension name in lower case, followed by a hyphen ("-"). Try to stick to lower case letters, numbers and dashes in message names; most other characters range from less practical to not working at all. See also Manual:Coding conventions#Messages.
- Make sure that you are using suitable handling for the message (parsing, {{-replacement, escaping for HTML, etc.).
- Add it to languages/messages/MessagesEn.php (core) or your extension's i18n file under 'en'.
- Take a pause and consider the wording of the message. Is it as clear as possible? Can it be misunderstood? Ask for comments from other developers or from localizers if possible. Follow the #internationalization hints.
- Add documentation to MessagesQqq.php or your extension's i18n file under 'qqq'. Read more about #message documentation.
- If you added a message to core, also add the message key to maintenance/language/messages.inc (and add the section if you created a new one). This file defines the order and formatting of messages in all message files.
Removing existing messages
- Remove it from MessagesEn.php. Don't bother with other languages; updates from translatewiki.net will handle those automatically.
- Remove it from maintenance/language/messages.inc.
The second step is not needed for extensions, so there you only have to remove your English language messages from ExtensionName.i18n.php.
Changing existing messages
- Consider updating the message documentation (see Adding new messages).
- Change the message key if old translations are not suitable for the new meaning. This also includes changes in message handling (parsing, escaping). If in doubt, ask in #mediawiki-i18n or in the Support page at translatewiki.net.
- If the extension is supported by translatewiki, please only change the English source message and/or key. If needed, the internationalisation and localisation team will take care of updating the translations, marking them as outdated, cleaning up the file or renaming keys where possible. This also applies when you're only changing things like HTML tags that you could change in other languages without speaking those languages. Most of these actions will take place in translatewiki.net and will reach Git or Subversion with about one day of delay.
Localizing namespaces and special page aliases
Namespaces and special page names (e.g. RecentChanges in Special:RecentChanges) are also translatable.
Namespaces
To allow custom namespaces introduced by your extension to be translated, create a MyExtension.namespaces.php file that looks like this:
<?php
/**
* Translations of the namespaces introduced by MyExtension.
*
* @file
*/
$namespaceNames = array();
// For wikis where the MyExtension extension is not installed.
if( !defined( 'NS_MYEXTENSION' ) ) {
define( 'NS_MYEXTENSION', 2510 );
}
if( !defined( 'NS_MYEXTENSION_TALK' ) ) {
define( 'NS_MYEXTENSION_TALK', 2511 );
}
/** English */
$namespaceNames['en'] = array(
NS_MYEXTENSION => 'MyNamespace',
NS_MYEXTENSION_TALK => 'MyNamespace_talk',
);
/** Finnish (Suomi) */
$namespaceNames['fi'] = array(
NS_MYEXTENSION => 'Nimiavaruuteni',
NS_MYEXTENSION_TALK => 'Keskustelu_nimiavaruudestani',
);
Then load the namespace translation file in MyExtension.php via $wgExtensionMessagesFiles['MyExtensionNamespaces'] = dirname( __FILE__ ) . '/MyExtension.namespaces.php';
When a user installs MyExtension on their Finnish (fi) wiki, the custom namespace will be translated into Finnish magically, and the user doesn't need to do a thing!
Special page aliases
Create a new file for the special page aliases in this format:
<?php
/**
* Aliases for the MyExtension extension.
*
* @file
* @ingroup Extensions
*/
$aliases = array();
/** English */
$aliases['en'] = array(
'MyExtension' => array( 'MyExtension' )
);
/** Finnish (Suomi) */
$aliases['fi'] = array(
'MyExtension' => array( 'Lisäosani' )
);
Then load it in the extension's setup file like this: $wgExtensionAliasesFiles['MyExtension'] = dirname( __FILE__ ) . '/MyExtension.alias.php';
When your special page code uses either SpecialPage::getTitleFor( 'MyExtension' ) or $this->getTitle() (in the class that provides Special:MyExtension), the localized alias will be used, if it's available.
Simulated article
QUESTION
Question text
HINT
SOLUTION
solution text
Test yourself
QUESTION
Question text
HINT
SOLUTION
solution text
Discussion
Any questions or would you like to take the test?