WordPress Developers: Test your i18n (internationalization) knowledge!

Alex Kirk lives in Austria and is a developer on the i18n (internationalization) team at Automattic. We’re looking for talented people wherever they live. Why not join our team?

Whenever we write plugins or themes, there is one thing that needs a little extra attention and is quite frankly hard to get right: Translatable text.

Be it a button or some explanatory text, you will generally want to make that text translatable into other languages, so that even more people can use your piece of software. While there is a very extensive guide available in the WordPress Handbook, we have created a fun way to brush up your knowledge on how to get things right: a quiz.

If you’re reading this post via a feed reader or an e-mail subscription, we encourage you to view the post on our developers blog to take the test (there are no winners or losers, this is meant to help you learn!), as it uses a little JavaScript to tell you whether an answer is right or wrong.

For each answer, we also provide an explanation, whether it’s right or wrong. So after clicking the answer that you think is right, make sure to click the other ones to explore what might be wrong about them.

So without further ado, take the quiz below!

You want to output the username in a sentence. Assume that the $username has been escaped using esc_html(). How do you do that?
<?php printf( __( 'Howdy, %s!' ), $username ); ?>
Good! Some languages may need to switch the location of the username to the front of this string. This code provides needed flexibility by including both the placeholder and the punctuation mark. Check the other answers though, there is an even better answer.
<?php /* translators: %s is a username */ printf( __( 'Howdy, %s!' ), $username ); ?>
Awesome, the comment for translators is the cherry on the cake, as they cannot see variable names. Some languages may need to switch the location of the username to the front of this string. This code provides needed flexibility by including both the placeholder and the punctuation mark.
<?php printf( __( 'Howdy, %s' ), $username ); ?>!
This is almost correct. The punctuation mark should be included in the translatable string.
<?php echo __( 'Howdy' ) . ', ' . $username; ?>!
Translators may need to put the username first in other languages. That’s not possible with this code because it isn’t using a placeholder and a function that does substitution such as printf.
<?php _e( 'Howdy, %s!', $username ); ?>
The _e() function can only output text. It does not substitute variables.
<?php _e( "Howdy, $username!" ); ?>
Variables in a string are a no-no because the translation is looked up by the original English text, which therefore needs to be identical for every possible output.
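To illustrate why an interpolated variable defeats the lookup, here is a minimal sketch. The `$translations` array and the `lookup()` helper are hypothetical stand-ins for a loaded translation file and for `__()`; they are not how WordPress is implemented, only an approximation of the idea that the English text acts as the key.

```php
<?php
// Hypothetical stand-in for a loaded translation file: original English => German.
$translations = array( 'Howdy, %s!' => 'Hallo, %s!' );

// Hypothetical stand-in for __(): return the translation if the exact
// English string is known, otherwise fall back to the English string.
function lookup( array $translations, string $text ): string {
    return $translations[ $text ] ?? $text;
}

$username = 'alex';

// The key is the constant string 'Howdy, %s!', so the lookup succeeds.
$good = sprintf( lookup( $translations, 'Howdy, %s!' ), $username );

// The key is 'Howdy, alex!', which no translation file can contain.
$bad = lookup( $translations, "Howdy, $username!" );

echo $good, "\n"; // Hallo, alex!
echo $bad, "\n";  // Howdy, alex!
```

With the placeholder, every username shares one translatable string. With interpolation, each username produces a different key and the text silently stays in English.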
You need to include a link in a sentence. How can you do that?
printf( __( 'Publish something using our <a href="%s">Post by Email</a> feature.' ), 'http://support.wordpress.com/post-by-email/' );
Correct. Embed HTML in the string when it is necessary to keep the sentence structure intact for translators. Some examples would be links or bold/italics around a mid-sentence word.
_e( 'Publish something using our <a href="http://support.wordpress.com/post-by-email/">Post by Email</a> feature.' );
We don’t want to include URLs in the translation because we don’t want to expose them as translatable to translators. Also, if the URL is hardcoded within the string and then we ever change it, the entire string will become a new translation which will require re-translation.
printf( __( 'Publish something using our %s feature.' ), sprintf( '<a href="http://support.wordpress.com/post-by-email/">%s</a>', __( 'Post by Email' ) ) );
This code breaks the sentence up which causes a loss of full context during translation. We always try to keep full sentences/phrases together because having the whole string leads to much better translations.
Which of these is the correct way to use the single/plural _n() function?
printf( _n( '%d person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
Correct. Always use a placeholder in both singular and plural strings.
printf( _n( 'One person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
The hardcoded “One” in the singular string is problematic. We always want to use a placeholder in both singular and plural strings. Some languages (such as Russian) have multiple plurals which require the flexibility provided by using the placeholder in the singular string.
“So and so many people have seen this post” should be output like this:
printf( _n( '%d person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
Correct. We use the variable twice: 1) we need the number for the _n() function to determine the correct singular/plural text and 2) we need the number for the subsequent substitution in the printf. Also, it’s very important that the %d placeholder is used in the singular string (and not a hardcoded “1”) because some languages, such as Russian, have multiple plural forms. Those languages rely on that flexibility in the singular string.
printf( __( '%d people have seen this post.' ), $view_count );
For strings like this containing a numerical count, we want to use _n() instead because we always need to include the singular form of the string–even if the singular case should never happen. Why? Some languages, such as Russian, have multiple plural forms and they rely on flexibility provided by the singular string.
printf( _n( '%d person has seen this post.', '%d people have seen this post.' ), $view_count );
Almost. The _n() function also needs to know about the count value via its third parameter so it can determine the correct text.
printf( 1 == $view_count ? __( '%d person has seen this post.' ) : __( '%d people have seen this post.' ), $view_count );
Some languages have multiple plural forms–not just the typical singular/plural distinction–so this approach is problematic. We need to use _n() instead as it accounts for those multiple plural form complexities.
echo _n( 'One person has seen this post', "$view_count people have seen this post." );
Several things are amiss here. First, the hardcoded “One” needs to be a %d placeholder because some languages have multiple plural forms–not just the typical singular/plural distinction–and _n() with proper placeholdering handles that. The second issue is that $view_count needs to be a %d placeholder as well. Finally, all the above means that we need to switch the echo to a printf to use the placeholders and we’ll also want to add $view_count as a third argument to _n() as it expects a count value to determine which string to use.
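The point about multiple plural forms can be made concrete with a small sketch. The function name `russian_plural_index()` is made up for this illustration, the rule is the one commonly declared for Russian in gettext catalogs, and the three Russian forms are for a simpler string ('%d view' / '%d views') rather than the sentence used in the quiz.

```php
<?php
// Plural rule for Russian as commonly declared in gettext catalogs:
// form 0 for 1, 21, 31, ...; form 1 for 2-4, 22-24, ...; form 2 otherwise.
function russian_plural_index( int $n ): int {
    if ( $n % 10 === 1 && $n % 100 !== 11 ) {
        return 0;
    }
    if ( $n % 10 >= 2 && $n % 10 <= 4 && ( $n % 100 < 10 || $n % 100 >= 20 ) ) {
        return 1;
    }
    return 2;
}

// Illustrative Russian forms of '%d view' / '%d views'.
$forms = array( '%d просмотр', '%d просмотра', '%d просмотров' );

foreach ( array( 1, 3, 5, 11, 21 ) as $view_count ) {
    echo sprintf( $forms[ russian_plural_index( $view_count ) ], $view_count ), "\n";
}
```

Note that 21 selects the same form as 1. If that form contained a hardcoded “One” instead of %d, a post with 21 views would be reported as having one.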
How do you deal with outputting a variable in the context of a translation?
<h1><?php printf( __( 'Hello %s' ), esc_html( $world ) ); ?></h1>
Correct. Here PHP 1) swaps in the translated string which also contains the %s placeholder, 2) escapes the $world var safely, and then 3) substitutes the now escaped $world value into the placeholder spot. Exactly what we want.
One reminder, though: if you use this piece of code you need to be sure that you have verified your translations, so that your translation of Hello %s doesn’t include malicious code. If you don’t trust your translations, you should use an esc_html( sprintf() ) construction instead of the printf.
<h1><?php printf( esc_html__( 'Hello %s' ), $world ); ?></h1>
This code is unsafe because it isn’t escaping $world at all. PHP runs esc_html__ first which swaps in the translated string (eg, "Hola %s") and then escapes it. Unfortunately, after that, printf swaps the value of $world into the placeholder which is unescaped. Danger, Will Robinson, danger!
<h1><?php echo esc_html__( sprintf( 'Hello %s', $world ) ); ?></h1>
We never want a sprintf inside a translation function. Translation files are generated by a cron job that parses (not executes!) PHP files looking for the translation functions. sprintf isn’t resolved when that parsing happens, which means this code will just be garbage translation data.
<h1><?php esc_html_e( 'Hello %s', $world ); ?></h1>
The second parameter of esc_html_e() is the text domain, not a value to substitute. We need printf here to do the variable substitution.
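The esc_html( sprintf() ) construction mentioned above for untrusted translations could look roughly like the sketch below. The two functions defined at the top are simplified stand-ins so the snippet runs outside WordPress; inside WordPress the real `__()` and `esc_html()` would be used, and the stand-in `__()` here deliberately returns a hostile "translation" to show the effect.

```php
<?php
// Stand-ins so the sketch runs outside WordPress (assumptions, not the real implementations).
if ( ! function_exists( '__' ) ) {
    function __( $text, $domain = 'default' ) {
        // Simulates an untrusted translation that smuggles in markup.
        return '<script>alert(1)</script>Hola %s';
    }
}
if ( ! function_exists( 'esc_html' ) ) {
    function esc_html( $text ) {
        return htmlspecialchars( (string) $text, ENT_QUOTES, 'UTF-8' );
    }
}

$world = '<b>World</b>';

// Substitute first, then escape the complete result, so both the
// translation and the variable pass through esc_html().
$safe = esc_html( sprintf( __( 'Hello %s' ), $world ) );

echo '<h1>' . $safe . '</h1>', "\n";
```

The trade-off is that translators can then no longer include any intentional HTML in the string, since everything is escaped.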
What’s the best practice to include formatted numbers in strings?
printf( _n( 'Today you already got %s view.', 'Today you already got %s views.', $view_count ), number_format_i18n( $view_count ) );
Correct. Use _n() for the possibly singular/plural string and use number_format_i18n() to actually format the number to local rules (for example some locales have a different thousand separator). We do indeed use %s here for the number because number_format_i18n() returns a formatted string.
$views = number_format( $view_count );
printf( _n( 'Today you already got %d view.', 'Today you already got %d views.' ), $views );
There are a few problems here. We want to be using number_format_i18n(). Also, number_format_i18n() produces strings, not numbers, so we need to use %s. Finally, in addition to printf, we need to give the count number to the _n() function so it knows which string variant to use.
_en_fmt( 'Today you already got %d view.', 'Today you already got %d views.', $views );
Arrowed! There isn’t a _en_fmt() function.
How do you deal with multiple variables in a translated string?
printf( __( 'Posted on %1$s by %2$s.' ), $date, $username );
Almost correct. The placeholders are numbered so their values can be re-arranged if need be in translations. The remaining problem, though: translators don’t see the variable names, therefore they can only guess that one variable is a date and the other one is a username.
/* translators: %1$s is a date, %2$s is a username */
printf( __( 'Posted on %1$s by %2$s.' ), $date, $username );
Perfect. We make sure to number our placeholders so their values can be re-arranged if need be in translations. Also we give additional info to translators so that they can know which variable means what.
printf( __( 'Posted on %(date)s by %(username)s.' ), $date, $username );
Good thinking, but this syntax unfortunately is not available in PHP.
printf( __( 'Posted on %s by %s.' ), $date, $username );
We want to make sure we use numbered placeholders (ie, %1$s, %2$s, etc) whenever there is more than one placeholder because translators may need to re-arrange their locations in their translations.
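As a small illustration of why numbering matters, the sketch below feeds the same arguments to three format strings. The "translation" is a hypothetical English rewording that stands in for a language needing the username first; the sample date and username are made up.

```php
<?php
$date     = 'March 1';
$username = 'alex';

// Original English string with numbered placeholders.
$original = 'Posted on %1$s by %2$s.';

// Hypothetical translation that needs the username first.
$reordered = '%2$s posted this on %1$s.';

$english_output   = sprintf( $original, $date, $username );
$reordered_output = sprintf( $reordered, $date, $username );

// Unnumbered placeholders are always filled left to right, so a
// translator who moves them around gets the values swapped.
$unnumbered_output = sprintf( '%s posted this on %s.', $date, $username );

echo $english_output, "\n";    // Posted on March 1 by alex.
echo $reordered_output, "\n";  // alex posted this on March 1.
echo $unnumbered_output, "\n"; // March 1 posted this on alex.
```

The calling code never changes; only the numbered placeholders let the translated string decide where each value lands.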
Which of these is correct?
switch ( $type ) {
    case 'date':
        printf( __( 'Sorted by date' ) );
        break;
    case 'comments':
        printf( __( 'Sorted by comments' ) );
        break;
}
Correct. We want to give translators full sentences/phrases.
switch ( $type ) {
    case 'date':
        printf( __( 'Sorted by %s.' ), __( 'date' ) );
        break;
    case 'comments':
        printf( __( 'Sorted by %s.' ), __( 'comments' ) );
        break;
}
Unnecessarily breaking up sentences/phrases is a problem for translators. “Date” by itself may be translated differently from when it is used in a sentence, so we want to keep complete sentences/phrases together whenever possible.
$pattern = __( 'Sorted by %s.' );
switch ( $type ) {
    case 'date':
        printf( $pattern, __( 'date' ) );
        break;
    case 'comments':
        printf( $pattern, __( 'comments' ) );
        break;
}
This looks efficient but unfortunately it’s wrong: essentially this is a concatenation of strings, which can’t be done in translations, because a generic translation of “date” might be wrong in the context of sorting, or it might need to be in another grammatical case, among other reasons. In short: don’t do that.
printf( __( 'Sorted by %s.' ), __( $type ) );
The code here won’t work because translation functions cannot be fed PHP variables. Translation files are generated by a cron job that parses (not executes!) PHP files looking for the translation functions. It doesn’t execute any of the PHP so the variable is unresolved which leads to garbage translation data (actually, the parsing just rejects it).
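To get a feel for "parses, not executes", here is a toy extractor. It is only an assumption-laden sketch: the function name `extract_literal_strings()` is invented, and a single regular expression is far cruder than the real tooling, but it shows that only literal strings can be collected from source text.

```php
<?php
// Source code to scan, held as plain text (nothing in it is executed).
$source = <<<'PHP'
printf( __( 'Sorted by date' ) );
printf( __( 'Sorted by %s.' ), __( $type ) );
PHP;

// Toy extractor: collect single-quoted literals passed to __().
// A variable such as $type is not a literal, so it never matches.
function extract_literal_strings( string $source ): array {
    preg_match_all( "/__\(\s*'([^']*)'\s*\)/", $source, $matches );
    return $matches[1];
}

$strings = extract_literal_strings( $source );

foreach ( $strings as $string ) {
    echo $string, "\n";
}
```

The scan finds 'Sorted by date' and 'Sorted by %s.', while whatever `$type` would hold at runtime is invisible to it, so translators never get a chance to translate it.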

Comments

9 responses to “WordPress Developers: Test your i18n (internationalization) knowledge!”

  1. Nice one Alex! I actually got quite a few of them wrong. 🙂 I had a question about the first one where you use a comment /* translaters: */. Is there a reason that this is preferred over using _x()?

    1. Thanks! With _x() a context is specified. The difference between context and a comment is this:

      A context adds information about the string itself (for example “verb”) or the area where the string is found (for example “column caption”).

      A comment like /* translators: */ (mind the spelling) is meant to provide help with translating the string, for example, as we showed, by explaining what the variables stand for. Think of it as a personal message to translators.

      I do see the context used a lot for explaining variables but in general the comment is a better place to do that.

      1. Makes sense! Thanks Alex.

  2. Your correct answer to the second question means that the link will always go to the English URL even if a translated version of the link’s destination exists. Exposing the URL to translation at least gives the option to change it.

    1. Thanks for your input. It’s a good point to make.

      You are right, the code as it stands there does what you are saying. But this is not where it ends. It rather gives the developer the option to link to the correct page through code (think getSupportUrlByLocale( 'post-by-email', $locale ) instead of the hardcoded string of the URL). A translator can’t and shouldn’t need to know about the correct URL for the language they are translating to.

  3. Another way to include links in a sentence is to not include any HTML tags at all, but rather use placeholders, like this:


    printf( __( 'Please %supdate%s WordPress' ), '<a href="http://..." rel="nofollow">', '</a>' );

    This avoids any issues caused by translators accidentally messing up the HTML tags. It’s a pretty common pattern in WooCommerce, as far as I know.

    1. I’d like to discourage you from doing it like that. Messed up HTML can be easily caught through code (by comparing original with translation), while a placeholder is something very opaque. Translators are used to them being words or numbers. As a translator all you’d see is this:

      Please %supdate%s WordPress

      How would a translator know what the %s means here? It could be “your”, it could be a version number. Also, personally, I think it is very illegible to adjoin a placeholder to a word, it also makes it impossible to use a dictionary to highlight misspelled words (what’s a supdate?).

      Also, remember to number variables, so that translators can move the variables around in the string. I understand that in this case it doesn’t matter because the HTML tags will always be in that order, however I suggest to not think about whether it is necessary but make it a habit to always number variables like this: %1$s.

      1. I completely agree on numbering variables. Also, I do realise that highlighting misspelled words will take a hit with this approach. However, it’s quite trivial to indicate what a variable represents – simply use translator comments:


        // Translators: %1$s - <a> opening tag, %2$s - </a> closing tag
        printf( __( 'Please %1$supdate%2$s WordPress' ), '<a href="http://..." rel="nofollow">', '</a>' );

  4. As a translator I want to emphasize this: Please don’t use %stext%s for HTML links.