WordPress Developers: Test your i18n (internationalization) knowledge!

Alex Kirk lives in Austria and is a developer on the i18n (internationalization) team at Automattic. We’re looking for talented people wherever they live —why not join our team

i8n-logo

Whenever we write plugins or themes, there is one thing that needs a little extra attention and is quite frankly hard to get right: Translatable text.

Be it a button or some explanatory text, you generally will want to make that text be translatable to other languages, so that even more people can use your piece of software. While there is a very extensive guide available in the WordPress Handbook, we have created a fun way to brush up your knowledge on how to get things right: a quiz.

If you’re reading this post via a feed reader or an e-mail subscription, we encourage you to view the post on our developers blog to take the test (there are no winners or losers, this is meant to help you learn!), as it uses a little JavaScript to tell you whether an answer is right or wrong.

For each answer, we also provide an explanation, whether it’s right or wrong. So after clicking the answer that you think is right, make sure to click the other ones to explore what might be wrong about them.

So without further ado, take the quiz below!

You want to output the username in a sentence. Assume that the $username has been escaped using esc_html(). How do you do that?
<?php printf( __( 'Howdy, %s!' ), $username ); ?>
Good! Some languages may need to switch the location of the username to the front of this string. This code provides needed flexibility by including both the placeholder and the punctuation mark. Check the other answers though, there is an even improved answer.
<?php /* translators: %s is a username */ printf( __( 'Howdy, %s!' ), $username ); ?>
Awesome, the comment for translators is the cherry on the cake, as they cannot see variable names. Some languages may need to switch the location of the username to the front of this string. This code provides needed flexibility by including both the placeholder and the punctuation mark.
<?php printf( __( 'Howdy, %s' ), $username ); ?>!
This is almost correct. The punctuation mark should be included in the translatable string.
<?php echo __( 'Howdy' ) . ', ' . $username; ?>!
Translators may need to put the username first in other languages. That’s not possible with this code because it isn’t using a placeholder and a function that does substitution such as printf.
<?php _e( 'Howdy, %s!', $username ); ?>
The _e() function can only output text. It does not substitute variables.
<?php _e( "Howdy, $username!" ); ?>
Variables in a string are a no-no because the translated text is loaded by using the original English text which needs to be the same for all possible outputs.
You need to include a link in a sentence. How can you do that?
printf( __( 'Publish something using our <a href="%s">Post by Email</a> feature.'), 'http://support.wordpress.com/post-by-email/' );
Correct. Embed HTML in the string when it is necessary to keep the sentence structure intact for translators. Some examples would be href tags or bold/italics around a mid-sentence word.
_e( 'Publish something using our <a href="http://support.wordpress.com/post-by-email/">Post by Email</a> feature.' );
We don’t want to include URLs in the translation because we don’t want to expose them as translatable to translators. Also, if the URL is hardcoded within the string and then we ever change it, the entire string will become a new translation which will require re-translation.
printf( __( 'Publish something using our %s feature.' ), sprintf( '<a href="http://support.wordpress.com/post-by-email/">%s</a>', __( 'Post by Email' ) ) );
This code breaks the sentence up which causes a loss of full context during translation. We always try to keep full sentences/phrases together because having the whole string leads to much better translations.
Which of these is the correct way to use the single/plural _n() function?
printf( _n( '%d person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
Correct. Always use a placeholder in both singular and plural strings.
printf( _n( 'One person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
The hardcoded “One” in the singular string is problematic. We always want to use a placeholder in both singular and plural strings. Some languages (such as Russian) have multiple plurals which require the flexibility provided by using the placeholder in the singular string (#).
“So and so many people have seen this post” should be output like this:
printf( _n( '%d person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
Correct. We use the variable twice: 1) we need the number for the _n() function to determine the correct singular/plural text and 2) we need the number for the subsequent substitution in the printf. Also, it’s very important that the %d placeholder is used in the singular string (and not a hardcoded “1”) because some languages, such as Russian, have multiple plural forms. Those languages rely on that flexibility in the singular string.
printf( __( '%d people have seen this post.' ), $view_count );
For strings like this containing a numerical count, we want to use _n() instead because we always need to include the singular form of the string–even if the singular case should never happen. Why? Some languages, such as Russian, have multiple plural forms and they rely on flexibility provided by the singular string.
printf( _n( '%d person has seen this post.', '%d people have seen this post.' ), $view_count );
Almost. The _n() function also needs to know about the count value via its third parameter so it can determine the correct text.
printf( 1 == $view_count ? __( '%d person has seen this post.' ) : __( '%d people have seen this post.' ), $view_count );
Some languages have multiple plural forms–not just the typical singular/plural distinction–so this approach is problematic. We need to use _n() instead as it accounts for those multiple plural form complexities.
echo _n( 'One person has seen this post', "$view_count people have seen this post." );
Several things are amiss here. First, the hardcoded “One” needs to be a %d placeholder because some languages have multiple plural forms–not just the typical singular/plural distinction–and _n() with proper placeholdering handles that. The second issue is that $view_count needs to be a %d placeholder as well. Finally, all the above means that we need to switch the echo to a printf to use the placeholders and we’ll also want to add $view_count as a third argument to _n() as it expects a count value to determine which string to use.
How do you deal with outputting a variable in the context of a translation?
<h1><?php printf( __( 'Hello %s' ), esc_html( $world ) ); ?></h1>
Correct. Here PHP 1) swaps in the translated string which also contains the %s placeholder, 2) escapes the $world var safely, and then 3) substitutes the now escaped $world value into the placeholder spot. Exactly what we want.
One reminder, though: if you use this piece of code you need to be sure that you have verified your translations, so that your translation of Hello %s doesn’t include malicious code. If you don’t trust your translations, you should use a esc_html(sprintf()) construction instead of the printf.
<h1><?php printf( esc_html__( 'Hello %s' ), $world ); ?></h1>
This code is unsafe because it isn’t escaping $world at all. PHP runs esc_html__ first which swaps in the translated string (eg, "Hola %s") and then escapes it. Unfortunately, after that, printf swaps the value of $world into the placeholder which is unescaped. Danger, Will Robinson, danger!
<h1><?php echo esc_html__( sprintf( 'Hello %s' ), $world ) ); ?></h1>
We never want a sprintf inside a translation function. Translation files are generated by a cron job that parses (not execute!) PHP files looking for the translation functions sprintf isn’t resolved when that parsing happens which means this code will just be garbage translation data.
<h1><?php esc_html_e( 'Hello %s', $world ); ?></h1>
The second parameter of esc_html_e() is for a context value. We need printf here to do the variable substitution.
What’s the best practice to include formatted numbers in strings?
printf( _n( 'Today you already got %s view.', 'Today you already got %s views.', $view_count ), number_format_i18n( $view_count ) );
Correct. Use _n() for the possibly singular/plural string and use number_format_i18n() to actually format the number to local rules (for example some locales have a different thousand separator). We do indeed use %s here for the number because number_format_i18n() returns a formatted string.
$views = number_format( $view_count );
printf( _n( 'Today you already got %d view.', 'Today you already got %d views.' ), $views );
There are a few problems here. We want to be using number_format_i18n(). Also, number_format_i18n() produces strings, not numbers, so we need to use %s. Finally, in addition to printf, we need to give the count number to the _n() function so it knows which string variant to use.
_en_fmt( 'Today you already got %d view.', 'Today you already got %d views.', $views );
Arrowed! There isn’t a _en_fmt() function.
How to deal with multiple variables in a translated string?
printf( __( 'Posted on %1$s by %2$s.' ), $date, $username );
Almost correct. The placeholders are numbered so their values can be re-arranged if need be in translations. The remaining problem, though: translators don’t see the variable names, therefore they can only guess that the one variable is a date and the otherone is a username.
/* translators: %1$s is a date, %2$s is a username */
printf( __( 'Posted on %1$s by %2$s.' ), $date, $username );
Perfect. We make sure to number our placeholders so their values can be re-arranged if need be in translations. Also we give additional info to translators so that they can know which variable means what.
printf( __( 'Posted on %(date)s by %(username)s.' ), $date, $username );
Good thinking, but this syntax unfortunately is not available in PHP.
printf( __( 'Posted on %s by %s.' ), $date, $username );
We want to make sure we use numbered placeholders (ie, %1$s, %2$s, etc) whenever there is more than one placeholder because translators may need to re-arrange their locations in their translations.
Which of these is correct?
switch ( $type ) {
    case 'date':
        printf( __( 'Sorted by date' ) );
        break;
    case 'comments':
        printf( __( 'Sorted by comments' ) );
        break;
}
Correct. We want to give translators full sentences/phrases.
switch ( $type ) {
    case 'date':
        printf( __( 'Sorted by %s.' ), __( 'date' ) );
        break;
    case 'comments':
        printf( __( 'Sorted by %s.' ), __( 'comments' ) );
        break;
}
Unnecessarily breaking up sentences/phrases is a problem for translators. “Date” by itself may be translated differently from when it is used in a sentence, so we want to keep complete sentences/phrases together whenever possible.
$pattern = __( 'Sorted by %s.' );
switch ( $type ) {
    case 'date':
        printf( $pattern, __( 'date' ) );
        break;
    case 'comments':
        printf( $pattern, __( 'comments' ) );
        break;
}
This looks so efficient but unfortunately it’s wrong: essentially this is a concatenation of strings, which can’t be done in translations, because a generic translation of “date” might be wrong in the context of sorting. Or it would need to be in another grammatical case. Or other reasons. Short: don’t do that.
printf( __( 'Sorted by %s.' ), __( $type ) );
The code here won’t work because translation functions cannot be fed PHP variables. Translation files are generated by a cron job that parses (not execute!) PHP files looking for the translation functions. It doesn’t execute any of the PHP so the variable is unresolved which leads to garbage translation data (actually, the parsing just rejects it).

On API Correctness

Developing APIs is hard.

You pour your blood, sweat, and tears into this interface that bares the soul of your company and of your product to the world. The machinery under the hood, though, is often a lot less polished than the fancy paint job would lead the rest of the world to believe. You have to be careful, then, not to inflict your own rough edges on the people you expect to be consuming your API because…

Using APIs is hard.

As an app developer you’re trying to take someone else’s product and somehow integrate it into whatever vision you have in your head. Whether it’s simply getting a list of things from another service (such as embedding a reading list) or wrapping your entire product around another product (using Amazon S3 as your primary binary storage mechanism, for example), you have a lot of things to reconcile.

You have your own programming language (or languages) that you’re using. There’s the use case you have in mind, and the ones the remote devs had in mind for the API. There’s the programming language they used to create the API (and that they used to test it). Finally, don’t forget the encoding or representation of the data — and its limitations. Reconciling all of the slight (or major) differences between these elements is a real challenge sometimes. Despite years of attempts at best practices and industry standards, things just don’t always fit together like we pretend that they will.

As a developer providing an API it’s important to remember three things. There are obviously many other things to consider, but these three things are more universal than most.

#1 You want people to use your API.

Unless you’re developing a completely internal API, you’re hoping that the world sees your API as something amazing, and that your functionality starts popping up in other magical places without any further effort on your part.

#2 You have no control over what tools others are using.

Are you using a language that has little or no variable type enforcement? Some people aren’t. Some of those people still want to use your product. Did you come up with your own way of doing things with custom code instead of using widely-adopted industry standards (which, being widely deployed, come with battle-tested libraries in many languages)? Did that cause you to release a client in your own language (how about Clojure, how about Erlang, how about C++, how about Perl, how about…)?

#3 Your API is a promise.

It’s easy to forget (especially for those of us who spend our time in a forgiving language such as PHP or Python) that the API we provide is a promise to the rest of the world. What it promises is this: “When you provide me with ${this} I will provide you with ${that}”.

The super-important (and insidiously non-obvious) thing about this is that if you do not provide a written promise (in the form of your API’s documentation), then the behavior of your API becomes the implicit promise.

The most important thing to note here is that when your documentation is wrong, the promise of your actual behavior wins every single time.

Keep your promises

When your promises don’t match your actual results things get hairy.

Let’s take a look at a completely hypothetical situation.

  1. You have an API that is documented to return a json object with a success member which should be a boolean value.
  2. You have a case (maybe all cases) where success is actually rendered as an integer (0 for false, 1 for true).
  3. John has an app written in a strongly-typed language that works around this by defining success as an integer type instead of a boolean type. Because John was busy, he never got around to letting you know. Or maybe John never knew because he simply inspected your API and worked backwards from the responses that you gave. Now John’s app has 100k users depending on this functionality.
  4. Mary is writing an app, and because Mary doesn’t like to play fast and loose (and she doesn’t want her app to break later on) she submits an issue pointing out that you are returning the wrong type.

At this point you are trapped. The existing user base (and by extension their user base) is committed to integers. And you only have four options.

  1. You can cripple an existing and deployed application enjoyed by 100k users.
  2. You can version your API — an entire new version to correct what should be a boolean value.
  3. You can work with John to roll out a new version of the app which can handle both (but maybe his app is in the iOS app store, and getting everyone to update is impossible, takes a long time, and/or would require a lengthy, and potentially costly, review process by yet another party).
  4. You make a really sad face and change your promise — to reflect that you are going to do what is actually the less correct thing, forever.

Because you wrote an API whose promise was wrong, or whose promise was missing, you have painted yourself into a very undesirable corner. You’re now in a place where doing the right thing for the right reasons is the wrong move.

So do yourself, and everyone else, one of two favors — depending on the position in which you find yourself.

If you’re producing an API, take extra care to make sure that your results match your documentation (and you need to have documentation).

If you’re consuming an API, don’t be like John. Don’t work backwards from the data — work forwards from the docs. And if the docs are wrong you should submit a ticket and wait for it to be fixed (or at the very, very least, make sure your workaround deals with both the documented expectation and the actual incorrect return value).

In conclusion

Just like a child, it takes a village to raise a good, decent, hard working API.

REST Development Console — now open source!

For developers working with the REST API, the browser-based API console is an essential debugging tool. It allows you to test your API queries and interactively explore the results (or errors) that the API returns.

REST API console - exploring results

It also puts the documentation at your fingertips and allows you to build a custom query right from any method’s description.

REST API console - building a query

Like the REST API itself, this tool works for any blog on WordPress.com and for any self-hosted WordPress install using Jetpack.

With the addition of implicit OAuth, we’ve released an open-source version of the API console that you can run yourself.

First, you’ll want to create a WordPress.com application (or modify an existing one) and make sure to set the Javascript Origins option. This should be the fully-qualified URL (including http:// or https:// ) of the site you’ll be running the API console on. To run it locally, just use “http://localhost”.

REST API console - JS origins setting

Then, just head on over to the GitHub repository, clone your own copy, and put your application’s info in the file config.json.

Screen Shot 2014-06-10 at 5.23.59 PM

When you’re running the console locally, you can authenticate by clicking on the box in the lower-right corner.

dev console - auth button

Once you’re linked to a blog, the blog’s ID will be shown in the lower right. You can click on it to change which blog you’re working with.

Important note: When you’ve connected to a blog, the console is hooked up to the live database — any changes you make will be reflected on that blog! You might want to create a test blog if you’re planning to make any requests that will modify content.

The REST API console is located at https://developer.wordpress.com/console/.

If you’re using the API, we’d love to hear what you think! Have you used the development console? What’s great (or not so great) about it?

Meet Sulfur — a Media Manager App Built in JavaScript

Since Automattic is a distributed company and a lot of us work from home, we hold meetups to get face-to-face interaction. The whole company meets up once a year and individual teams get together more often. One component of those meetups is a “meetup project” that we all work on together.

The team I lead — “Team I/O*” — just finished a lovely week in Reykjavik, Iceland. Our team is responsible for partnerships and our APIs.

We spent the first day releasing better JavaScript support for our APIs. After that we decided to make an example app, mainly focusing on the new CORS support and implicit OAuth system.

We decided to build a media manager purely in the browser. We picked a codename and Sulfur was born.

Update: Check out the live demo!

Screen Shot 2014-05-20 at 5.18.13 PM

Sulfur is an app built using Backbone, Underscore.js, Plupload, jQuery, MomentJS, Bootstrap, RequireJS, and the WordPress.com JSON REST API.

It shows how you can use Implicit OAuth to do authentication without a server component. It also provides examples for using the API with Backbone and third party libraries like plupload for uploading media.

Sulfur allows you to upload images, view your entire media library contents, view meta data, and delete images. All the code is open source. Check out Sulfur on GitHub.

We learned a lot while building the app and hope to use that knowledge to improve our APIs both for internal and external use.

Screen Shot 2014-05-20 at 9.11.02 PM

Let us know what you think in the comments!

*Named after one of the moons of Jupiter and our love of APIs (input/output)

Querying Posts Without query_posts

Here at WordPress.com, we have over 200 themes (and even more plugins) running inside the biggest WordPress installation around (that we know of anyway!) With all of that code churning around our over 2,000 servers worldwide, there’s one particular WordPress function that we actually try to shy away from; query_posts()

If you think you need to use it, there is most likely a better approach. query_posts() doesn’t do what most of us probably think it does.

We think that it:

  • Resets the main query loop.
  • Resets the main post global.

But it actually:

  • Creates a new WP_Query object with whatever parameters you set.
  • Replaces the existing main query loop with a new one (that is no longer the main query)

Confused yet? It’s okay if you are, thousands of others are, too.

This is what query_posts actually looks like:

/**
 * Set up The Loop with query parameters.
 *
 * This will override the current WordPress Loop and shouldn't be used more than
 * once. This must not be used within the WordPress Loop.
 *
 * @since 1.5.0
 * @uses $wp_query
 *
 * @param string $query
 * @return array List of posts
 */
function &query_posts($query) {
	unset($GLOBALS['wp_query']);
	$GLOBALS['wp_query'] = new WP_Query();
	return $GLOBALS['wp_query']->query($query);
}

Rarely, if ever, should anyone need to do this. The most commonly used scenario is a theme that has featured posts that appear visually before the main content area. Below is a screen-grab of the iTheme2 theme for reference.

The thing to keep in mind, is by the time the theme is starting to display the featured posts, WordPress has already:

  • looked at the URL…
  • parsed out what posts fit the pattern…
  • retrieved those posts from the database (or cache)…
  • Filled the $wp_query and $post globals in PHP.

Let’s think about it like this:

The “Main Loop” consists of 3 globals, 2 of which actually matter.

  • $wp_the_query (does not matter)
  • $wp_query (matters)
  • $post (matters)

The reason $wp_the_query doesn’t matter is because you’ll *never* directly touch it, nor should you try. It’s designed to be the default main query regardless of how poisoned the $wp_query and $post globals might become.

Back to Featured Posts

When you want to query the database to get those featured posts, we all know it’s time to make a new WP_Query and loop through them, like so…

$featured_args = array(
	'post__in' => get_option( 'sticky_posts' ),
	'post_status' => 'publish',
	'no_found_rows' => true
);

// The Featured Posts query.
$featured = new WP_Query( $featured_args );

// Proceed only if published posts with thumbnails exist
if ( $featured->have_posts() ) {
	while ( $featured->have_posts() ) {
		$featured->the_post();
		if ( has_post_thumbnail( $featured->post->ID ) ) {
			/// do stuff here
		}
	}

	// Reset the post data
	wp_reset_postdata();
}

Great! Two queries, no conflicts; all is right in the world. You are remembering to use wp_reset_postdata(), right? ;) If not, the reason you do it is because every new WP_Query replaces the $post global with whatever iteration of whatever loop you just ran. If you don’t reset it, you might end up with $post data from your featured posts query, in your main loop query. Yuck.

Remember query_posts()? Look at it again; it’s replacing $wp_query and not looking back to $wp_the_query to do it. Lame, right? It just takes whatever parameters you passed it and assumes it’s exactly what you want.

I’ll let you stew on that for a second; let’s keep going…

What if, after your featured-posts query is done and you’ve dumped out all your featured posts, you want to *exclude* any featured posts from your main loop?

Think about this…

It makes sense that you would want to use query_posts() and replace the main $wp_query loop, right? I mean, how else would you know what to exclude, if you didn’t run the featured posts query BEFORE the main loop query happened?

EXACTLY!

Paradox, and WordPress and WP_Query are designed to handle this extremely gracefully with an action called ‘pre_get_posts

Think of it as the way to convince WordPress that what it wants to do, maybe isn’t really what it wants to do. In our case, rather than querying for posts a THIRD time (main loop, featured posts, query_posts() to exclude) we can modify the main query ahead of time, exclude what we don’t want, and run the featured query as usual. Genius!

This is how we’re doing it now in the iTheme2 theme:

/**
 * Filter the home page posts, and remove any featured post ID's from it. Hooked
 * onto the 'pre_get_posts' action, this changes the parameters of the query
 * before it gets any posts.
 *
 * @global array $featured_post_id
 * @param WP_Query $query
 * @return WP_Query Possibly modified WP_query
 */
function itheme2_home_posts( $query = false ) {

	// Bail if not home, not a query, not main query, or no featured posts
	if ( ! is_home() || ! is_a( $query, 'WP_Query' ) || ! $query->is_main_query() || ! itheme2_featuring_posts() )
		return;

	// Exclude featured posts from the main query
	$query->set( 'post__not_in', itheme2_featuring_posts() );

	// Note the we aren't returning anything.
	// 'pre_get_posts' is a byref action; we're modifying the query directly.
}
add_action( 'pre_get_posts', 'itheme2_home_posts' );

/**
 * Test to see if any posts meet our conditions for featuring posts.
 * Current conditions are:
 *
 * - sticky posts
 * - with featured thumbnails
 *
 * We store the results of the loop in a transient, to prevent running this
 * extra query on every page load. The results are an array of post ID's that
 * match the result above. This gives us a quick way to loop through featured
 * posts again later without needing to query additional times later.
 */
function itheme2_featuring_posts() {
	if ( false === ( $featured_post_ids = get_transient( 'featured_post_ids' ) ) ) {

		// Proceed only if sticky posts exist.
		if ( get_option( 'sticky_posts' ) ) {

			$featured_args = array(
				'post__in'      => get_option( 'sticky_posts' ),
				'post_status'   => 'publish',
				'no_found_rows' => true
			);

			// The Featured Posts query.
			$featured = new WP_Query( $featured_args );

			// Proceed only if published posts with thumbnails exist
			if ( $featured->have_posts() ) {
				while ( $featured->have_posts() ) {
					$featured->the_post();
					if ( has_post_thumbnail( $featured->post->ID ) ) {
						$featured_post_ids[] = $featured->post->ID;
					}
				}

				set_transient( 'featured_post_ids', $featured_post_ids );
			}
		}
	}

	// Return the post ID's, either from the cache, or from the loop
	return $featured_post_ids;
}

It reads like this:

  • Filter the main query.
  • Only proceed if we’re on the home page.
  • Only proceed if our query isn’t somehow messed up.
  • Only proceed if we want to filter the main query.
  • Only proceed if we actually have featured posts.
  • Featured posts? Let’s check for stickies.
  • Query for posts if they exist
  • (At this point, WP_Query runs again, and so does our ‘pre_get_posts’ filter. Thanks to our checks above, our query for featured posts won’t get polluted by our need to exclude things.
  • Take each post ID we get, and store them in an array.
  • Save that array as a transient so we don’t keep doing this on each page load.
  • We’re done with featured posts, and back in our main query filter again.
  • In our main query, exclude the post ID’s we just got.
  • Return the modified main query variables.
  • Let WordPress handle the rest.

With a little foresight into what we want to do, we’re able to architect ourselves a nice bit of logic to avoid creating a third, potentially costly WP_Query object.

Another, more simple example

The Depo Masthead theme wants to limit the home page to only 3 posts. We already learned earlier we *don’t* want to run query_posts() since it will create a new WP_Query object we don’t need. So, what do we do?

/**
 * Modify home query to only show 3 posts
 *
 * @param WP_Query $query
 * @return WP_Query
 */
function depo_limit_home_posts_per_page( $query = '' ) {

	// Bail if not home, not a query, not main query, or no featured posts
	if ( ! is_home() || ! is_a( $query, 'WP_Query' ) || ! $query->is_main_query() )
		return;

	// Home only gets 3 posts
	$query->set( 'posts_per_page', 3 );
}
add_action( 'pre_get_posts', 'depo_limit_home_posts_per_page' );

Stop me if you’ve heard this one. We hook onto ‘pre_get_posts’ and return a modified query! Woo woo!

Themes are the most common culprit, but they aren’t alone. More often than not, we all forget to clean up after ourselves, reset posts and queries when we’re done, etc… By avoiding query_posts() all together, we can be confident our code is behaving the way we intended, and that it’s playing nicely with the plugins and themes we’re running too.