On API Correctness

Developing APIs is hard.

You pour your blood, sweat, and tears into this interface that bares the soul of your company and of your product to the world. The machinery under the hood, though, is often a lot less polished than the fancy paint job would lead the rest of the world to believe. You have to be careful, then, not to inflict your own rough edges on the people you expect to be consuming your API because…

Using APIs is hard.

As an app developer you’re trying to take someone else’s product and somehow integrate it into whatever vision you have in your head. Whether it’s simply getting a list of things from another service (such as embedding a reading list) or wrapping your entire product around another product (using Amazon S3 as your primary binary storage mechanism, for example), you have a lot of things to reconcile.

You have your own programming language (or languages) that you’re using. There’s the use case you have in mind, and the ones the remote devs had in mind for the API. There’s the programming language they used to create the API (and that they used to test it). Finally, don’t forget the encoding or representation of the data — and its limitations. Reconciling all of the slight (or major) differences between these elements is a real challenge sometimes. Despite years of attempts at best practices and industry standards, things just don’t always fit together like we pretend that they will.

As a developer providing an API it’s important to remember three things. There are obviously many other things to consider, but these three things are more universal than most.

#1 You want people to use your API.

Unless you’re developing a completely internal API, you’re hoping that the world sees your API as something amazing, and that your functionality starts popping up in other magical places without any further effort on your part.

#2 You have no control over what tools others are using.

Are you using a language that has little or no variable type enforcement? Some people aren’t. Some of those people still want to use your product. Did you come up with your own way of doing things with custom code instead of using widely-adopted industry standards (which, being widely deployed, come with battle-tested libraries in many languages)? Did that cause you to release a client in your own language (how about Clojure, how about Erlang, how about C++, how about Perl, how about…)?

#3 Your API is a promise.

It’s easy to forget (especially for those of us who spend our time in a forgiving language such as PHP or Python) that the API we provide is a promise to the rest of the world. What it promises is this: “When you provide me with ${this} I will provide you with ${that}”.

The super-important (and insidiously non-obvious) thing about this is that if you do not provide a written promise (in the form of your API’s documentation), then the behavior of your API becomes the implicit promise.

The most important thing to note here is that when your documentation is wrong, the promise of your actual behavior wins every single time.

Keep your promises

When your promises don’t match your actual results things get hairy.

Let’s take a look at a completely hypothetical situation.

  1. You have an API that is documented to return a json object with a success member which should be a boolean value.
  2. You have a case (maybe all cases) where success is actually rendered as an integer (0 for false, 1 for true).
  3. John has an app written in a strongly-typed language that works around this by defining success as an integer type instead of a boolean type. Because John was busy, he never got around to letting you know. Or maybe John never knew because he simply inspected your API and worked backwards from the responses that you gave. Now John’s app has 100k users depending on this functionality.
  4. Mary is writing an app, and because Mary doesn’t like to play fast and loose (and she doesn’t want her app to break later on) she submits an issue pointing out that you are returning the wrong type.

At this point you are trapped. The existing user base (and by extension their user base) is committed to integers. And you only have four options.

  1. You can cripple an existing and deployed application enjoyed by 100k users.
  2. You can version your API — an entire new version to correct what should be a boolean value.
  3. You can work with John to roll out a new version of the app which can handle both (but maybe his app is in the iOS app store, and getting everyone to update is impossible, takes a long time, and/or would require a lengthy, and potentially costly, review process by yet another party).
  4. You make a really sad face and change your promise — to reflect that you are going to do what is actually the less correct thing, forever.

Because you wrote an API whose promise was wrong, or whose promise was missing, you have painted yourself into a very undesirable corner. You’re now in a place where doing the right thing for the right reasons is the wrong move.

So do yourself, and everyone else, one of two favors — depending on the position in which you find yourself.

If you’re producing an API, take extra care to make sure that your results match your documentation (and you need to have documentation).

If you’re consuming an API, don’t be like John. Don’t work backwards from the data — work forwards from the docs. And if the docs are wrong you should submit a ticket and wait for it to be fixed (or at the very, very least, make sure your workaround deals with both the documented expectation and the actual incorrect return value).

In conclusion

Just like a child, it takes a village to raise a good, decent, hard working API.

REST Development Console — now open source!

For developers working with the REST API, the browser-based API console is an essential debugging tool. It allows you to test your API queries and interactively explore the results (or errors) that the API returns.

REST API console - exploring results

It also puts the documentation at your fingertips and allows you to build a custom query right from any method’s description.

REST API console - building a query

Like the REST API itself, this tool works for any blog on WordPress.com and for any self-hosted WordPress install using Jetpack.

With the addition of implicit OAuth, we’ve released an open-source version of the API console that you can run yourself.

First, you’ll want to create a WordPress.com application (or modify an existing one) and make sure to set the Javascript Origins option. This should be the fully-qualified URL (including http:// or https:// ) of the site you’ll be running the API console on. To run it locally, just use “http://localhost”.

REST API console - JS origins setting

Then, just head on over to the GitHub repository, clone your own copy, and put your application’s info in the file config.json.

Screen Shot 2014-06-10 at 5.23.59 PM

When you’re running the console locally, you can authenticate by clicking on the box in the lower-right corner.

dev console - auth button

Once you’re linked to a blog, the blog’s ID will be shown in the lower right. You can click on it to change which blog you’re working with.

Important note: When you’ve connected to a blog, the console is hooked up to the live database — any changes you make will be reflected on that blog! You might want to create a test blog if you’re planning to make any requests that will modify content.

The REST API console is located at https://developer.wordpress.com/console/.

If you’re using the API, we’d love to hear what you think! Have you used the development console? What’s great (or not so great) about it?

Meet Sulfur — a Media Manager App Built in JavaScript

Since Automattic is a distributed company and a lot of us work from home, we hold meetups to get face-to-face interaction. The whole company meets up once a year and individual teams get together more often. One component of those meetups is a “meetup project” that we all work on together.

The team I lead — “Team I/O*” — just finished a lovely week in Reykjavik, Iceland. Our team is responsible for partnerships and our APIs.

We spent the first day releasing better JavaScript support for our APIs. After that we decided to make an example app, mainly focusing on the new CORS support and implicit OAuth system.

We decided to build a media manager purely in the browser. We picked a codename and Sulfur was born.

Update: Check out the live demo!

Screen Shot 2014-05-20 at 5.18.13 PM

Sulfur is an app built using Backbone, Underscore.js, Plupload, jQuery, MomentJS, Bootstrap, RequireJS, and the WordPress.com JSON REST API.

It shows how you can use Implicit OAuth to do authentication without a server component. It also provides examples for using the API with Backbone and third party libraries like plupload for uploading media.

Sulfur allows you to upload images, view your entire media library contents, view meta data, and delete images. All the code is open source. Check out Sulfur on GitHub.

We learned a lot while building the app and hope to use that knowledge to improve our APIs both for internal and external use.

Screen Shot 2014-05-20 at 9.11.02 PM

Let us know what you think in the comments!

*Named after one of the moons of Jupiter and our love of APIs (input/output)

Querying Posts Without query_posts

Here at WordPress.com, we have over 200 themes (and even more plugins) running inside the biggest WordPress installation around (that we know of anyway!) With all of that code churning around our over 2,000 servers worldwide, there’s one particular WordPress function that we actually try to shy away from; query_posts()

If you think you need to use it, there is most likely a better approach. query_posts() doesn’t do what most of us probably think it does.

We think that it:

  • Resets the main query loop.
  • Resets the main post global.

But it actually:

  • Creates a new WP_Query object with whatever parameters you set.
  • Replaces the existing main query loop with a new one (that is no longer the main query)

Confused yet? It’s okay if you are, thousands of others are, too.

This is what query_posts actually looks like:

/**
 * Set up The Loop with query parameters.
 *
 * This will override the current WordPress Loop and shouldn't be used more than
 * once. This must not be used within the WordPress Loop.
 *
 * @since 1.5.0
 * @uses $wp_query
 *
 * @param string $query
 * @return array List of posts
 */
function &query_posts($query) {
	unset($GLOBALS['wp_query']);
	$GLOBALS['wp_query'] = new WP_Query();
	return $GLOBALS['wp_query']->query($query);
}

Rarely, if ever, should anyone need to do this. The most commonly used scenario is a theme that has featured posts that appear visually before the main content area. Below is a screen-grab of the iTheme2 theme for reference.

The thing to keep in mind, is by the time the theme is starting to display the featured posts, WordPress has already:

  • looked at the URL…
  • parsed out what posts fit the pattern…
  • retrieved those posts from the database (or cache)…
  • Filled the $wp_query and $post globals in PHP.

Let’s think about it like this:

The “Main Loop” consists of 3 globals, 2 of which actually matter.

  • $wp_the_query (does not matter)
  • $wp_query (matters)
  • $post (matters)

The reason $wp_the_query doesn’t matter is because you’ll *never* directly touch it, nor should you try. It’s designed to be the default main query regardless of how poisoned the $wp_query and $post globals might become.

Back to Featured Posts

When you want to query the database to get those featured posts, we all know it’s time to make a new WP_Query and loop through them, like so…

$featured_args = array(
	'post__in' => get_option( 'sticky_posts' ),
	'post_status' => 'publish',
	'no_found_rows' => true
);

// The Featured Posts query.
$featured = new WP_Query( $featured_args );

// Proceed only if published posts with thumbnails exist
if ( $featured->have_posts() ) {
	while ( $featured->have_posts() ) {
		$featured->the_post();
		if ( has_post_thumbnail( $featured->post->ID ) ) {
			/// do stuff here
		}
	}

	// Reset the post data
	wp_reset_postdata();
}

Great! Two queries, no conflicts; all is right in the world. You are remembering to use wp_reset_postdata(), right? ;) If not, the reason you do it is because every new WP_Query replaces the $post global with whatever iteration of whatever loop you just ran. If you don’t reset it, you might end up with $post data from your featured posts query, in your main loop query. Yuck.

Remember query_posts()? Look at it again; it’s replacing $wp_query and not looking back to $wp_the_query to do it. Lame, right? It just takes whatever parameters you passed it and assumes it’s exactly what you want.

I’ll let you stew on that for a second; let’s keep going…

What if, after your featured-posts query is done and you’ve dumped out all your featured posts, you want to *exclude* any featured posts from your main loop?

Think about this…

It makes sense that you would want to use query_posts() and replace the main $wp_query loop, right? I mean, how else would you know what to exclude, if you didn’t run the featured posts query BEFORE the main loop query happened?

EXACTLY!

Paradox, and WordPress and WP_Query are designed to handle this extremely gracefully with an action called ‘pre_get_posts

Think of it as the way to convince WordPress that what it wants to do, maybe isn’t really what it wants to do. In our case, rather than querying for posts a THIRD time (main loop, featured posts, query_posts() to exclude) we can modify the main query ahead of time, exclude what we don’t want, and run the featured query as usual. Genius!

This is how we’re doing it now in the iTheme2 theme:

/**
 * Filter the home page posts, and remove any featured post ID's from it. Hooked
 * onto the 'pre_get_posts' action, this changes the parameters of the query
 * before it gets any posts.
 *
 * @global array $featured_post_id
 * @param WP_Query $query
 * @return WP_Query Possibly modified WP_query
 */
function itheme2_home_posts( $query = false ) {

	// Bail if not home, not a query, not main query, or no featured posts
	if ( ! is_home() || ! is_a( $query, 'WP_Query' ) || ! $query->is_main_query() || ! itheme2_featuring_posts() )
		return;

	// Exclude featured posts from the main query
	$query->set( 'post__not_in', itheme2_featuring_posts() );

	// Note the we aren't returning anything.
	// 'pre_get_posts' is a byref action; we're modifying the query directly.
}
add_action( 'pre_get_posts', 'itheme2_home_posts' );

/**
 * Test to see if any posts meet our conditions for featuring posts.
 * Current conditions are:
 *
 * - sticky posts
 * - with featured thumbnails
 *
 * We store the results of the loop in a transient, to prevent running this
 * extra query on every page load. The results are an array of post ID's that
 * match the result above. This gives us a quick way to loop through featured
 * posts again later without needing to query additional times later.
 */
function itheme2_featuring_posts() {
	if ( false === ( $featured_post_ids = get_transient( 'featured_post_ids' ) ) ) {

		// Proceed only if sticky posts exist.
		if ( get_option( 'sticky_posts' ) ) {

			$featured_args = array(
				'post__in'      => get_option( 'sticky_posts' ),
				'post_status'   => 'publish',
				'no_found_rows' => true
			);

			// The Featured Posts query.
			$featured = new WP_Query( $featured_args );

			// Proceed only if published posts with thumbnails exist
			if ( $featured->have_posts() ) {
				while ( $featured->have_posts() ) {
					$featured->the_post();
					if ( has_post_thumbnail( $featured->post->ID ) ) {
						$featured_post_ids[] = $featured->post->ID;
					}
				}

				set_transient( 'featured_post_ids', $featured_post_ids );
			}
		}
	}

	// Return the post ID's, either from the cache, or from the loop
	return $featured_post_ids;
}

It reads like this:

  • Filter the main query.
  • Only proceed if we’re on the home page.
  • Only proceed if our query isn’t somehow messed up.
  • Only proceed if we want to filter the main query.
  • Only proceed if we actually have featured posts.
  • Featured posts? Let’s check for stickies.
  • Query for posts if they exist
  • (At this point, WP_Query runs again, and so does our ‘pre_get_posts’ filter. Thanks to our checks above, our query for featured posts won’t get polluted by our need to exclude things.
  • Take each post ID we get, and store them in an array.
  • Save that array as a transient so we don’t keep doing this on each page load.
  • We’re done with featured posts, and back in our main query filter again.
  • In our main query, exclude the post ID’s we just got.
  • Return the modified main query variables.
  • Let WordPress handle the rest.

With a little foresight into what we want to do, we’re able to architect ourselves a nice bit of logic to avoid creating a third, potentially costly WP_Query object.

Another, more simple example

The Depo Masthead theme wants to limit the home page to only 3 posts. We already learned earlier we *don’t* want to run query_posts() since it will create a new WP_Query object we don’t need. So, what do we do?

/**
 * Modify home query to only show 3 posts
 *
 * @param WP_Query $query
 * @return WP_Query
 */
function depo_limit_home_posts_per_page( $query = '' ) {

	// Bail if not home, not a query, not main query, or no featured posts
	if ( ! is_home() || ! is_a( $query, 'WP_Query' ) || ! $query->is_main_query() )
		return;

	// Home only gets 3 posts
	$query->set( 'posts_per_page', 3 );
}
add_action( 'pre_get_posts', 'depo_limit_home_posts_per_page' );

Stop me if you’ve heard this one. We hook onto ‘pre_get_posts’ and return a modified query! Woo woo!

Themes are the most common culprit, but they aren’t alone. More often than not, we all forget to clean up after ourselves, reset posts and queries when we’re done, etc… By avoiding query_posts() all together, we can be confident our code is behaving the way we intended, and that it’s playing nicely with the plugins and themes we’re running too.