How Canaries Help Us Merge Good Pull Requests

At WordPress.com we strive to provide a consistent and reliable user experience as we merge and release hundreds of code changes each week.

We run automated unit and component tests for our Calypso user interface on every commit against every pull request (PR).

We also have 32 automated end-to-end (e2e) test scenarios that, until recently, we would only automatically run across our platform after merging and deploying to production. While these e2e scenarios have found regressions fairly quickly after deploying (the 32 scenarios execute in parallel in just 10 minutes), they don’t prevent us from merging and releasing regressions to our customer experience.

Introducing our Canaries

Earlier this year we decided to identify three of our 32 automated end-to-end test scenarios that would act as our “canaries”: a minimal subset of automated tests to quickly tell us if our most important flows are broken. These tests execute after a pull request is merged and deployed to our staging environment, but before we deploy the changes to all our customers in production.

These canaries have been very successful in preventing us from deploying regressions to production. However, running them after merging to master (and automatically deploying code to staging) means we’d have to revert code changes if something was wrong. This wasn’t good enough.

Last month we took our canaries to the next level. Instead of just running canaries on merging to master, we now execute canaries against live pull requests and provide feedback to the pull request itself about the canary test status.

How does it work?

Our process is that if you’re a developer working on a pull request for Calypso and it’s ready to review, you add the “[Status] Needs Review” label to alert someone to review your code. Adding this label automatically triggers the e2e canary tests against your pull request.

The results are separate from the unit and component tests which already run against every pull request (on every push).

How does this technically work?

Our automated e2e tests are open-source, but they reside separately from our Calypso GitHub code repository. This is because the e2e scenarios represent the entire WordPress.com customer experience: they’re not just automated Calypso user interface tests. For example, our tests include verifying that our customers receive appropriate emails that are not part of the Calypso code base.

We “connect” our two projects using CircleCI builds and a custom “bridge” written in Node.js (which is also open-source). This bridge provides webhooks for GitHub pull requests to execute CircleCI builds using the CircleCI API. It reports the status of these builds using the GitHub status API. We do apply a little bit of cleverness: we match branch names, so we can make changes to our e2e tests that correspond to changes in Calypso. Our bridge runs on Automattic’s VIP Go platform.
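
As a rough illustration of the flow described above, here is a minimal sketch of the decisions a bridge like this has to make when a GitHub webhook arrives. This is not the actual bridge code (that is a Node.js service); PHP is used only to match the other listings in this document. The function names, the same-name branch rule with a 'master' fallback, and the 'e2e-canaries' context string are all assumptions made for the example. The payload fields (action, label.name, pull_request.head.ref) and the status fields (state, context, description, target_url) follow GitHub's webhook and commit status formats.

```php
<?php
// Hypothetical sketch only: the real bridge is a Node.js service.

/**
 * Decide whether a GitHub "pull_request" webhook payload should start a canary run.
 * The trigger label comes from the process described above.
 */
function bridge_should_run_canaries( array $payload, string $trigger_label = '[Status] Needs Review' ): bool {
	return ( $payload['action'] ?? '' ) === 'labeled'
		&& ( $payload['label']['name'] ?? '' ) === $trigger_label;
}

/**
 * Pick which e2e branch to test with: the one named like the PR branch
 * if it exists, otherwise the default. (Assumed matching rule.)
 */
function bridge_pick_e2e_branch( string $pr_branch, array $e2e_branches, string $default = 'master' ): string {
	return in_array( $pr_branch, $e2e_branches, true ) ? $pr_branch : $default;
}

/**
 * Build the request body for GitHub's commit status endpoint
 * (POST /repos/{owner}/{repo}/statuses/{sha}).
 */
function bridge_build_status( string $state, string $build_url ): array {
	return array(
		'state'       => $state,           // One of: pending, success, failure, error.
		'context'     => 'e2e-canaries',   // Assumed name for the check.
		'description' => 'Canary e2e tests: ' . $state,
		'target_url'  => $build_url,
	);
}

// Example payload, trimmed to the fields used here.
$payload = array(
	'action'       => 'labeled',
	'label'        => array( 'name' => '[Status] Needs Review' ),
	'pull_request' => array( 'head' => array( 'ref' => 'update/login-form' ) ),
);

if ( bridge_should_run_canaries( $payload ) ) {
	$branch = bridge_pick_e2e_branch(
		$payload['pull_request']['head']['ref'],
		array( 'master', 'update/login-form' )
	);
	echo "Run canaries using e2e branch: {$branch}\n";
}
```

Keeping these decisions as small pure functions means they can be checked without talking to GitHub or CircleCI; the HTTP calls themselves are left out of the sketch.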

A summary and what’s next?

Running our canaries on pull requests has been a great success. Developers love the confidence the canaries give them in knowing that our key end-to-end scenarios won’t regress when introducing changes rapidly.

We’d now like to expand the bridge’s scope to optionally run the full set of 32 end-to-end automated tests on pull requests that have a broader impact, such as upgrading a dependency or refactoring a framework design pattern. This will give our developers even greater confidence in their ability to merge code and provide a consistent and reliable experience to our customers.

Get involved!

Feel free to check out our e2e tests repository, or our bridge repository, make a fork, and provide us with any feedback or suggestions. Pull requests are always welcomed.

***

Alister Scott is an Excellence Wrangler for Automattic and blogs regularly about software testing at his blog WatirMelon.

Querying Posts Without query_posts

Here at WordPress.com, we have over 200 themes (and even more plugins) running inside the biggest WordPress installation around (that we know of, anyway!). With all of that code churning around our more than 2,000 servers worldwide, there’s one particular WordPress function that we actually try to shy away from: query_posts().

If you think you need to use it, there is most likely a better approach. query_posts() doesn’t do what most of us probably think it does.

We think that it:

  • Resets the main query loop.
  • Resets the main post global.

But it actually:

  • Creates a new WP_Query object with whatever parameters you set.
  • Replaces the existing main query loop with a new one (that is no longer the main query).

Confused yet? It’s okay if you are; thousands of others are, too.

This is what query_posts actually looks like:

/**
 * Set up The Loop with query parameters.
 *
 * This will override the current WordPress Loop and shouldn't be used more than
 * once. This must not be used within the WordPress Loop.
 *
 * @since 1.5.0
 * @uses $wp_query
 *
 * @param string $query
 * @return array List of posts
 */
function &query_posts($query) {
	unset($GLOBALS['wp_query']);
	$GLOBALS['wp_query'] = new WP_Query();
	return $GLOBALS['wp_query']->query($query);
}

Rarely, if ever, should anyone need to do this. The most common scenario where people reach for it is a theme with featured posts that appear visually before the main content area. Below is a screen-grab of the iTheme2 theme for reference.

The thing to keep in mind is that by the time the theme starts to display the featured posts, WordPress has already:

  • looked at the URL…
  • parsed out what posts fit the pattern…
  • retrieved those posts from the database (or cache)…
  • filled the $wp_query and $post globals in PHP.

Let’s think about it like this:

The “Main Loop” consists of 3 globals, 2 of which actually matter.

  • $wp_the_query (does not matter)
  • $wp_query (matters)
  • $post (matters)

The reason $wp_the_query doesn’t matter is that you’ll *never* directly touch it, nor should you try. It’s designed to be the default main query regardless of how poisoned the $wp_query and $post globals might become.

Back to Featured Posts

When you want to query the database to get those featured posts, it’s time to make a new WP_Query and loop through them, like so…

$featured_args = array(
	'post__in' => get_option( 'sticky_posts' ),
	'post_status' => 'publish',
	'no_found_rows' => true
);

// The Featured Posts query.
$featured = new WP_Query( $featured_args );

// Proceed only if published posts with thumbnails exist
if ( $featured->have_posts() ) {
	while ( $featured->have_posts() ) {
		$featured->the_post();
		if ( has_post_thumbnail( $featured->post->ID ) ) {
			// Do stuff here.
		}
	}

	// Reset the post data
	wp_reset_postdata();
}

Great! Two queries, no conflicts; all is right in the world. You are remembering to use wp_reset_postdata(), right? 😉 If not, the reason you do it is that calling the_post() on any WP_Query replaces the $post global with the current post of whatever loop you just ran. If you don’t reset it, you might end up with $post data from your featured posts query in your main loop query. Yuck.
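
To see that leak without loading WordPress, here is a small stand-alone simulation. Demo_Loop and demo_reset_postdata() are hypothetical stand-ins written for this example; they only mimic the one behavior that matters here, namely that the_post() overwrites the shared $post global and that wp_reset_postdata() points it back at the main query's current post.

```php
<?php
// Hypothetical stand-in for WP_Query: just enough to show what the_post() does to $post.
class Demo_Loop {
	public $post = null; // Current post of this loop, like WP_Query::$post.
	private $posts;
	private $index = -1;

	public function __construct( array $posts ) {
		$this->posts = $posts;
	}

	public function have_posts(): bool {
		return $this->index + 1 < count( $this->posts );
	}

	// Advance this loop and overwrite the shared $post global.
	public function the_post(): void {
		$this->index++;
		$this->post      = $this->posts[ $this->index ];
		$GLOBALS['post'] = $this->post;
	}
}

// Mimics wp_reset_postdata(): point $post back at the main query's current post.
function demo_reset_postdata( Demo_Loop $main_query ): void {
	$GLOBALS['post'] = $main_query->post;
}

$main = new Demo_Loop( array( 'Hello world' ) );
$main->the_post(); // The main query's current post is "Hello world".

$featured = new Demo_Loop( array( 'Featured A', 'Featured B' ) );
while ( $featured->have_posts() ) {
	$featured->the_post();
}

// The secondary loop has leaked into the global: this prints "Featured B".
echo $GLOBALS['post'], "\n";

demo_reset_postdata( $main );

// After the reset the global matches the main query again: "Hello world".
echo $GLOBALS['post'], "\n";
```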

Remember query_posts()? Look at it again; it’s replacing $wp_query and not looking back to $wp_the_query to do it. Lame, right? It just takes whatever parameters you passed it and assumes it’s exactly what you want.
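
The same kind of stand-alone simulation makes the $wp_query swap concrete. Again, this does not load WordPress: Demo_Query, demo_query_posts() and demo_reset_query() are hypothetical stand-ins. They mimic how query_posts() replaces the $wp_query global, and how WordPress's wp_reset_query() points $wp_query back at $wp_the_query.

```php
<?php
// Stand-in for WP_Query; it only remembers the arguments it was created with.
class Demo_Query {
	public $args;

	public function __construct( array $args = array() ) {
		$this->args = $args;
	}
}

// WordPress keeps the original main query in $wp_the_query and gives themes
// a second reference to the same object in $wp_query.
$GLOBALS['wp_the_query'] = new Demo_Query( array( 'paged' => 2 ) );
$GLOBALS['wp_query']     = $GLOBALS['wp_the_query'];

// Mimics query_posts(): discard $wp_query and build a brand-new query.
function demo_query_posts( array $args ) {
	$GLOBALS['wp_query'] = new Demo_Query( $args );
	return $GLOBALS['wp_query'];
}

// Mimics the global swap done by wp_reset_query().
function demo_reset_query() {
	$GLOBALS['wp_query'] = $GLOBALS['wp_the_query'];
}

demo_query_posts( array( 'posts_per_page' => 3 ) );

// $wp_query is now a different object from the real main query...
var_dump( $GLOBALS['wp_query'] === $GLOBALS['wp_the_query'] );

// ...and the original arguments (here, the page number) are not in it.
var_dump( isset( $GLOBALS['wp_query']->args['paged'] ) );

demo_reset_query();

// After the reset, both globals point at the same object again.
var_dump( $GLOBALS['wp_query'] === $GLOBALS['wp_the_query'] );
```

Note what got lost in the middle: the replacement query knows nothing about the page number the URL asked for, which is one reason pagination tends to break when query_posts() is used this way.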

I’ll let you stew on that for a second; let’s keep going…

What if, after your featured-posts query is done and you’ve dumped out all your featured posts, you want to *exclude* any featured posts from your main loop?

Think about this…

It makes sense that you would want to use query_posts() and replace the main $wp_query loop, right? I mean, how else would you know what to exclude, if you didn’t run the featured posts query BEFORE the main loop query happened?

EXACTLY!

It’s a paradox, and WordPress and WP_Query are designed to handle it gracefully with an action called ‘pre_get_posts’.

Think of it as the way to convince WordPress that what it wants to do, maybe isn’t really what it wants to do. In our case, rather than querying for posts a THIRD time (main loop, featured posts, query_posts() to exclude) we can modify the main query ahead of time, exclude what we don’t want, and run the featured query as usual. Genius!

This is how we’re doing it now in the iTheme2 theme:

/**
 * Filter the home page posts, and remove any featured post ID's from it. Hooked
 * onto the 'pre_get_posts' action, this changes the parameters of the query
 * before it gets any posts.
 *
 * @param WP_Query $query
 * @return void The query is modified in place.
 */
function itheme2_home_posts( $query = false ) {

	// Bail if not home, not a query, not main query, or no featured posts
	if ( ! is_home() || ! is_a( $query, 'WP_Query' ) || ! $query->is_main_query() || ! itheme2_featuring_posts() )
		return;

	// Exclude featured posts from the main query
	$query->set( 'post__not_in', itheme2_featuring_posts() );

	// Note that we aren't returning anything.
	// 'pre_get_posts' is a byref action; we're modifying the query directly.
}
add_action( 'pre_get_posts', 'itheme2_home_posts' );

/**
 * Test to see if any posts meet our conditions for featuring posts.
 * Current conditions are:
 *
 * - sticky posts
 * - with featured thumbnails
 *
 * We store the results of the loop in a transient, to prevent running this
 * extra query on every page load. The results are an array of post ID's that
 * match the conditions above. This gives us a quick way to loop through
 * featured posts again later without needing to run additional queries.
 */
function itheme2_featuring_posts() {
	if ( false === ( $featured_post_ids = get_transient( 'featured_post_ids' ) ) ) {

		// Proceed only if sticky posts exist.
		if ( get_option( 'sticky_posts' ) ) {

			$featured_args = array(
				'post__in'      => get_option( 'sticky_posts' ),
				'post_status'   => 'publish',
				'no_found_rows' => true
			);

			// The Featured Posts query.
			$featured = new WP_Query( $featured_args );

			// Proceed only if published posts with thumbnails exist
			if ( $featured->have_posts() ) {
				while ( $featured->have_posts() ) {
					$featured->the_post();
					if ( has_post_thumbnail( $featured->post->ID ) ) {
						$featured_post_ids[] = $featured->post->ID;
					}
				}

				set_transient( 'featured_post_ids', $featured_post_ids );
			}
		}
	}

	// Return the post ID's, either from the cache, or from the loop
	return $featured_post_ids;
}

It reads like this:

  • Filter the main query.
  • Only proceed if we’re on the home page.
  • Only proceed if our query isn’t somehow messed up.
  • Only proceed if we want to filter the main query.
  • Only proceed if we actually have featured posts.
  • Featured posts? Let’s check for stickies.
  • Query for posts if they exist.
  • (At this point, WP_Query runs again, and so does our ‘pre_get_posts’ filter. Thanks to our checks above, our query for featured posts won’t get polluted by our need to exclude things.)
  • Take each post ID we get, and store them in an array.
  • Save that array as a transient so we don’t keep doing this on each page load.
  • We’re done with featured posts, and back in our main query filter again.
  • In our main query, exclude the post ID’s we just got.
  • Return nothing; the main query object has been modified in place.
  • Let WordPress handle the rest.

With a little foresight into what we want to do, we’re able to architect ourselves a nice bit of logic to avoid creating a third, potentially costly WP_Query object.
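
The re-entrant step in that walkthrough (the featured query firing the same hook again) is the easiest one to get wrong, so here is a stand-alone simulation of just that part. It does not load WordPress: Demo_Hooked_Query is a hypothetical stand-in whose $callbacks list plays the role of ‘pre_get_posts’, its $is_main flag plays the role of is_main_query(), and the post IDs 7 and 9 are made up.

```php
<?php
// Hypothetical stand-in for WP_Query plus a single 'pre_get_posts'-style hook.
class Demo_Hooked_Query {
	public static $callbacks = array(); // Plays the role of the 'pre_get_posts' hook.
	public static $log       = array();

	public $vars;
	public $is_main;

	public function __construct( array $vars, bool $is_main = false ) {
		$this->vars    = $vars;
		$this->is_main = $is_main;
	}

	public function set( string $key, $value ): void {
		$this->vars[ $key ] = $value;
	}

	// Every query fires the hook before "fetching"; here we just return the final vars.
	public function get_posts(): array {
		foreach ( self::$callbacks as $callback ) {
			$callback( $this );
		}
		return $this->vars;
	}
}

Demo_Hooked_Query::$callbacks[] = function ( Demo_Hooked_Query $query ): void {
	// Guard: without it, the nested featured query below would re-enter forever.
	if ( ! $query->is_main ) {
		Demo_Hooked_Query::$log[] = 'nested query left alone';
		return;
	}

	// Nested query: this fires the same callback again, which bails at the guard.
	$featured     = new Demo_Hooked_Query( array( 'post__in' => array( 7, 9 ) ) );
	$featured_ids = $featured->get_posts()['post__in'];

	// Exclude the featured IDs from the main query, in place.
	$query->set( 'post__not_in', $featured_ids );
	Demo_Hooked_Query::$log[] = 'main query filtered';
};

$main = new Demo_Hooked_Query( array( 'posts_per_page' => 10 ), true );

// The main query gains 'post__not_in'; the nested query was never touched.
print_r( $main->get_posts() );
print_r( Demo_Hooked_Query::$log );
```

The log shows the nested query being skipped before the main query is filtered, which is the order the walkthrough above describes.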

Another, simpler example

The Depo Masthead theme wants to limit the home page to only 3 posts. We already learned earlier we *don’t* want to run query_posts() since it will create a new WP_Query object we don’t need. So, what do we do?

/**
 * Modify home query to only show 3 posts
 *
 * @param WP_Query $query
 * @return void The query is modified in place.
 */
function depo_limit_home_posts_per_page( $query = '' ) {

	// Bail if not home, not a query, or not the main query
	if ( ! is_home() || ! is_a( $query, 'WP_Query' ) || ! $query->is_main_query() )
		return;

	// Home only gets 3 posts
	$query->set( 'posts_per_page', 3 );
}
add_action( 'pre_get_posts', 'depo_limit_home_posts_per_page' );

Stop me if you’ve heard this one. We hook onto ‘pre_get_posts’ and modify the main query in place! Woo woo!

Themes are the most common culprit, but they aren’t alone. More often than not, we all forget to clean up after ourselves, reset posts and queries when we’re done, etc… By avoiding query_posts() altogether, we can be confident our code is behaving the way we intended, and that it’s playing nicely with the plugins and themes we’re running too.