WordPress.com publishers and visitors produce thousands of new posts and comments every hour. Together with IntenseDebate comments, that adds up to a large volume of data. These content streams are available in three real-time formats from redundant servers. They are intended for partners, such as search engines and market intelligence providers, who want to ingest a real-time stream of new content from a wide spectrum of publishers.
WordPress.com Firehose data is available exclusively from these partners. Please contact them for access:
- Datasift: Extract insights from a universe of human-created data.
- Gnip/Twitter: Twitter’s enterprise API platform delivers real-time and historical social data to power your business at scale.
The following streams are available:
- Posts Firehose: A stream of posts, averaging 1 million per day, from the tens of millions of websites published on WordPress.com. Posts are also available for Jetpack-powered WordPress(.org) sites through a separate feed.
- Comments Firehose: A stream of hundreds of thousands of comments every day from WordPress.com and our IntenseDebate commenting platform. Comments are also available for Jetpack-powered WordPress(.org) sites through a separate feed.
- Likes Firehose: A stream of engagement data from WordPress.com’s “like” feature.
The streams are delivered in three formats:
- PubSub: An extension of the popular Jabber/XMPP instant messaging protocol. WordPress.com operates a Jabber service at im.wordpress.com that allows all WordPress.com users to subscribe to the blogs of their choice and receive instant notification of new items. However, the full streams are access-controlled.
- XML Stream: Delivers the same pubsub-style XML streams by the much simpler mechanism of an HTTP GET request. This makes implementing the streams as simple as can be. You can view a limited sample of these streams in your browser: posts and comments. (If your browser doesn’t display anything after a few seconds, try Esc or Stop. You can also use command line tools such as
- JSON Stream: A stream of JSON-formatted data delivered over HTTP. You can view a limited sample of these streams in your browser: posts and comments.
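
As a rough illustration of what consuming an HTTP-delivered stream could look like, here is a minimal Python sketch. It rests on assumptions the text above does not confirm: that the JSON Stream sends one JSON object per line, and that the stream lives at the placeholder URL `STREAM_URL`. The real endpoint and any credentials come from the partner providing access. The names `iter_json_items` and `stream_items` are hypothetical helpers, not part of any WordPress.com API.

```python
import json
import urllib.request
from typing import Iterable, Iterator

# Hypothetical placeholder; the real stream URL is supplied by the partner.
STREAM_URL = "https://example.invalid/firehose/posts.json"


def iter_json_items(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one decoded object per non-empty line.

    Assumes newline-delimited JSON, which the source does not confirm.
    """
    for raw in lines:
        text = raw.decode("utf-8", errors="replace").strip()
        if not text:
            continue  # skip blank lines, e.g. keep-alives
        try:
            item = json.loads(text)
        except json.JSONDecodeError:
            continue  # skip a malformed or partial line
        if isinstance(item, dict):
            yield item


def stream_items(url: str = STREAM_URL) -> Iterator[dict]:
    """Open a long-lived HTTP GET and yield items as they arrive."""
    with urllib.request.urlopen(url) as response:
        # The response object can be iterated line by line.
        yield from iter_json_items(response)
```

Keeping the line parsing separate from the network call means the parser can be exercised with in-memory sample data, and a production consumer would likely add reconnection and back-off around `stream_items`.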