WordPress publishers and visitors produce thousands of new posts and comments every hour, and comments from IntenseDebate add to that volume. These content streams are available in three real-time formats from redundant servers. They are intended for partners, such as search engines and market intelligence providers, that want to ingest a real-time stream of new content from a wide spectrum of publishers.
Firehose data is available from our partner Datasift; please contact them for access.
- Posts Firehose: the Posts Firehose is a stream of posts—averaging 1 million/day—from the tens of millions of websites published on WordPress.com. Posts are also available for Jetpack-powered WordPress(.org) sites, through a separate feed.
- Comments Firehose: the Comments Firehose streams hundreds of thousands of comments every day from WordPress.com and our IntenseDebate commenting platform. Comments are also available for Jetpack-powered WordPress(.org) sites, through a separate feed.
- Likes Firehose: the Likes Firehose streams engagement data from WordPress.com’s “like” feature.
- PubSub: An extension of the popular Jabber/XMPP instant messaging protocol. WordPress.com operates a Jabber service at im.wordpress.com that allows all WordPress.com users to subscribe to the blogs of their choice and receive instant notification of new items. However, the full streams are access-controlled.
- JSON Stream: A stream of JSON-formatted data delivered over HTTP. You can view a very limited sample stream by using curl in a terminal.
- XML Stream: Delivers the same pubsub-style XML streams through a plain HTTP GET request, which is much simpler to implement than XMPP. You can view a very limited sample stream by using curl in a terminal.
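As a rough illustration of consuming the JSON Stream format, the sketch below reads a long-lived HTTP GET response and decodes items as they arrive. It rests on assumptions that this page does not confirm: that the stream is newline-delimited (one JSON object per line), that blank lines may appear as keep-alives, and that the endpoint needs no extra authentication headers. The URL is a hypothetical placeholder, and the names `STREAM_URL`, `iter_json_objects`, and `consume` are invented for this example.

```python
import json
import urllib.request
from typing import Any, Iterable, Iterator

# Hypothetical placeholder: the real endpoint is not given in this document.
STREAM_URL = "https://example.invalid/firehose.json"


def iter_json_objects(lines: Iterable[bytes]) -> Iterator[Any]:
    """Yield one decoded value per non-empty line.

    Assumes newline-delimited JSON, which is an assumption of this sketch.
    """
    for raw in lines:
        text = raw.decode("utf-8", errors="replace").strip()
        if not text:
            continue  # skip blank lines (possible keep-alives)
        try:
            yield json.loads(text)
        except json.JSONDecodeError:
            continue  # skip a malformed line rather than abort the stream


def consume(url: str = STREAM_URL, limit: int = 5) -> None:
    """Print the first `limit` items from a streaming HTTP GET response."""
    # The response object can be iterated line by line as bytes.
    with urllib.request.urlopen(url) as response:
        for count, item in enumerate(iter_json_objects(response), start=1):
            print(item)
            if count >= limit:
                break
```

Keeping the parsing in `iter_json_objects` separate from the network call means the decoding logic can be exercised against any iterable of byte lines, such as a saved sample, before pointing `consume` at a real stream.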
Firehose Terms of Service
Permitted Uses. You may use Firehose to develop a product or service that searches, displays, analyzes, retrieves, and views information available. You may also use the WordPress.com name or logos and other brand elements that Automattic makes available in order to identify the source of the information, provided the use doesn’t suggest any endorsement by Automattic.
Prohibited Uses. If you use Firehose, you agree not to:
- Engage in, encourage, or facilitate activity that is malicious or illegal under applicable law.
- Interfere with, disrupt, or attack any service or network, including Automattic’s.
- Republish the content, provide any third parties with your access to Firehose, or enable third parties to distribute Firehose data.
- Substantially replicate products or services offered by Automattic, or create a competing service, such as by creating a separate publishing platform.
- Display, distribute, or otherwise make available content or data to governmental entities for intelligence gathering or surveillance purposes.
- Use the information in a biased, misleading, or dishonest manner, for example, to promote or publicize a biased political point of view.
- Use the content to profile, or create profiles of, individuals, or directly target individuals with advertisements or other messages.
Termination. If Automattic believes, in its sole discretion, that you have violated or attempted to violate these conditions or the spirit of these Terms, or our Guidelines for Responsible Use of Automattic’s APIs, your ability to use and access Firehose may be temporarily or permanently revoked, with or without notice.