Storm Real-time Processing Cookbook by Quinton Anderson

Creating a URL stream using a Twitter filter

There are many approaches to sourcing input documents for the TF-IDF implementation. This recipe will present an approach using Twitter.

Twitter provides a streaming API that lets you receive a sample of all the tweets posted to Twitter. A sample is more than sufficient for most applications, because more data rarely improves your results in any meaningful way relative to the costs involved. Sampling is also the only way Twitter allows you to consume the data without special agreements in place.

Tweet status streams can be filtered using the Twitter streaming API, so that only a subset of the population is sampled and delivered in a stream. This enables one to listen ...
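
To make the idea of a filtered stream feeding a URL stream concrete, here is a minimal offline sketch in plain Java. It does not call the Twitter API. The class name `UrlStreamSketch`, its methods, and the sample statuses are all hypothetical, and the case-insensitive substring match is a simplification that only approximates how a keyword filter might behave.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlStreamSketch {
    // Assumption: a simple regex is enough to spot http/https links in plain text.
    private static final Pattern URL = Pattern.compile("https?://\\S+");

    /** Returns true if the text contains at least one tracked term (case-insensitive substring). */
    public static boolean matchesTrack(String text, List<String> trackTerms) {
        String lower = text.toLowerCase(Locale.ROOT);
        for (String term : trackTerms) {
            if (lower.contains(term.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    /** Extracts every URL found in the text, in order of appearance. */
    public static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    /** Keeps only statuses that match a tracked term, then emits the URLs they contain. */
    public static List<String> urlStream(List<String> statuses, List<String> trackTerms) {
        List<String> out = new ArrayList<>();
        for (String status : statuses) {
            if (matchesTrack(status, trackTerms)) {
                out.addAll(extractUrls(status));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample statuses standing in for a live stream.
        List<String> statuses = Arrays.asList(
            "Storm topology tips http://example.com/storm",
            "Lunch was great today",
            "Unrelated link http://example.com/other");
        System.out.println(urlStream(statuses, Arrays.asList("storm")));
    }
}
```

In this sketch the filter runs before URL extraction, so statuses outside the tracked subset never contribute links. A real topology would receive statuses from a streaming client rather than from an in-memory list.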