Big Data

by James Warren, Nathan Marz
April 2015
Beginner to intermediate
328 pages
11h 1m
English
Manning Publications

Chapter 9. An example batch layer: Implementation

This chapter covers

  • Ingesting new data into the master dataset
  • Managing the details of a batch workflow
  • Integrating Thrift-based graph schemas, Pail, and JCascalog

In the last chapter you saw the architecture and algorithms for the batch layer of SuperWebAnalytics.com. Let’s now translate that design into a complete working implementation using the tools you’ve learned about, such as Thrift, Pail, and JCascalog. In the process, you’ll see that the code closely matches the pipe diagrams and workflows developed in the previous chapter. This is a sign that the abstractions are sound, because the code you write mirrors the way you think about the problem.

As always happens with real-world tools, ...

ISBN: 9781617290343