Programming Pig, 2nd Edition

by Alan Gates, Daniel Dai
November 2016
Content level: Intermediate to advanced
368 pages
9h 59m
English
O'Reilly Media, Inc.

Preface

Data is addictive. Our ability to collect and store it has grown massively in the last several decades, yet our appetite for ever more data shows no sign of being satiated. Scientists want to be able to store more data in order to build better mathematical models of the world. Marketers want better data to understand their customers’ desires and buying habits. Financial analysts want to better understand the workings of their markets. And everybody wants to keep all their digital photographs, movies, emails, etc.

Before the computer and Internet revolutions, the US Library of Congress was one of the largest collections of data in the world. It is estimated that its printed collections contain approximately 10 terabytes (TB) of information. Today, large Internet companies collect that much data on a daily basis. And it is not just Internet applications that are producing data at prodigious rates. For example, the Large Synoptic Survey Telescope (LSST) under construction in Chile is expected to produce 15 TB of data every day.

Part of the reason for the massive growth in available data is our ability to collect much more data. Every time someone clicks a website’s links, the web server can record information about what page the user was on and which link he clicked. Every time a car drives over a sensor in the highway, its speed can be recorded. But much of the reason is also our ability to store that data. Ten years ago, telescopes took pictures of the sky every night. But ...



ISBN: 9781491937082