Preface

Technology is helping us generate data so fast that we continually have to invent words to describe the scale. For example, a zettabyte (ZB) is 10 to the 21st power (10^21) bytes. To give you a sense of the magnitude of that number, consider this:

If a byte were the size of a postage stamp, approximately one square inch, and the surface of the planet Earth is roughly 790 quadrillion square inches (about 7.9 × 10^17), a zettabyte of stamps would cover the Earth more than 1,200 times, +/−.

Don’t even try to figure out how deep the layer of stamps would be.

In 2012, the world database held 2 zettabytes. We are generating 2 trillion gigabytes every day and will double the world database in 1.2 years. After that, growth will accelerate, compounding every year or faster. Eighty percent of the data is unstructured. This is the biggest technology revolution since the movable-type printing press more than 500 years ago. In every revolution, there are opportunities: opportunities that will be seized by those armed with new tools and a new way of thinking.

The point is that the amount of data being generated daily is unimaginable. The good news is that we don’t have to deal with all of it, because we can’t. The vast majority of the data spinning out daily will never be used. On the other hand, each of those bytes has potential value if it is collected, organized, related to other data, modeled, and applied to predicting the outcome of some investment decision. Clearly, we are standing amid the greatest accumulation of data ...
