O'Reilly logo

Seven Databases in Seven Weeks by Jim R. Wilson, Eric Redmond

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Day 2: Working with Big Data

With Day 1’s table creation and manipulation under our belts, it’s time to start adding some serious data to our wiki table. Today, we’ll script against the HBase APIs, ultimately streaming Wikipedia content right into our wiki! Along the way, we’ll pick up some performance tricks for making faster import jobs. Finally, we’ll poke around in HBase’s internals to see how it partitions data into regions, achieving both performance and disaster recovery goals.

Importing Data, Invoking Scripts

One common problem people face when trying a new database system is how to migrate data into it. Handcrafting Put operations with static strings, like we did in Day 1, is all well and good, but we can do better.

Fortunately, ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required