Understanding Big Data Scalability: Big Data Scalability Series, Part I

Name: Understanding Big Data Scalability: Big Data Scalability Series, Part I
Author: Cory Isaacson
ISBN: 9780133599121

July 2014

Beginner to intermediate

123 pages

3h 17m

English

Pearson

About This eBook
Title Page
Copyright Page
Praise for Understanding Big Data Scalability
Contents
Preface
About the Author
1. Introduction
What You Will Learn; The Challenge of Big Data; Today’s Big Data Explosion; Background for This Book; Why the Focus on Database Sharding?; Summary
2. Why Databases Slow Down
The Database Slowdown Curve; A Hard-Won Lesson; The Enemies of Database Performance; How to Identify Database Slowdown Issues; Summary
3. What is Big Data?
What Is Big Data Anyhow?; Sources of Big Data; Summary
4. Big Data in the Real World
Some Real-World Examples of Big Data; FullContact; Social Point; Summary
5. Scaling Your Application
The Goals of a Scalable Application Platform; The Excitement of a High-Growth Success; Application Scalability Fundamentals; A Typical Online Application Architecture; Analytics Application Architectures; Scaling an Analytics Application; How to Scale a Traditional Online Application; Summary
6. When to Scale Your Database
The Last Mile of Application Scalability; How Do You Know When to Scale Your Database?; Options for Increasing Database Performance; Indications of the Need for Scale; Summary
7. All Data Is Relational
Relational Data Overview; The Meaning of Data; Relationships Matter; Why Data Modelling Is Critical to Success; Summary
8. It’s All About Sharding
Sharding: The Ultimate Answer to Database Slowdown; The Laws of Databases; Sharding Defined; Black-Box Sharding; Relational Sharding; Summary
9. Scaling Big Data: The Endgame
The Game of Big Data Scalability; Scaling Big Data Theory; The Big Data Endgame; Data Locality; Summary
Index

Overview

Get Started Scaling Your Database Infrastructure for High-Volume Big Data Applications

“Understanding Big Data Scalability presents the fundamentals of scaling databases from a single node to large clusters. It provides a practical explanation of what ‘Big Data’ systems are, and fundamental issues to consider when optimizing for performance and scalability. Cory draws on many years of experience to explain issues involved in working with data sets that can no longer be handled with single, monolithic relational databases.... His approach is particularly relevant now that relational data models are making a comeback via SQL interfaces to popular NoSQL databases and Hadoop distributions.... This book should be especially useful to database practitioners new to scaling databases beyond traditional single node deployments.” —Brian O’Krafka, software architect

Understanding Big Data Scalability presents a solid foundation for scaling Big Data infrastructure and helps you address each crucial factor associated with optimizing performance in scalable and dynamic Big Data clusters.

Database expert Cory Isaacson offers practical, actionable insights for every technical professional who must scale a database tier for high-volume applications. Focusing on today’s most common Big Data applications, he introduces proven ways to manage unprecedented data growth from widely diverse sources and to deliver real-time processing at levels that were inconceivable until recently.

Isaacson explains why databases slow down, reviews each major technique for scaling database applications, and identifies the key rules of database scalability that every architect should follow.

You’ll find insights and techniques proven with all types of database engines and environments, including SQL, NoSQL, and Hadoop. Two start-to-finish case studies walk you through planning and implementation, offering specific lessons for formulating your own scalability strategy. Coverage includes:

Understanding the true causes of database performance degradation in today’s Big Data environments

Scaling smoothly to petabyte-class databases and beyond

Defining database clusters for maximum scalability and performance

Integrating NoSQL or columnar databases that aren’t “drop-in” replacements for RDBMSes

Scaling application components: solutions and options for each tier

Recognizing when to scale your data tier—a decision with enormous consequences for your application environment

Why data relationships may be even more important in non-relational databases

Why virtually every database scalability implementation still relies on sharding, and how to choose the best approach

How to set clear objectives for architecting high-performance Big Data implementations

The Big Data Scalability Series is a comprehensive, four-part series, containing information on many facets of database performance and scalability. Understanding Big Data Scalability is the first book in the series.

Learn more and join the conversation about Big Data scalability at bigdatascalability.com.
