
Time Series Databases: New Ways to Store and Access Data

by Ted Dunning, Ellen Friedman

December 2014

Intermediate to advanced

60 pages

1h 55m

English

O'Reilly Media, Inc.


Preface
In This Book
1. Time Series Data: Why Collect It?
Time Series Data Is an Old Idea
Time Series Data Sets Reveal Trends
A New Look at Time Series Databases
2. A New World for Time Series Databases
Stock Trading and Time Series Data
Making Sense of Sensors
Talking to Towers: Time Series and Telecom
Data Center Monitoring
Environmental Monitoring: Satellites, Robots, and More
The Questions to Be Asked
3. Storing and Processing Time Series Data
Simplest Data Store: Flat Files
Moving Up to a Real Database: But Will RDBMS Suffice?
NoSQL Database with Wide Tables
NoSQL Database with Hybrid Design
Going One Step Further: The Direct Blob Insertion Design
Why Relational Databases Aren’t Quite Right
Hybrid Design: Where Can I Get One?
4. Practical Time Series Tools
Introduction to Open TSDB: Benefits and Limitations
Architecture of Open TSDB
Value Added: Direct Blob Loading for High Performance
A New Twist: Rapid Loading of Historical Data
Summary of Open Source Extensions to Open TSDB for Direct Blob Loading
Accessing Data with Open TSDB
Working on a Higher Level
Accessing Open TSDB Data Using SQL-on-Hadoop Tools
Using Apache Spark SQL
Why Not Apache Hive?
Adding Grafana or Metrilyx for Nicer Dashboards
Possible Future Extensions to Open TSDB
Cache Coherency Through Restart Logs
5. Solving a Problem You Didn’t Know You Had
The Need for Rapid Loading of Test Data
Using Blob Loader for Direct Insertion into the Storage Tier
6. Time Series Data in Practical Machine Learning
Predictive Maintenance Scheduling
7. Advanced Topics for Time Series Databases
Stationary Data
Wandering Sources
Space-Filling Curves
8. What’s Next?
A New Frontier: TSDBs, Internet of Things, and More
New Options for Very High-Performance TSDBs
Looking to the Future
A. Resources
Tools for Working with NoSQL Time Series Databases
More Information About Use Cases Mentioned in This Book
Additional O’Reilly Publications by Dunning and Friedman

About the Authors
Colophon
Copyright

Content preview from Time Series Databases: New Ways to Store and Access Data

Chapter 5. Solving a Problem You Didn’t Know You Had

Whenever you build a system, it’s good practice to test it before you begin using it, and especially before it goes into production. If your system is designed to store huge amounts of time series data (such as two years’ worth of sensor data) for critical operations or analysis, testing is particularly important. The failure of a monitoring system for drilling or pump equipment on an oil rig, for manufacturing equipment, medical equipment, or an airplane can have dire consequences in terms of financial loss and physical damage, so it is essential that your time series data storage engine be not only high performance but also robust. Sometimes people do advance testing on a small data sample, but tests at this small scale are not necessarily reliable predictors of how your system will function at scale. For serious work, you want a serious test, using full-scale data. But how can you do that?

The Need for Rapid Loading of Test Data

Perhaps you have preexisting data covering a long time range that could be used for testing, or at least you can fairly easily build a program to generate synthetic data that simulates your two years of information. Either way, you are now faced with a problem you may not have realized you have: if your system design was already pushing the limits on data ingestion to handle the high-velocity data expected in production, how will you load two years’ worth of such data in a reasonable time? If you ...
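To make the scale of that problem concrete, here is a minimal Python sketch of a synthetic data generator together with a back-of-the-envelope estimate of load time. It is an illustration only, not the authors' tooling: the function names, the sine-plus-noise signal shape, and the example figures (1,000 series sampled once per second, ingested at 100,000 points per second) are all assumptions chosen for the example.

```python
import math
import random
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple

SECONDS_PER_DAY = 86_400


def synthetic_readings(start: datetime, end: datetime, interval_s: int,
                       seed: int = 0) -> Iterator[Tuple[int, float]]:
    """Yield (unix_timestamp, value) pairs for one hypothetical sensor.

    The signal is a daily sine cycle plus Gaussian noise; the shape is an
    assumption made for illustration, not a model of any real sensor.
    """
    rng = random.Random(seed)  # seeded so test runs are repeatable
    step = timedelta(seconds=interval_s)
    t = start
    while t < end:
        ts = int(t.timestamp())
        phase = 2 * math.pi * (ts % SECONDS_PER_DAY) / SECONDS_PER_DAY
        yield ts, 20.0 + 5.0 * math.sin(phase) + rng.gauss(0, 0.5)
        t += step


def total_points(n_series: int, interval_s: int, days: int) -> int:
    """Number of data points produced by n_series sensors over `days` days."""
    return n_series * (days * SECONDS_PER_DAY // interval_s)


def load_time_hours(points: int, points_per_second: float) -> float:
    """Hours needed to ingest `points` at a sustained ingestion rate."""
    return points / points_per_second / 3600


if __name__ == "__main__":
    # Hypothetical workload: 1,000 series, one sample per second, two years.
    points = total_points(n_series=1000, interval_s=1, days=730)
    hours = load_time_hours(points, points_per_second=100_000)
    print(f"{points:,} points, about {hours:.1f} hours to load")
```

Under these assumed figures the workload comes to roughly 63 billion points, and a loader that sustains 100,000 points per second would need about 175 hours, or a little over a week, just to stage the test data. Changing the assumed rate or series count in the call shows how quickly that number moves.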

Publisher Resources

ISBN: 9781491920909
