Book description
Gain hands-on experience with HDF5 for storing scientific data in Python. This practical guide quickly gets you up to speed on the details, best practices, and pitfalls of using HDF5 to archive and share numerical datasets ranging in size from gigabytes to terabytes.
Through real-world examples and practical exercises, you’ll explore topics such as scientific datasets, hierarchically organized groups, user-defined metadata, and interoperable files. The examples apply to both Python 2 and Python 3. If you’re familiar with the basics of Python data analysis, this is an ideal introduction to HDF5.
- Get set up with HDF5 tools and create your first HDF5 file
- Work with datasets by learning the HDF5 Dataset object
- Understand advanced features like dataset chunking and compression
- Learn how to work with HDF5’s hierarchical structure, using groups
- Create self-describing files by adding metadata with HDF5 attributes
- Take advantage of HDF5’s type system to create interoperable files
- Express relationships among data with references, named types, and dimension scales
- Discover how Python mechanisms for writing parallel code interact with HDF5
Table of contents
- Preface
- 1. Introduction
- 2. Getting Started
- 3. Working with Datasets
- 4. How Chunking and Compression Can Help You
- 5. Groups, Links, and Iteration: The “H” in HDF5
- 6. Storing Metadata with Attributes
- 7. More About Types
- 8. Organizing Data with References, Types, and Dimension Scales
- 9. Concurrency: Parallel HDF5, Threading, and Multiprocessing
- 10. Next Steps
- Index
- About the Author
- Colophon
- Copyright
Product information
- Title: Python and HDF5
- Author(s):
- Release date: November 2013
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781449367831