Chapter 7. More About Types

One of the best features of HDF5 is the huge variety of datatypes it supports. In some cases, the HDF5 feature set goes beyond NumPy. To maintain performance and create interoperable files, it’s important to understand exactly what’s going on when you use each type.

The HDF5 Type System

As with NumPy, all data in HDF5 has an associated type. The HDF5 type system is quite flexible and includes the usual suspects like integers and floats of various precisions, as well as strings and vector types.

Table 7-1 shows the native HDF5 datatypes and how they map to NumPy. Keep in mind that most of the types (integers and floats, for example) support a number of different precisions. For example, on most NumPy installations integers come in 1-, 2-, 4-, and 8-byte widths.
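
As a quick illustration of those precisions, the following minimal sketch lists the byte width of several integer and float dtypes using only NumPy. The variable names int_widths and float_widths are introduced here for illustration and do not come from the text.

```python
import numpy as np

# Map each integer dtype name to its width in bytes.
# "i1", "i2", "i4", "i8" are NumPy's codes for 1-, 2-, 4-, and 8-byte signed integers.
int_widths = {np.dtype(code).name: np.dtype(code).itemsize
              for code in ("i1", "i2", "i4", "i8")}

# The same idea for floats: half, single, and double precision.
float_widths = {np.dtype(code).name: np.dtype(code).itemsize
                for code in ("f2", "f4", "f8")}

for name, width in {**int_widths, **float_widths}.items():
    print(name, width)
```

The explicit width codes make the precision unambiguous, whereas a bare code such as "i" depends on the platform's C int.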

Table 7-1. HDF5 types
Native HDF5 type            NumPy equivalent
Integer                     dtype("i")
Float                       dtype("f")
Strings (fixed width)       dtype("S10")
Strings (variable width)    h5py.special_dtype(vlen=bytes)
Compound                    dtype([("field1", "i"), ("field2", "f")])
Enum                        h5py.special_dtype(enum=("i", {"RED":0, "GREEN":1, "BLUE":2}))
Array                       dtype("(2,2)f")
Opaque                      dtype("V10")
Reference                   h5py.special_dtype(ref=h5py.Reference)
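
To make the mappings in Table 7-1 concrete, here is a minimal sketch that builds several of these dtypes and uses one of them to create a dataset. It assumes h5py and NumPy are installed; the file name types_demo.h5, the dataset name "records", and the variable names are hypothetical. The file uses the in-memory "core" driver with backing_store=False, so nothing is written to disk.

```python
import numpy as np
import h5py

# Types that NumPy can express directly
compound = np.dtype([("field1", "i"), ("field2", "f")])  # compound (record) type
array_t = np.dtype("(2,2)f")                             # 2x2 array of floats
opaque = np.dtype("V10")                                 # 10 opaque bytes

# Types that need h5py's special_dtype helper
vlen_str = h5py.special_dtype(vlen=bytes)
color = h5py.special_dtype(enum=("i", {"RED": 0, "GREEN": 1, "BLUE": 2}))
ref = h5py.special_dtype(ref=h5py.Reference)

# check_dtype recovers the extra information attached to a special dtype
vlen_base = h5py.check_dtype(vlen=vlen_str)
enum_values = h5py.check_dtype(enum=color)
ref_kind = h5py.check_dtype(ref=ref)

# Create a dataset with the compound type in an in-memory file
with h5py.File("types_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("records", shape=(4,), dtype=compound)
    stored_fields = dset.dtype.names
```

The special dtypes are ordinary NumPy dtype objects carrying extra metadata, which is why check_dtype is needed to tell them apart from their plain counterparts.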

Both h5py and PyTables implement a few additional types on top of this system. Table 7-2 lists the h5py additions that are described in this chapter.

Table 7-2. Additional Python-side types
Python type    NumPy expression       Stored as
Boolean        np.dtype("bool")       HDF5 enum with FALSE=0, TRUE=1
Complex        np.dtype("complex")    HDF5 compound with ...
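
One way to see the "Stored as" column in action is to write Boolean and complex arrays and then ask the low-level API for the HDF5 type class of each dataset. This is a minimal sketch, assuming h5py and NumPy are installed; the file name python_side_types.h5 and the dataset names "flags" and "z" are hypothetical, and the in-memory "core" driver keeps the file off disk.

```python
import numpy as np
import h5py

with h5py.File("python_side_types.h5", "w", driver="core", backing_store=False) as f:
    flags = f.create_dataset("flags", data=np.array([True, False, True]))
    z = f.create_dataset("z", data=np.array([1 + 2j, 3 - 4j]))

    # On the Python side the datasets keep their NumPy dtypes
    flags_dtype = flags.dtype
    z_dtype = z.dtype

    # On the HDF5 side, the low-level type object reports the storage class
    flags_class = flags.id.get_type().get_class()
    z_class = z.id.get_type().get_class()

    # Reading back returns the original values
    flags_back = flags[...]
    z_back = z[...]

print(flags_class == h5py.h5t.ENUM, z_class == h5py.h5t.COMPOUND)
```

The translation is transparent to Python code, but other HDF5 tools will see an enum and a compound type rather than native Boolean or complex types.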
