November 2017
Beginner to intermediate
290 pages
7h 34m
English
The state of the operator needs to be extracted and saved to durable storage. The StorageAgent interface makes this behavior pluggable. The default implementation uses the Kryo serialization framework to turn the operator state into bytes and relies on HDFS to save the serialized state to files. Because HDFS is already part of the Hadoop installation, the default setup requires no additional external system. Any alternative file system implementation that supports the same interface can be used instead.
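To make the idea of a pluggable storage layer concrete, here is a minimal sketch of the pattern. It is not the actual StorageAgent API: the `CheckpointStore` interface, the `LocalFileCheckpointStore` class, and the file layout are hypothetical names chosen for illustration. To stay self-contained, the sketch substitutes built-in Java serialization for Kryo and a local directory for HDFS.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical interface for illustration only; the real StorageAgent
// interface may differ in method names and signatures.
interface CheckpointStore {
    void save(Serializable state, int operatorId, long windowId) throws IOException;

    Object load(int operatorId, long windowId) throws IOException, ClassNotFoundException;
}

// Stand-in implementation (assumption): Java serialization replaces Kryo,
// and a local directory replaces HDFS.
class LocalFileCheckpointStore implements CheckpointStore {
    private final Path baseDir;

    LocalFileCheckpointStore(Path baseDir) {
        this.baseDir = baseDir;
    }

    // One file per (operator, window) pair; this layout is an assumption of the sketch.
    private Path fileFor(int operatorId, long windowId) {
        return baseDir.resolve(Integer.toString(operatorId))
                      .resolve(Long.toHexString(windowId));
    }

    @Override
    public void save(Serializable state, int operatorId, long windowId) throws IOException {
        Path target = fileFor(operatorId, windowId);
        Files.createDirectories(target.getParent());
        // Serialize the operator state into bytes and write them to the file.
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(target))) {
            out.writeObject(state);
        }
    }

    @Override
    public Object load(int operatorId, long windowId) throws IOException, ClassNotFoundException {
        // Read the bytes back and rebuild the saved state object.
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(fileFor(operatorId, windowId)))) {
            return in.readObject();
        }
    }
}
```

Because callers depend only on the interface, a different backend (another file system or another serializer) could be swapped in without changing the code that triggers the checkpoint.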
The default storage agent performs a checkpoint as follows: