Serializing Data Using the pickle and cPickle Modules
Credit: Luther Blissett
Problem
You have a Python data structure, which may include fundamental Python objects, and possibly classes and instances, and you want to serialize it and reconstruct it at a reasonable speed.
Solution
If you don’t want to assume that your data is
composed of only fundamental Python objects, or you need portability
across versions of Python, or you need to transmit the serialized
form as text, the best way of serializing your data is with the
cPickle module (the pickle
module is a pure-Python equivalent, but it’s far
slower and not worth using unless cPickle is
unavailable). For example:
data = {12:'twelve', 'feep':list('ciao'), 1.23:4+5j, (1,2,3):u'wer'}
You can serialize data to a text string:
import cPickle
text = cPickle.dumps(data)
or to a binary string, which is faster and takes up less space:
bytes = cPickle.dumps(data, 1)
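The following is a minimal sketch of the two calls above, written so that it also runs where cPickle is missing. It assumes the usual import fallback to the pure-Python pickle module (on Python 3 there is no cPickle, and pickle itself uses a C accelerator when one is available). The names text_form and binary_form are illustrative and do not come from the recipe.

```python
# Prefer the C implementation where it exists; otherwise fall back to
# the pickle module, which exposes the same dumps/loads interface.
try:
    import cPickle as pickle
except ImportError:
    import pickle

data = {12: 'twelve', 'feep': list('ciao'), 1.23: 4+5j, (1, 2, 3): u'wer'}

# Protocol 0 is the original printable text format (the default in old releases).
text_form = pickle.dumps(data, 0)

# Protocol 1 is the older binary format selected by the second argument.
binary_form = pickle.dumps(data, 1)

# Report the size of each serialized form for comparison.
print("text: %d bytes, binary: %d bytes" % (len(text_form), len(binary_form)))
```

Passing the protocol number explicitly keeps the choice of format visible at the call site, since the default protocol differs between Python releases.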
You can now sling text or bytes
around as you wish (e.g., send it across a network or store it as a BLOB
in a database), as long as you keep it intact. In the case of
bytes, this means keeping its arbitrary binary
bytes intact. In the case of text, this means
keeping its textual structure intact, including newline characters.
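As one illustration of keeping the binary form intact, the sketch below stores it in a temporary file and checks that the stored copy is byte-for-byte identical. It assumes the file is opened in binary mode ('wb' and 'rb') so that no newline translation can alter the data. The names blob, stored, and path are hypothetical, and the import fallback to pickle is an assumption for environments without cPickle.

```python
import os
import tempfile

try:
    import cPickle as pickle
except ImportError:
    import pickle

data = {12: 'twelve', 'feep': list('ciao'), 1.23: 4+5j, (1, 2, 3): u'wer'}
blob = pickle.dumps(data, 1)  # binary pickle: arbitrary byte values

# Create a scratch file; close the low-level handle and reopen by name.
fd, path = tempfile.mkstemp(suffix='.pkl')
os.close(fd)
try:
    # Binary mode on both sides: text mode may translate newline bytes.
    with open(path, 'wb') as f:
        f.write(blob)
    with open(path, 'rb') as f:
        stored = f.read()
finally:
    os.remove(path)

# The stored copy must match the original exactly.
assert stored == blob
```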
Then you can reconstruct the data at any time, regardless of machine
architecture or Python release:
redata1 = cPickle.loads(text)
redata2 = cPickle.loads(bytes)
Either call reconstructs a data structure that compares equal to
data. In other words, the order ...