Serializing Data Using the marshal Module
Credit: Luther Blissett
Problem
You have a Python data structure composed of only fundamental Python objects (e.g., lists, tuples, numbers, and strings, but no classes, instances, etc.), and you want to serialize it and reconstruct it later as fast as possible.
Solution
If you know that your data is composed entirely of fundamental Python
objects, the lowest-level, fastest approach to serializing it (i.e.,
turning it into a string of bytes and later reconstructing it from
such a string) is via the marshal module. Suppose
that data is composed of only elementary Python
data types. For example:
data = {12:'twelve', 'feep':list('ciao'), 1.23:4+5j, (1,2,3):u'wer'}You can serialize data to a byte string at top
speed as follows:
import marshal bytes = marshal.dumps(data)
You
can now sling bytes around as you wish (e.g., send
it across a network, put it as a BLOB in a database, etc.), as long
as you keep its arbitrary binary bytes intact. Then you can
reconstruct the data any time you’d like:
redata = marshal.loads(bytes)
This reconstructs a
data structure that compares equal (==) to
data. In other words, the order of keys in
dictionaries is arbitrary in both the original and reconstructed data
structures, but order in any kind of sequence is meaningful, and thus
it is preserved. Note that loads works
independently of machine architecture, but you must guarantee that it
is used by the same release of Python under which
bytes was originally generated ...