Dask also provides other data structures that generate computation graphs automatically. In this subsection, we'll take a look at dask.bag.Bag, a generic collection of elements that can be used to write MapReduce-style algorithms, and dask.dataframe.DataFrame, a distributed version of pandas.DataFrame.
A Bag can be created directly from a Python collection, for example from a list with the from_sequence factory function. The npartitions argument sets the level of parallelism by distributing the Bag's contents across that many partitions. In the following example, we create a Bag containing the numbers from 0 to 99, partitioned into four chunks:
import dask.bag as dab
dab.from_sequence(range(100), ...
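To make the idea of partitions more concrete, here is a minimal pure-Python sketch of what splitting a collection into chunks and running a MapReduce-style computation over them might look like. It does not use Dask and is not how Dask is implemented; the helper split_into_partitions and the doubling-and-summing workload are assumptions chosen only for illustration.

```python
from functools import reduce
from math import ceil
from operator import add


def split_into_partitions(seq, npartitions):
    """Hypothetical helper: cut seq into at most npartitions contiguous chunks."""
    items = list(seq)
    size = max(1, ceil(len(items) / npartitions))
    return [items[i:i + size] for i in range(0, len(items), size)]


# Numbers from 0 to 99, split into four chunks of 25 elements each.
partitions = split_into_partitions(range(100), npartitions=4)

# Map step: each partition is processed independently of the others,
# which is what makes it possible to run them in parallel.
mapped = [[x * 2 for x in part] for part in partitions]

# Reduce step: reduce within each partition, then combine the partial results.
partial_sums = [reduce(add, part, 0) for part in mapped]
total = reduce(add, partial_sums, 0)

print(len(partitions), partial_sums, total)
```

Because each chunk is mapped and reduced on its own before the partial results are combined, the number of partitions bounds how many pieces of work can proceed at the same time.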