We may batch instances of a problem to be solved all at once, as opposed to the more usual assumption of online algorithms, in which each instance must be solved as soon as it is presented.

As an example of the improvement offline algorithms allow, suppose we implement a dictionary by inserting a set of *n* numbers *y*_{1} ... *y*_{n} into an initially empty dictionary and then performing *n*/2 membership queries `contains`(*x*_{i}) for numbers *x*_{1} ... *x*_{n/2}. An optimal data structure for performing *n* insert operations followed by a *single* `contains`(*x*_{i}) operation is to insert each *y*_{j} into an unordered array *Y*, at a total cost of O(*n*), and then implement the `contains`(*x*_{i}) query as a sequential search for *x*_{i} in array *Y*, at a worst-case cost of O(*n*). The total worst-case cost of the *n*+1 operations is O(*n*).
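The unordered-array dictionary described above can be sketched as follows (a minimal illustration, not code from the text; the names `build` and `contains` are our own):

```python
# Unordered-array dictionary: O(1) per insert, O(n) per contains() query.

def build(ys):
    """Insert each y_j into an unordered array Y; total cost O(n)."""
    Y = []
    for y in ys:
        Y.append(y)  # each insert is O(1)
    return Y

def contains(Y, x):
    """Sequential search for x in array Y; worst-case cost O(n)."""
    for y in Y:
        if y == x:
            return True
    return False

Y = build([7, 3, 9, 1])
print(contains(Y, 9))  # True
print(contains(Y, 4))  # False
```

With *n* inserts at O(1) each and one O(*n*) query, the total worst-case cost of all *n*+1 operations is O(*n*), matching the analysis above.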

Performing a sequence of *n*/2 executions of sequential search, however, incurs a total cost of O(*n*^{2}). Since there is no way to predict which queries will be performed, an online algorithm cannot proactively take steps to minimize the cost of a specific future query (an adversary can always thwart such speedup attempts). However, if we batch the sequence of *n*/2 `contains`(*x*_{i}) queries for offline processing, then we can sort the array *Y* containing *y*_{1} ... *y*_{n} and sort an array *X* containing *x*_{1} ... *x*_{n/2}, each at a worst-case cost of O(*n* log *n*), and then scan the two sorted arrays to find duplicates, at a worst-case cost of O(*n*). By permitting an offline algorithm to batch the *n*/2 ...
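The offline batching idea can be sketched as below (our own illustration, assuming a hypothetical helper `batch_contains`): sort both arrays, then a single merge-style scan of the two sorted arrays answers every query.

```python
# Offline batch processing: sort Y and X at O(n log n) each, then answer
# all n/2 contains() queries with one O(n) merge scan of the sorted arrays.

def batch_contains(ys, xs):
    """Return {x: x in ys} for all queries x in xs, via sort + merge scan."""
    Y = sorted(ys)          # O(n log n)
    X = sorted(set(xs))     # O(n log n); deduplicating keeps the scan simple
    found = {}
    i = j = 0
    while i < len(Y) and j < len(X):   # single linear scan of both arrays
        if Y[i] < X[j]:
            i += 1
        elif Y[i] > X[j]:
            found[X[j]] = False        # X[j] cannot appear later in Y
            j += 1
        else:
            found[X[j]] = True         # duplicate found across the arrays
            i += 1
            j += 1
    while j < len(X):                  # remaining queries are not in Y
        found[X[j]] = False
        j += 1
    return {x: found[x] for x in xs}   # answers in original query order

print(batch_contains([7, 3, 9, 1], [9, 4]))  # {9: True, 4: False}
```

The two sorts dominate, giving O(*n* log *n*) total for the whole batch, versus O(*n*^{2}) for *n*/2 independent sequential searches.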
