Credit: Iuri Wickert, Duncan Grisby, Steve Holden, Alex Martelli
You need to consume, in random order, the items of a rather long list, and the most direct approach is painfully slow.
While it’s a common mistake to be overly concerned with speed, you should not ignore the different performance characteristics of various algorithms. Suppose we must process all of the items in a long list in random order, without repetition. For example:
import random

# an example of a processing function
def process(datum):
    print datum

# an example of a set of data to process
data = range(10000)
The simplest version is very slow:
def simple( ):
    while data:
        # returns element, not index (doesn't help)
        elem = random.choice(data)
        # needs to search for the element throughout the list!
        data.remove(elem)
        process(elem)

Here is a faster version:
def faster( ):
    while data:
        index = random.randrange(len(data))
        elem = data[index]
        # direct deletion, no search needed
        del data[index]
        process(elem)
# the same, but preserving the data list
def faster_preserve( ):
    aux = range(len(data))
    while aux:
        posit = random.randrange(len(aux))
        index = aux[posit]
        elem = data[index]
        # alters the auxiliary list only
        del aux[posit]
        process(elem)

However, deleting from the middle of a list still has to shift every item that follows, so the key improvement is to switch to an O(N) algorithm:
def improved( ):
    size = len(data)
    while size:
        index = random.randrange(size)
        elem = data[index]
        # copy the last still-unprocessed item into the freed slot,
        # then shrink the "live" part of the list by one
        data[index] = data[size-1]
        size = size - 1
        process(elem)

Of course, you can also ...
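To get a feel for how much these choices matter, here is a minimal timing sketch; it is an addition of mine, not part of the recipe, and the helper name time_one and the no-op replacement for process are illustrative assumptions. It rebuilds the global data list before each run, since these functions consume it, and temporarily silences process so that printing does not dominate the measurement:

import time

def time_one(func, n=10000):
    global data, process
    # rebuild the list that each variant consumes
    data = range(n)
    # temporarily replace process with a no-op so printing is not timed
    saved_process = process
    process = lambda datum: None
    start = time.clock()
    func()
    elapsed = time.clock() - start
    process = saved_process
    return elapsed

print "simple:  ", time_one(simple)
print "faster:  ", time_one(faster)
print "improved:", time_one(improved)

On lists of this size, the gap between the quadratic simple version and the linear improved version should already be clearly visible, and it widens quickly as n grows.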