One of the first assignments frequently given to new programming
students is the Fibonacci function: given a number n,
return the nth number in the
Fibonacci sequence. A typical response to this assignment might be coded as
follows:
def fib(n):
if n == 0: return 0
elif n == 1: return 1
else: fib(n−1) + fib(n−2)Unfortunately, this code has a trick: trying to compute
fib(n) any numbers larger than 40 or so results in
incredibly long running times. In algorithmic terms, this code has
complexity O(2n). In other words, the running
time increases exponentially as the number requested goes up. Before
n gets very large, the running time is too long to be
feasible.
The reason why the fib function is so expensive is because it redoes
the work each time. If you ask for fib(5), the function
also works out fib(4) and fib(3).
The calculation of fib(4) in turn works out
fib(3) and fib(2). The calculation
of fib(3) works out fib(2) and
fib(1), etc. There is a lot of duplicated effort, and
that duplicated effort takes work and time.
One technique for speeding up this function is memoization. Memoization
works by caching the results of each call in a lookup table. The first time
a function is called with certain arguments, the memoized function computes
the result and associates the function arguments with the result value. When
the function is called later with the same arguments, the memoized function
returns the cached value rather than spending time and processing power
computing the results again.
For example, a memoized Fibonacci function might look like the
following:
memo = {0:0, 1:1}
def memo_fib(n):
if n in memo: return memo[n]
else:
result = memo_fib(n−1) + memo_fib(n−2)
memo[n] = result
return resultUnlike the original fib function, this function
can be called with very large numbers and will return almost
instantly.
It is important to note that a memoized function is
always slower than a normal function the first time it
is run. The functions associated with memoization—the value lookup, cache
miss, and association of the result with the argument list—all take time and
processing power. Nevertheless, memoization is a critical optimization tool.
If the cost of creating the result is much greater than the cost of the
caching machinery, returning a cached result is much cheaper than
re-creating the result for each function call.