86 | Chapter 8: Performance Improvement
When you use Buffering, data goes into the Buffer and then is used once
and removed from the Buffer. With Caching, data goes into the Cache and then
is used potentially multiple times and is not removed from the Cache unless
its memory space is needed for something else. There are three critical design
issues for a Cache:
How big to make the caches
How to organize the data in the caches
What algorithm to use to select items to be removed from the cache when s
it is full
One would like a cache to be large enough that it signiﬁcantly reduces
the number of disk I/O operations required. The Hit Ratio is the percentage of
data requests that can be found in the cache without doing a disk I/O. If one
assumes that all data items are equally probable to be requested in the future,
then the Hit Ratio H is
where M is the size of the cache and k is the number of data elements.
Thus, the larger the cache, the better the performance. Of course, there is
no beneﬁt to having a cache that is larger than the number of data elements.
However, because the amount of memory is limited, you cannot always make
the cache as large as you would like. Thus, we have a space–time tradeoff. In
fact, there are a couple of simple guidelines:
For randomly accessed data, if a 1 KB record will be accessed again within s
5 minutes, then the performance improvement of caching the data will
justify the additional cost of the memory to hold the data. Conversely, if the
data will not be accessed again within 5 minutes, it should not be cached.
For a 100-byte record, caching is appropriate if the record will be accessed
again within 1 hour.
For sequentially accessed data, the data should be accessed again within
1 minute to justify caching.
The simplest way to organize the data in the cache is as an array (see
Example 8.5). Then you can use an integer index that can be calculated from
a loop index.
This approach works very well if every data element in the cache needs to be
accessed equally. However, consider a sparse matrix such as in Example 8.6.
3.6 5.0 10.5 7.6 1.3 4.3 6.7 3.2
Example 8.5 Data Array Cache