Chaining multiple values into a hash
One of the bigger problems when using a DBM file with the storage mechanism of DB_HASH is that the keys against which the data is stored must be unique. For example, if we stored two different values with the key of "Wiltshire," say for Stonehenge and Avebury, generally the last value inserted into the hash would get stored in the database. This is a bit problematic, to say the least.
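The overwrite is easy to reproduce. The following is a minimal sketch, assuming the DB_File module is installed and using an in-memory database (an undef filename) so that no file is left behind; the variable names are illustrative only.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;

# Tie an ordinary Perl hash to an in-memory DB_HASH database.
# Passing undef as the filename keeps the database in memory.
my %sites;
tie %sites, 'DB_File', undef, O_CREAT | O_RDWR, 0666, $DB_HASH
    or die "Cannot create DB_HASH database: $!";

# Store two different values against the same key.
$sites{'Wiltshire'} = 'Stonehenge';
$sites{'Wiltshire'} = 'Avebury';    # replaces the earlier value

# Read back what the database actually holds.
my $stored = $sites{'Wiltshire'};
my $count  = scalar keys %sites;

print "Wiltshire => $stored\n";
print "Keys stored: $count\n";

untie %sites;
```

Only one entry survives for the key, so the first megalithic site is silently lost.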
In a good database design, the primary key of any data structure generally should be unique in order to speed up searches. But quick and dirty databases, badly designed ones, or databases with suboptimal data quality may not be able to enforce this uniqueness. Similarly, using referential hashtables to provide nonprimary key searching of the database also triggers this problem.
A Perl solution to this problem is to push the multiple values onto an array that is stored within the hash element. This technique works fine while the program is running, because the array references are still valid. But a DBM file stores only strings, so when the database is written out, each reference is saved as its string form rather than as the array it pointed to, and the data is invalid when reloaded.
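To see why the array technique does not survive a round trip, consider this minimal sketch. It does not touch a DBM file at all; instead it assumes that storing a value in a DBM file is equivalent to stringifying it, and simulates that step with plain string interpolation. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Chain two values onto an array held in a single hash element.
my %sites;
push @{ $sites{'Wiltshire'} }, 'Stonehenge', 'Avebury';

# While the program runs, the element holds a live array reference.
print "In memory: @{ $sites{'Wiltshire'} }\n";

# Simulate what a string-only store keeps: the reference's string form,
# which looks like ARRAY(0x...) and carries none of the array's contents.
my $as_stored = "$sites{'Wiltshire'}";
print "As stored: $as_stored\n";

# Trying to use the reloaded string as an array fails under strict refs.
my $usable = eval { my @values = @{$as_stored}; 1 };
print "Reloaded value is ", ( $usable ? "usable" : "not usable" ), "\n";
```

The string that is stored names a memory address from the original process, so nothing meaningful can be recovered from it later.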
Therefore, to solve this problem, we need to look at using the different Berkeley DB storage management method of DB_BTREE, which keeps its keys in sorted order as they are inserted. With this mechanism, it is possible to have duplicate keys, because the underlying DBM file is organized as a sorted tree rather than a hashtable. Fortunately, you still reference the DBM file via a Perl hashtable, so DB_BTREE is not any ...
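As a rough illustration of duplicate keys under DB_BTREE, here is a minimal sketch, assuming the DB_File module is installed. It relies on two DB_File features that the passage above does not itself mention: the R_DUP flag, which must be set before the database is created, and the get_dup method on the tied object. The in-memory database (undef filename) and the variable names are illustrative choices.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;

# Enable duplicate keys for the B-tree before creating the database.
$DB_BTREE->{'flags'} = R_DUP;

# Tie a Perl hash to an in-memory DB_BTREE database and keep the
# underlying object so that its methods can be called directly.
my %sites;
my $db = tie %sites, 'DB_File', undef, O_CREAT | O_RDWR, 0666, $DB_BTREE
    or die "Cannot create DB_BTREE database: $!";

# Both assignments are kept, because duplicates are now allowed.
$sites{'Wiltshire'} = 'Stonehenge';
$sites{'Wiltshire'} = 'Avebury';

# In list context, get_dup returns every value stored under the key.
my @wiltshire = $db->get_dup('Wiltshire');
print "Wiltshire => $_\n" for @wiltshire;

# Release the object before untying the hash.
undef $db;
untie %sites;
```

Note that an ordinary hash lookup such as $sites{'Wiltshire'} still returns a single value, which is why the sketch goes through the tied object to fetch all of them.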