Finding the Index for Partially Matched Strings
The problem considered here
concerns a large number of string keys that need to be accessed by
full or partial match. Each string is unique, so the full-match
access can easily be handled by a standard hash-table structure
(e.g., java.util.HashMap
). But the partial-match
access needs to collect all objects that have string keys starting
with a particular substring.
So, for example, if you had the hash consisting of keys and values:
"hello" 1 "bye" 2 "hi" 3
Then the full match for key "hi"
retrieves
3
, and the partial match against strings starting
with "h"
retrieves the collection
{1,3}
. Using a hash-table structure for the
partial-match access is expensive because it requires that all keys
be iterated over, and then each key matching the corresponding object
needs to be collated.
Of course, I am considering here a large collection of strings. Alternatives are not usually necessary for a few (or even a few thousand) strings. But for large collections, performance-tuning techniques become necessary.
The tuning procedure here should be to look for data structures that
quickly match any partial string. The task is somewhat simpler than
the most generic version of this type of problem, because you need to
match only the first few consecutive characters. This means that some
sort of tree structure is probably ideal. Of the structures available
from the JDK, TreeMap
looks like it can provide exactly the required functionality; it gives a minimal ...
Get Java Performance Tuning now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.