Strings Versus char Arrays
In one of my first programming courses, taught in C, our
instructor made an interesting
comment. He said, “C has lightning-fast string handling because
it has no string type.” He went on to explain this apparent paradox
by pointing out that in C, any null-terminated sequence of bytes
can be considered a string: this convention is
supported by all string-handling functions. The point is that since
the convention is adhered to fairly rigorously, there is no need to
use only the standard string-handling functions. Any string
manipulation you want to do can be executed directly on the
byte array, allowing you to bypass or rewrite any
string-handling functions you need to speed up. Because you are not
forced to run through a restricted set of manipulation functions, it
is always possible to optimize code using your own hand-crafted
functions. Furthermore, some string-manipulating functions operate
directly on the original byte array rather than
creating a copy of this array. This can be a source of bugs, but is
another reason speed can be optimized.
In Java, the inability to subclass String or access its internal
char array means you cannot use the techniques applied in C. Even
if you could subclass String, this would not avoid the second
problem: many other methods operate on or return copies of a
String. Generally, there is no way to avoid using
String objects for code external to your
application classes. But internally, you can provide your own
char array ...
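As a rough illustration of working on a char array internally, the following sketch converts a String to a char array once at the application boundary, manipulates the array in place, and builds a String again only when one is needed. The class and method names (CharArrayDemo, toUpperCaseInPlace, indexOf) are hypothetical and are not taken from the text, and the upper-casing handles only ASCII letters to keep the example short.

```java
class CharArrayDemo {

    // Upper-cases ASCII letters in buf[start..end) in place.
    // No new objects are created (assumption: ASCII-only input).
    static void toUpperCaseInPlace(char[] buf, int start, int end) {
        for (int i = start; i < end; i++) {
            char c = buf[i];
            if (c >= 'a' && c <= 'z') {
                buf[i] = (char) (c - ('a' - 'A'));
            }
        }
    }

    // Returns the index of the first occurrence of target in
    // buf[start..end), or -1 if it is not present.
    static int indexOf(char[] buf, int start, int end, char target) {
        for (int i = start; i < end; i++) {
            if (buf[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String external = "hello, world";

        // One copy is made here, at the boundary with String-based code.
        char[] buf = external.toCharArray();

        // All further work happens directly on the array.
        int comma = indexOf(buf, 0, buf.length, ',');
        toUpperCaseInPlace(buf, 0, comma);

        // Convert back to a String only when external code requires one.
        System.out.println(new String(buf)); // prints "HELLO, world"
    }
}
```

By contrast, the equivalent String calls (substring, toUpperCase, and concatenation) would each produce a new String object. Whether avoiding those copies is worth the extra code depends on how hot the code path is, so a change like this is best confirmed by measurement.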