The biggest difference between time and space is that you can’t reuse time.
“I thought that I didn’t need to worry about memory allocation. Java is supposed to handle all that for me.” This is a common perception, which is both true and false. Java handles low-level memory allocation and deallocation and comes with a garbage collector. Further, it prevents access to these low-level memory-handling routines, making the memory safe. So memory access should not cause corruption of data in other objects or in the running application, which is potentially the most serious problem that can occur with memory access violations. In a C or C++ program, problems of illegal pointer manipulations can be a major headache (e.g., deleting memory more than once, runaway pointers, bad casts). They are very difficult to track down and are likely to occur when changes are made to existing code. Java deals with all these possible problems and, at worst, will throw an exception immediately if memory is incorrectly accessed.
However, Java does not prevent you from using excessive amounts of memory nor from cycling through too much memory (e.g., creating and dereferencing many objects). Contrary to popular opinion, you can get memory leaks by holding on to objects without releasing references. This stops the garbage collector from reclaiming those objects, resulting in increasing amounts of memory being used.[27] In addition, Java does not provide for large numbers of objects to be created simultaneously (as you could do in C by allocating a large buffer), which eliminates one powerful technique for optimizing object creation.
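For instance, the following minimal sketch (the class and method names are invented for illustration) shows how simply holding references in a long-lived collection keeps objects reachable and so defeats the garbage collector:

    import java.util.HashMap;
    import java.util.Map;

    // A minimal sketch of a reference-holding "leak": the map below is never
    // cleared, so every object added to it stays reachable and the garbage
    // collector can never reclaim it, even after the rest of the application
    // has finished with those objects.
    public class LeakyCache {
        private static final Map cache = new HashMap();

        public static void remember(String key) {
            // Each call pins another 100 KB array in memory for the life of the VM.
            cache.put(key, new byte[100 * 1024]);
        }

        public static void main(String[] args) {
            // Memory use grows without bound; with a small enough heap this
            // loop eventually throws OutOfMemoryError.
            for (int i = 0; i < 100000; i++) {
                remember("item" + i);
            }
        }
    }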
Creating objects costs time and CPU effort for an application. Garbage collection and memory recycling cost more time and CPU effort. The difference in object usage between two algorithms can have a huge performance impact. In Chapter 5, I cover algorithms for appending basic data types to StringBuffer objects. These can be an order of magnitude faster than some of the conversions supplied with Java. A significant portion of the speedup is obtained by avoiding extra temporary objects used and discarded during the data conversions.[28]
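As a foretaste of the general idea (this is only a sketch, not the Chapter 5 algorithm), appending the digits of a non-negative int directly to a StringBuffer avoids the intermediate String that a conversion such as String.valueOf(i) would create:

    // A sketch of appending a non-negative int to a StringBuffer digit by
    // digit, so no intermediate String or char[] objects are created.
    public class IntAppender {
        public static StringBuffer appendPositiveInt(StringBuffer buf, int i) {
            if (i >= 10) {
                appendPositiveInt(buf, i / 10);   // leading digits first
            }
            buf.append((char) ('0' + (i % 10)));  // then the last digit
            return buf;
        }

        public static void main(String[] args) {
            StringBuffer buf = new StringBuffer();
            appendPositiveInt(buf, 12345);
            System.out.println(buf);              // prints 12345
        }
    }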
Here are a few general guidelines for using object memory efficiently:
Avoid creating objects in frequently used routines. Because these routines are called frequently, you will likely be creating objects frequently, and consequently adding heavily to the overall burden of object cycling. By rewriting such routines to avoid creating objects, possibly by passing in reusable objects as parameters, you can decrease object cycling.
Try to presize any collection object to be as big as it will need to be. It is better for the object to be slightly bigger than necessary than to be smaller than it needs to be. This recommendation really applies to collections that implement size increases in such a way that objects are discarded. For example, Vector grows by creating a new, larger internal array object, copying all the elements from the old array, and then discarding it. Most collection implementations grow beyond their current capacity in a similar way, so presizing a collection to its largest potential size reduces the number of objects discarded (this and the first guideline are sketched in code after this list).
When multiple instances of a class need access to a particular object in a variable local to those instances, it is better to make that variable a static variable rather than have each instance hold a separate reference. This reduces the space taken by each object (one less instance variable) and can also reduce the number of objects created if each instance creates a separate object to populate that instance variable.
Reuse exception instances when you do not specifically require a stack trace (see Section 6.1).
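To make the first two guidelines concrete, here is a minimal sketch (the class and method names are invented for illustration) of presizing a Vector and of passing a reusable StringBuffer into a frequently called routine instead of creating a new buffer on each call:

    import java.util.Vector;

    public class ObjectReuseExamples {

        // Presize the collection: if we expect roughly `expected` elements,
        // passing that capacity to the constructor means the Vector never has
        // to grow by allocating a new internal array and discarding the old one.
        public static Vector buildPresized(int expected) {
            Vector v = new Vector(expected);
            for (int i = 0; i < expected; i++) {
                v.addElement(String.valueOf(i));
            }
            return v;
        }

        // Pass in a reusable object: the caller owns a single StringBuffer and
        // hands it to this frequently called routine, which resets and refills
        // it rather than allocating a new buffer (and a new String) per call.
        public static StringBuffer formatPoint(StringBuffer reusable, int x, int y) {
            reusable.setLength(0);                   // reset without allocating
            return reusable.append(x).append(',').append(y);
        }

        public static void main(String[] args) {
            StringBuffer scratch = new StringBuffer();
            for (int i = 0; i < 5; i++) {
                System.out.println(formatPoint(scratch, i, i * i));
            }
            System.out.println(buildPresized(10).size());   // prints 10
        }
    }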
This chapter presents many other standard techniques to avoid using too many objects, and identifies some known inefficiencies when using some types of objects.
Objects need to be created before they can be used, and garbage-collected when they are finished with. The more objects you use, the heavier this garbage-cycling impact becomes. General object-creation statistics are actually quite difficult to measure decisively, since you must decide exactly what to measure, what size to pregrow the heap space to, how much garbage collection impacts the creation process if you let it kick in, etc.
For example, on a medium Pentium II, with heap space pregrown so that garbage collection does not have to kick in, you can get around half a million to a million simple objects created per second. If the objects are very simple, even more can be garbage-collected in one second. On the other hand, if the objects are complex, with references to other objects, and include arrays (like Vector and StringBuffer) and nonminimal constructors, the statistics plummet to less than a quarter of a million created per second, and garbage collection can drop way down to below 100,000 objects per second. Each object creation is roughly as expensive as a malloc in C, or a new in C++, and there is no easy way of creating many objects together, so you cannot take advantage of efficiencies you get using bulk allocation.
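The kind of micro-benchmark behind such figures looks roughly like the following sketch; the absolute numbers it prints depend entirely on the VM, the heap settings (e.g., pregrowing the heap with the -Xms/-Xmx options on Sun VMs), and the complexity of the object being constructed:

    // A rough object-creation micro-benchmark: it measures only allocation of
    // the simplest possible objects and does nothing useful with them, so it
    // shows a best case rather than realistic application behavior.
    public class CreationRate {
        static Object sink;   // hold the latest object so the loop is not trivially removable

        public static void main(String[] args) {
            int count = 1000000;
            long start = System.currentTimeMillis();
            for (int i = 0; i < count; i++) {
                sink = new Object();
            }
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(count + " objects created in " + elapsed + " ms");
        }
    }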
There are already runtime systems that use generational garbage collection, minimize object-creation overhead, and optimize native-code compilation. By doing this they reach up to three million objects created and collected per second (on a Pentium II), and it is likely that the average Java system should improve to get closer to that kind of performance over time. But these figures are for basic tests, optimized to show the maximum possible object-creation throughput. In a normal application with varying size objects and constructor chains, these sorts of figures cannot be obtained or even approached. Also bear in mind that you are doing nothing else in these tests apart from creating objects. In most applications, you are usually doing something with all those objects, making everything much slower but significantly more useful. Avoidable object creation is definitely a significant overhead for most applications, and you can easily run through millions of temporary objects using inefficient algorithms that create too many objects. In Chapter 5, we look at an example that uses the StreamTokenizer class. This class creates and dereferences a huge number of objects while it parses a stream, and the effect is to slow down processing to a crawl. The example in Chapter 5 presents a simple alternative to using a StreamTokenizer, which is 100 times faster: a large percentage of the speedup is gained from avoiding cycling through objects.
Note that different VM environments produce different figures. If you plot object size against object-creation time for various environments, most plots are monotonically increasing, i.e., it takes more time to create larger objects. But there are discrepancies here too. For example, Netscape Version 4 running on Windows has the peculiar behavior that objects of size 4 and 12 ints are created fastest (refer to http://www.javaworld.com/javaworld/jw-09-1998/jw-09-speed.html). Also, note that JIT VMs actually have a worse problem with object creation relative to other VM activities, because JIT VMs can speed up almost every other activity, but object creation is nearly as slow as if the JIT compiler were not there.
[27] Ethan Henry and Ed Lycklama have written a nice article discussing Java memory leaks in the February 2000 issue of Dr. Dobb’s Journal. This article is available online from the Dr. Dobb’s web site, http://www.ddj.com.
[28] Up to Java 1.3. Data-conversion performance is targeted by JavaSoft, however, so some of the data conversions may speed up after 1.3.