Chapter 4. The Java Type System
In this chapter, we move beyond basic object-oriented programming with classes and into the additional concepts required to work effectively with Java’s type system.
Note
A statically typed language is one in which variables have definite types, and where it is a compile-time error to assign a value of an incompatible type to a variable. Languages that only check type compatibility at runtime are called dynamically typed.
Java is a fairly classic example of a statically typed language. JavaScript is an example of a dynamically typed language that allows any variable to store any type of value.
The Java type system involves not only classes and primitive types but also other kinds of reference type that are related to the basic concept of a class, but which differ in some way and are usually treated in a special way by javac or the JVM.
We have already met arrays and classes, two of Java’s most widely used kinds of reference type. This chapter starts by discussing another very important kind of reference type—interfaces. We then move on to discuss Java’s generics, which have a major role to play in Java’s type system. With these topics under our belts, we can discuss the differences between compile-time and runtime types in Java.
To complete the full picture of Java’s reference types, we look at specialized kinds of classes and interfaces—known as enums and annotations. We conclude the chapter by looking at lambda expressions and nested types and then reviewing how enhanced type inference has allowed Java’s nondenotable types to become usable by programmers.
Let’s get started by taking a look at interfaces—probably the most important of Java’s reference types after classes and a key building block for the rest of Java’s type system.
Interfaces
In Chapter 3, we met the idea of inheritance. We also saw that a Java class can inherit only from a single class. This is quite a big restriction on the kinds of object-oriented programs that we want to build. The designers of Java knew this, but they also wanted to ensure that Java’s approach to object-oriented programming was less complex and error-prone than, for example, that of C++.
The solution that they chose was to introduce the concept of an interface to Java. Like a class, an interface defines a new reference type. As its name implies, an interface is intended to represent only an API—so it provides a description of a type and the methods (and signatures) that classes that implement that API must provide.
In general, a Java interface does not provide any implementation code for the methods that it describes. These methods are considered mandatory—any class that wishes to implement the interface must provide an implementation of these methods.
However, an interface may mark some API methods as optional, so that implementing classes do not need to implement them if they choose not to.
This is done with the default keyword, and the interface must provide an implementation of these optional methods, which will be used by any implementing class that elects not to implement them.
Note
The ability to have optional methods in interfaces was new in Java 8. It is not available in any earlier version. See “Default Methods” for a full description of how optional (also called default) methods work.
It is not possible to directly instantiate an interface and create a member of the interface type. Instead, a class must implement the interface to provide the necessary method bodies.
Any instances of the implementing class are compatible with both the type defined by the class and the type defined by the interface. This means that the instances may be substituted at any point in the code that requires an instance of either the class type or the interface type. This extends the Liskov principle as seen in “Reference Type Conversions”.
Another way of saying this is that two objects that do not share the same class or superclass may still both be compatible with the same interface type if both objects are instances of classes that implement the interface.
Defining an Interface
An interface definition is somewhat like a class definition in which all the (mandatory) methods are abstract and the keyword class has been replaced with interface.
For example, this code shows the definition of an interface named Centered (a Shape class, such as those defined in Chapter 3, might implement this interface if it wants to allow the coordinates of its center to be set and queried):
interface Centered {
    void setCenter(double x, double y);
    double getCenterX();
    double getCenterY();
}
A number of restrictions apply to the members of an interface:
- All mandatory methods of an interface are implicitly abstract and must have a semicolon in place of a method body. The abstract modifier is allowed but by convention is usually omitted.
- An interface defines a public API. Members of an interface are implicitly public unless they are declared private, and it is conventional to omit the unnecessary public modifier.
- An interface may not define any instance fields. Fields are an implementation detail, and an interface is a specification, not an implementation. The only fields allowed in an interface definition are constants that are declared both static and final.
- An interface cannot be instantiated, so it does not define a constructor.
- Interfaces may contain nested types. Any such types are implicitly public and static. See “Nested Types” for a full description of nested types.
- As of Java 8, an interface may contain static methods. Previous versions of Java did not allow this, which is widely believed to have been a flaw in the design of the Java language.
- As of Java 9, an interface may contain private methods. These have limited use cases, but with the other changes to the interface construct, it seems arbitrary to disallow them.
- It is a compile-time error to try to define a protected method in an interface.
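The rules in this list can be seen side by side in one small sketch. The Temperature interface and all of its member names are hypothetical, invented purely for illustration. The code assumes Java 16 or later, because it uses a private interface method and a nested record.

```java
// Hypothetical interface showing the kinds of member an interface may declare
interface Temperature {
    // Fields are implicitly public, static, and final, so this is a constant
    double ABSOLUTE_ZERO_CELSIUS = -273.15;

    // Mandatory method: implicitly public and abstract
    double celsius();

    // Default (optional) method with an inline implementation
    default double kelvin() {
        return shift(celsius());
    }

    // Private helper method (Java 9+), callable only from inside the interface
    private double shift(double c) {
        return c - ABSOLUTE_ZERO_CELSIUS;
    }

    // Static method (Java 8+), invoked as Temperature.isValid(...)
    static boolean isValid(double c) {
        return c >= ABSOLUTE_ZERO_CELSIUS;
    }

    // Nested type: implicitly public and static
    record Fixed(double celsius) implements Temperature {}
}

class TemperatureDemo {
    public static void main(String[] args) {
        Temperature t = new Temperature.Fixed(0.0);
        System.out.println(t.kelvin());                   // 273.15
        System.out.println(Temperature.isValid(-300.0));  // false
    }
}
```

Note that the record's generated accessor celsius() is what satisfies the mandatory method of the interface.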
Extending Interfaces
Interfaces may extend other interfaces, and, like a class definition, an interface definition indicates this by including an extends clause. When one interface extends another, it inherits all the methods and constants of its superinterface and can define new methods and constants.
Unlike classes, however, the extends clause of an interface definition may include more than one superinterface.
For example, here are some interfaces that extend other interfaces:
interface Positionable extends Centered {
    void setUpperRightCorner(double x, double y);
    double getUpperRightX();
    double getUpperRightY();
}
interface Transformable extends Scalable, Translatable, Rotatable {}
interface SuperShape extends Positionable, Transformable {}
An interface that extends more than one interface inherits all the methods and constants from each of those interfaces and can define its own additional methods and constants. A class that implements such an interface must implement the abstract methods defined directly by the interface, as well as all the abstract methods inherited from all the superinterfaces.
Implementing an Interface
Just as a class uses extends to specify its superclass, it can use implements to name one or more interfaces it supports.
The implements keyword can appear in a class declaration following the extends clause.
It should be followed by a comma-separated list of interfaces that the class implements.
When a class declares an interface in its implements clause, it is saying that it provides an implementation (i.e., a body) for each mandatory method of that interface. If a class implements an interface but does not provide an implementation for every mandatory interface method, it inherits those unimplemented abstract methods from the interface and must itself be declared abstract. If a class implements more than one interface, it must implement every mandatory method of each interface it implements (or be declared abstract).
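As a minimal sketch of the abstract-class rule just described, the following code implements Centered in two steps. The Centered interface is repeated so that the example is self-contained; AbstractCentered and Dot are hypothetical names that do not appear elsewhere in the chapter.

```java
interface Centered {
    void setCenter(double x, double y);
    double getCenterX();
    double getCenterY();
}

// Implements only two of the three mandatory methods,
// so the class must be declared abstract
abstract class AbstractCentered implements Centered {
    protected double cx, cy;

    public double getCenterX() { return cx; }
    public double getCenterY() { return cy; }
    // setCenter() is inherited from Centered as an abstract method
}

// A concrete subclass supplies the remaining mandatory method
class Dot extends AbstractCentered {
    public void setCenter(double x, double y) { cx = x; cy = y; }
}

class PartialImplementationDemo {
    public static void main(String[] args) {
        Centered c = new Dot();
        c.setCenter(3.0, 4.0);
        System.out.println(c.getCenterX() + ", " + c.getCenterY()); // 3.0, 4.0
    }
}
```

Removing the abstract modifier from AbstractCentered would be a compile-time error, because setCenter() would then be left without a body in a concrete class.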
The following code shows how to define a CenteredRectangle class that extends the Rectangle class from Chapter 3 and implements our Centered interface:
public class CenteredRectangle extends Rectangle implements Centered {
    // New instance fields
    private double cx, cy;
    // A constructor
    public CenteredRectangle(double cx, double cy, double w, double h) {
        super(w, h);
        this.cx = cx;
        this.cy = cy;
    }
    // We inherit all the methods of Rectangle but must
    // provide implementations of all the Centered methods.
    public void setCenter(double x, double y) { cx = x; cy = y; }
    public double getCenterX() { return cx; }
    public double getCenterY() { return cy; }
}
Suppose we implement CenteredCircle and CenteredSquare just as we have implemented this CenteredRectangle class. Each class extends Shape, so instances of the classes can be treated as instances of the Shape class, as we saw earlier. Because each class implements the Centered interface, instances can also be treated as instances of that type. The following code demonstrates how objects can be members of both a class type and an interface type:
Shape[] shapes = new Shape[3];      // Create an array to hold shapes
// Create some centered shapes, and store them in the Shape[]
// No cast necessary: these are all compatible assignments
shapes[0] = new CenteredCircle(1.0, 1.0, 1.0);
shapes[1] = new CenteredSquare(2.5, 2, 3);
shapes[2] = new CenteredRectangle(2.3, 4.5, 3, 4);
// Compute average area of the shapes and
// average distance from the origin
double totalArea = 0;
double totalDistance = 0;
for (int i = 0; i < shapes.length; i = i + 1) {
    totalArea += shapes[i].area();   // Compute the area of the shapes
    // Be careful, in general, the use of instanceof to determine the
    // runtime type of an object is quite often an indication of a
    // problem with the design
    if (shapes[i] instanceof Centered) {   // The shape is a Centered shape
        // Note the required cast from Shape to Centered (no cast would
        // be required to go from CenteredSquare to Centered, however).
        Centered c = (Centered) shapes[i];
        double cx = c.getCenterX();   // Get coordinates of the center
        double cy = c.getCenterY();
        // Compute distance from origin
        totalDistance += Math.sqrt(cx*cx + cy*cy);
    }
}
System.out.println("Average area: " + totalArea/shapes.length);
System.out.println("Average distance: " + totalDistance/shapes.length);
Note
Interfaces are data types in Java, just like classes. When a class implements an interface, instances of that class can be assigned to variables of the interface type.
Don’t interpret this example to imply that you must assign a CenteredRectangle object to a Centered variable before you can invoke the setCenter() method or to a Shape variable before invoking the area() method. Instead, because the CenteredRectangle class defines setCenter() and inherits area() from its Rectangle superclass, you can always invoke these methods.
As we could see by examining the bytecode (e.g., by using the javap tool we will meet in Chapter 13), the JVM calls the setCenter() method slightly differently depending on whether the local variable holding the shape is of the type CenteredRectangle or Centered, but this is not a distinction that matters most of the time when you’re writing Java code.
Records and Interfaces
Records, being a special case of classes, can implement interfaces, just like any other class. The body of the record must contain implementation code for all of the mandatory methods of the interface, and it may contain overriding implementations for any of the default methods of the interface.
Let’s look at an example as applied to the Point record we met in the last chapter. Given an interface defined like this:
interface Translatable {
    Translatable deltaX(double dx);
    Translatable deltaY(double dy);
    Translatable delta(double dx, double dy);
}
then we can update the Point type like this:
public record Point(double x, double y) implements Translatable {
    public Translatable deltaX(double dx) {
        return delta(dx, 0.0);
    }
    public Translatable deltaY(double dy) {
        return delta(0.0, dy);
    }
    public Translatable delta(double dx, double dy) {
        return new Point(x + dx, y + dy);
    }
}
Note that because records are immutable, it is not possible to mutate instances in-place and so, if we need a modified object, we have to create one and return it explicitly. This implies that not every interface will be suitable for implementation by a record type.
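To see the create-and-return style in action, here is a short, self-contained usage sketch. It repeats the Translatable interface and the Point record as given in this section (without the public modifier, so that everything fits in one file); the TranslateDemo class is a hypothetical driver added for illustration.

```java
interface Translatable {
    Translatable deltaX(double dx);
    Translatable deltaY(double dy);
    Translatable delta(double dx, double dy);
}

record Point(double x, double y) implements Translatable {
    public Translatable deltaX(double dx) { return delta(dx, 0.0); }
    public Translatable deltaY(double dy) { return delta(0.0, dy); }
    public Translatable delta(double dx, double dy) {
        // Records are immutable, so build and return a new instance
        return new Point(x + dx, y + dy);
    }
}

class TranslateDemo {
    public static void main(String[] args) {
        Point origin = new Point(0.0, 0.0);
        // Each call returns a fresh object, so calls can be chained
        Translatable moved = origin.deltaX(1.5).deltaY(2.0);
        System.out.println(origin); // Point[x=0.0, y=0.0]
        System.out.println(moved);  // Point[x=1.5, y=2.0]
    }
}
```

The original origin object is left untouched, which is exactly the behavior we expect from an immutable type.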
Sealed Interfaces
We met the sealed keyword in the last chapter, as applied to classes. It can also be applied to interfaces, like this:
sealed interface Rotate90 permits Circle, Rectangle {
    void clockwise();
    void antiClockwise();
}
This sealed interface represents the capability for a shape to be rotated by 90 degrees. Note that the declaration also contains a permits clause that specifies the only classes that are allowed to implement this interface (in this case just Circle and Rectangle, for simplicity). The Circle class is modified like this:
public final class Circle extends Shape implements Rotate90 {
    // ...
    @Override
    public void clockwise() {
        // No-op, circles are rotation-invariant
    }
    @Override
    public void antiClockwise() {
        // No-op, circles are rotation-invariant
    }
    // ...
}
whereas the Rectangle class has been modified like this:
public final class Rectangle extends Shape implements Rotate90 {
    // ...
    @Override
    public void clockwise() {
        // Swap width and height
        double tmp = w;
        w = h;
        h = tmp;
    }
    @Override
    public void antiClockwise() {
        // Swap width and height
        double tmp = w;
        w = h;
        h = tmp;
    }
    // ...
}
As it stands, we don’t want to deal with the complexity of allowing other shapes to have rotational behavior, so we restrict the interface so that it can only be implemented by the two simplest cases: circles and rectangles.
There is also an interesting interplay between sealed interfaces and records, which we will discuss in Chapter 5.
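The following is a cut-down, self-contained sketch of the sealed interface in use. It assumes Java 17 or later, drops the Shape superclass so that it compiles on its own, and adds hypothetical width() and height() accessors and a SealedDemo driver that are not part of the chapter's classes.

```java
// Only Circle and Rectangle may implement this interface
sealed interface Rotate90 permits Circle, Rectangle {
    void clockwise();
    void antiClockwise();
}

// Permitted implementations must be final, sealed, or non-sealed
final class Circle implements Rotate90 {
    @Override public void clockwise() { /* no-op */ }
    @Override public void antiClockwise() { /* no-op */ }
}

final class Rectangle implements Rotate90 {
    private double w, h;

    Rectangle(double w, double h) { this.w = w; this.h = h; }

    double width() { return w; }
    double height() { return h; }

    @Override public void clockwise() { swap(); }
    @Override public void antiClockwise() { swap(); }

    // A 90-degree turn in either direction swaps width and height
    private void swap() { double tmp = w; w = h; h = tmp; }
}

// Uncommenting the next line produces a compile-time error,
// because Triangle is not named in the permits clause:
// final class Triangle implements Rotate90 { }

class SealedDemo {
    public static void main(String[] args) {
        Rectangle r = new Rectangle(2.0, 3.0);
        r.clockwise();
        System.out.println(r.width() + " x " + r.height()); // 3.0 x 2.0
    }
}
```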
Default Methods
From Java 8 onward, it is possible to declare methods in interfaces that include an implementation. In this section, we’ll discuss these methods, which should be understood as optional methods in the API the interfaces represent—they’re usually called default methods. Let’s start by looking at the reasons why we need the default mechanism in the first place.
Backward compatibility
The Java platform has always been very concerned with backward compatibility. This means that code that was written (or even compiled) for an earlier version of the platform must continue to work with later releases of the platform. This principle allows development groups to have a high degree of confidence that an upgrade of their JDK or Java Runtime Environment (JRE) will not break currently working applications.
Backward compatibility is a great strength of the Java platform, but in order to achieve it, some constraints are placed on the platform. One of them is that interfaces may not have new mandatory methods added to them in a new release of the interface.
For example, let’s suppose that we want to update the Positionable interface with the ability to add a bottom-left bounding point as well:
public interface Positionable extends Centered {
    void setUpperRightCorner(double x, double y);
    double getUpperRightX();
    double getUpperRightY();
    void setLowerLeftCorner(double x, double y);
    double getLowerLeftX();
    double getLowerLeftY();
}
With this new definition, if we try to use this new interface with code developed for the old, it just won’t work, as the existing code is missing the mandatory methods setLowerLeftCorner(), getLowerLeftX(), and getLowerLeftY().
Note
You can see this effect quite easily in your own code. Compile a class file that depends on an interface. Then add a new mandatory method to the interface and try to run the program with the new version of the interface, together with your old class file. When the new method is invoked on an instance of the old class, you should see the program crash with an AbstractMethodError.
This limitation was a concern for the designers of Java 8—as one of their goals was to be able to upgrade the core Java Collections libraries and introduce methods that used lambda expressions.
To solve this problem, a new mechanism was needed, essentially to allow interfaces to evolve by allowing new methods to be added without breaking backward compatibility.
Implementation of default methods
Adding new methods to an interface without breaking backward compatibility requires providing some implementation for the older implementations of the interface so that they can continue to work. This mechanism is a default method, and it was first added to the platform in JDK 8.
Note
A default method (sometimes called an optional method) can be added to any interface. This must include an implementation, called the default implementation, which is written inline in the interface definition.
The basic behavior of a default method is:
- An implementing class may (but is not required to) implement the default method.
- If an implementing class implements the default method, then the implementation in the class is used.
- If no other implementation can be found, then the default implementation is used.
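These three rules can be sketched with a small example. The Greeter interface and both implementing classes are hypothetical names chosen for illustration only.

```java
interface Greeter {
    String name();                  // mandatory method

    default String greet() {        // optional method with a default implementation
        return "Hello, " + name();
    }
}

// Does not implement greet(), so the default implementation is used
class PlainGreeter implements Greeter {
    public String name() { return "Ada"; }
}

// Implements greet(), so the implementation in the class is used
class FormalGreeter implements Greeter {
    public String name() { return "Grace"; }

    @Override
    public String greet() { return "Good day, " + name(); }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        System.out.println(new PlainGreeter().greet());  // Hello, Ada
        System.out.println(new FormalGreeter().greet()); // Good day, Grace
    }
}
```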
An example default method is the sort() method. It was added to the interface java.util.List in JDK 8 and, in simplified form (the body in the actual JDK source differs), is defined as:
// The <E> syntax is Java's way of writing a generic type - see
// the next section for full details. If you aren't familiar with
// generics, just ignore that syntax for now.
interface List<E> {
    // Other members omitted
    public default void sort(Comparator<? super E> c) {
        Collections.<E>sort(this, c);
    }
}
Thus, from Java 8 upward, any object that implements List has an instance method sort() that can be used to sort the list using a suitable Comparator.
As the return type is void, we might expect that this is an in-place sort, and this is indeed the case.
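As a brief usage sketch, the code below calls List.sort() with a comparator built by Comparator.comparingInt(). The SortDemo class and its sortedByLength() helper are hypothetical; the helper copies the input first because sort() modifies the list it is called on.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortDemo {
    // Returns a copy of the input, sorted by string length
    static List<String> sortedByLength(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        // In-place sort of the copy, using the default method on List
        copy.sort(Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        List<String> fruit = List.of("pear", "fig", "banana");
        System.out.println(sortedByLength(fruit)); // [fig, pear, banana]
    }
}
```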
One consequence of default methods is that when implementing multiple interfaces, it’s possible that two or more interfaces may contain a default method with a completely identical name and signature.
For example:
interface Vocal {
    default void call() {
        System.out.println("Hello!");
    }
}
interface Caller {
    default void call() {
        Switchboard.placeCall(this);
    }
}
public class Person implements Vocal, Caller {
    // ... which default is used?
}
These two interfaces have very different default semantics for call() and could cause a potential implementation clash: a colliding default method.
In versions of Java prior to 8, this could not occur, as the language permitted only single inheritance of implementation.
The introduction of default methods means that Java now permits a limited form of multiple inheritance (but only of method implementations).
Java still does not permit (and has no plans to add) multiple inheritance of object state.
Default methods have a simple set of rules to help resolve any potential ambiguities:
- If a class implements multiple interfaces in such a way as to cause a potential clash of default method implementations, the implementing class must override the clashing method and provide a definition of what is to be done.
- Syntax is provided to allow the implementing class to simply call one of the interface default methods if that is what is required:
public class Person implements Vocal, Caller {
    public void call() {
        // Can do our own thing
        // or delegate to either interface
        // e.g.,
        // Vocal.super.call();
        // or
        // Caller.super.call();
    }
}
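Here is a runnable variation on the example above. To keep it self-contained and easy to check, both call() methods return a String instead of printing or using the Switchboard class; that change, and the strings themselves, are assumptions made only for this sketch.

```java
interface Vocal {
    default String call() { return "Hello!"; }
}

interface Caller {
    default String call() { return "Dialing..."; }
}

// Without the override below, this class would not compile,
// because it would inherit two unrelated defaults for call()
class Person implements Vocal, Caller {
    @Override
    public String call() {
        // Explicitly delegate to the default implementation in Vocal
        return Vocal.super.call();
    }
}

class ClashDemo {
    public static void main(String[] args) {
        System.out.println(new Person().call()); // Hello!
    }
}
```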
As a side effect of the design of default methods, there is a slight, unavoidable usage issue that may arise in the case of evolving interfaces with colliding methods.
Consider the case where a bytecode version 51.0 (Java 7) class implements two interfaces A and B with version numbers a.0 and b.0, respectively.
As defaults are not available in Java 7, this class will work correctly.
However, if at a later time either or both interfaces adopt a default implementation of a colliding method, then compile-time breakage can occur.
For example, if version a.1 introduces a default method in A, then the implementing class will pick up the implementation when run with the new version of the dependency.
If version b.1 now introduces the same method, it causes a collision:
- If B introduces the method as a mandatory (i.e., abstract) method, then the implementing class continues to work, both at compile time and at runtime.
- If B introduces the method as a default method, then this is not safe and the implementing class will fail both at compile time and at runtime.
This minor issue is very much a corner case and in practice is a very small price to pay in order to have usable default methods in the language.
When working with default methods, we should be aware that there is a slightly restricted set of operations we can perform from within a default method:
- Call another method present in the interface’s public API (whether mandatory or optional); some implementation for such methods is guaranteed to be available.
- Call a private method on the interface (Java 9 and up).
- Call a static method, whether on the interface or defined elsewhere.
- Use the this reference (e.g., as an argument to method calls).
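The operations in this list can be sketched in a few lines. The Auditable interface, the Invoice record, and the "item:" label format are all hypothetical, and the record assumes Java 16 or later.

```java
import java.util.ArrayList;
import java.util.List;

interface Auditable {
    String id();                                // mandatory method

    // Passes 'this' (the implementing instance) as a method argument
    default void register(List<Auditable> registry) {
        registry.add(this);
    }

    // Calls a mandatory method, a private method, and a static method
    default String describe() {
        return label(clean(id()));
    }

    private String clean(String raw) {          // private method (Java 9+)
        return raw.trim();
    }

    static String label(String id) {            // static method on the interface
        return "item:" + id;
    }
    // No instance fields are possible here: any state must live
    // in the implementing class, not in the interface
}

record Invoice(String id) implements Auditable {}

class AuditDemo {
    public static void main(String[] args) {
        List<Auditable> registry = new ArrayList<>();
        Invoice invoice = new Invoice(" 42 ");
        invoice.register(registry);
        System.out.println(registry.size());    // 1
        System.out.println(invoice.describe()); // item:42
    }
}
```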
The biggest takeaway from these restrictions is that even with default methods, Java interfaces still lack meaningful state; we cannot alter or store state within the interface.
Default methods have had a profound impact on the way that Java practitioners approach object-oriented programming. When combined with the rise of lambda expressions, they have upended many previous conventions of Java coding; we will discuss this in detail in the next chapter.
Marker Interfaces
Occasionally it is useful to define an interface that is entirely empty.
A class can implement this interface simply by naming it in its implements clause without having to implement any methods.
In this case, any instances of the class become valid instances of the interface as well and can be cast to the type. Java code can check whether an object is an instance of the interface using the instanceof operator, so this technique is a useful way to provide additional information about an object. It can be thought of as providing additional, auxiliary type information about a class.
Tip
Marker interfaces are much less widely used than they once were. Java’s annotations (which we shall meet presently) have largely replaced them due to their much greater flexibility at conveying extended type information.
The interface java.util.RandomAccess is an example of a marker interface: java.util.List implementations use this interface to advertise that they provide fast random access to the elements of the list.
For example, ArrayList implements RandomAccess, while LinkedList does not.
Algorithms that care about the performance of random-access operations can test for RandomAccess like this:
// Before sorting the elements of a long arbitrary list, we may want
// to make sure that the list allows fast random access. If not,
// it may be quicker to make a random-access copy of the list before
// sorting it. Note that this is not necessary when using
// java.util.Collections.sort().
List l = ...;  // Some arbitrary list we're given
if (l.size() > 2 && !(l instanceof RandomAccess)) {
    l = new ArrayList(l);
}
sortListInPlace(l);
As we will see later, Java’s type system is very tightly coupled to the names that types have—an approach called nominal typing. A marker interface is a great example of this: it has nothing at all except a name.
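A compilable sketch of the RandomAccess check might look like the following. The RandomAccessDemo class and its ensureRandomAccess() helper are hypothetical names; unlike the fragment above, the sketch uses a generic method so that no raw types are needed.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessDemo {
    // Returns the list itself if it advertises fast random access,
    // otherwise a random-access copy of it
    static <T> List<T> ensureRandomAccess(List<T> list) {
        return (list instanceof RandomAccess) ? list : new ArrayList<>(list);
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>(List.of(3, 1, 2));
        List<Integer> array = new ArrayList<>(linked);

        System.out.println(linked instanceof RandomAccess);      // false
        System.out.println(array instanceof RandomAccess);       // true
        System.out.println(ensureRandomAccess(array) == array);  // true
        System.out.println(ensureRandomAccess(linked) instanceof RandomAccess); // true
    }
}
```

The marker interface contributes no methods at all; the instanceof test is the entire protocol.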
Java Generics
One of the great strengths of the Java platform is the standard library it ships. It provides a great deal of useful functionality—and in particular robust implementations of common data structures. These implementations are relatively simple to develop with and are well documented. The libraries are known as the Java Collections, and we will spend a big chunk of Chapter 8 discussing them. For a far more complete treatment, see the book Java Generics and Collections by Maurice Naftalin and Philip Wadler (O’Reilly).
Although they were still very useful, the earliest versions of the collections had a fairly major limitation: the data structure (sometimes called the container) essentially obscured the type of the data being stored in it.
Note
Data hiding and encapsulation is a great principle of object-oriented programming, but in this case, the opaque nature of the container caused a lot of problems for the developer.
Let’s kick off the section by demonstrating the problem and showing how the introduction of generic types solved it and made life much easier for Java developers.
Introduction to Generics
If we want to build a collection of Shape instances, we can use a List to hold them, like this:
List shapes = new ArrayList();   // Create a List to hold shapes
// Create some centered shapes, and store them in the list
shapes.add(new CenteredCircle(1.0, 1.0, 1.0));
// This is legal Java, but is a very bad design choice
shapes.add(new CenteredSquare(2.5, 2, 3));
// List::get() returns Object, so to get back a
// CenteredCircle we must cast
CenteredCircle c = (CenteredCircle) shapes.get(0);
// Next line causes a runtime failure
CenteredCircle c2 = (CenteredCircle) shapes.get(1);
A problem with this code stems from the requirement to perform a cast to get the shape objects back out in a usable form: the List doesn’t know what type of objects it contains.
Not only that, but it’s actually possible to put different types of objects into the same container, and everything will work fine until an illegal cast is used and the program crashes.
What we really want is a form of List that understands what type it contains.
Then, javac could detect when an illegal argument was passed to the methods of List and cause a compilation error, rather than deferring the issue to runtime.
Note
Collections that have all elements of the same type are called homogeneous, while the collections that can have elements of potentially different types are called heterogeneous (sometimes called “mystery meat collections”).
Java provides a simple syntax to cater to homogeneous collections. To indicate that a type is a container that holds instances of another reference type, we enclose the payload type that the container holds within angle brackets:
// Create a List-of-CenteredCircle
List<CenteredCircle> shapes = new ArrayList<CenteredCircle>();
// Create some centered shapes, and store them in the list
shapes.add(new CenteredCircle(1.0, 1.0, 1.0));
// Next line will cause a compilation error
shapes.add(new CenteredSquare(2.5, 2, 3));
// List<CenteredCircle>::get() returns a CenteredCircle, no cast needed
CenteredCircle c = shapes.get(0);
This syntax ensures that a large class of unsafe code is caught by the compiler, before it gets anywhere near runtime. This is, of course, the whole point of static type systems—to use compile-time knowledge to help eliminate runtime problems wherever possible.
The resulting types, which combine an enclosing container type and a payload type, are usually called generic types, and they are declared like this:
interface Box<T> {
    void box(T t);
    T unbox();
}
This indicates that the Box interface is a general construct, which can hold any type of payload.
It isn’t really a complete interface by itself; it’s more like a general description of a whole family of interfaces, one for each type that can be used in place of T.
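To make the idea of a family of interfaces concrete, here is a minimal sketch that implements Box<T> and uses two members of the family. SimpleBox and BoxDemo are hypothetical names introduced for this example.

```java
interface Box<T> {
    void box(T t);
    T unbox();
}

// One generic implementation serves every choice of payload type
class SimpleBox<T> implements Box<T> {
    private T value;

    public void box(T t) { value = t; }

    public T unbox() {
        T t = value;
        value = null;   // the box is empty once the payload is removed
        return t;
    }
}

class BoxDemo {
    public static void main(String[] args) {
        Box<String> words = new SimpleBox<>();
        words.box("hello");
        String s = words.unbox();        // no cast needed

        Box<Integer> numbers = new SimpleBox<>();
        numbers.box(42);
        // numbers.box("oops");          // would not compile: String is not Integer
        int n = numbers.unbox();

        System.out.println(s + " " + n); // hello 42
    }
}
```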
Generic Types and Type Parameters
We’ve seen how to use a generic type to provide enhanced program safety by using compile-time knowledge to prevent simple type errors. In this section, let’s dig deeper into the properties of generic types.
The syntax <T> has a special name, type parameter, and another name for a generic type is a parameterized type. This should convey the sense that the container type (e.g., List) is parameterized by another type (the payload type). When we write a type like Map<String, Integer>, we are assigning concrete values to the type parameters.
When we define a type that has parameters, we need to do so in a way that does not make assumptions about the type parameters. So the List type is declared in a generic way as List<E>, and the type parameter E is used all the way through to stand as a placeholder for the actual type that programmers will use for the payload when they use the List data structure.
Tip
Type parameters always stand in for reference types. It is not possible to use a primitive type as a value for a type parameter.
The type parameter can be used in the signatures and bodies of methods as though it is a real type, for example:
interface List<E> extends Collection<E> {
    boolean add(E e);
    E get(int index);
    // other methods omitted
}
Note how the type parameter E can be used as a parameter for both return types and method arguments.
We don’t assume that the payload type has any specific properties and only make the basic assumption of consistency—that the type we put in is the same type that we will later get back out.
This enhancement has effectively introduced a new kind of type to Java’s type system. By combining the container type with the value of the type parameter, we are making new types.
Diamond Syntax
When we create an instance of a generic type, the righthand side of the assignment statement repeats the value of the type parameter. This is usually unnecessary, as the compiler can infer the values of the type parameters. In modern versions of Java, we can leave out the repeated type values in what is called diamond syntax.
Let’s look at an example of how to use diamond syntax, by rewriting one of our earlier examples:
// Create a List-of-CenteredCircle using diamond syntax
List<CenteredCircle> shapes = new ArrayList<>();
This is a small improvement in the verbosity of the assignment statement—we’ve managed to save a few characters of typing. We’ll return to the topic of type inference when we discuss lambda expressions later in this chapter.
Type Erasure
In “Default Methods”, we discussed the Java platform’s strong preference for backward compatibility. The addition of generics in Java 5 was another example of where backward compatibility was an issue for a new language feature.
The central question was how to make a type system that allowed older, nongeneric collection classes to be used alongside newer, generic collections. The design decision was to achieve this by the use of casts:
List someThings = getSomeThings();
// Unsafe cast, but we know that the
// contents of someThings are really strings
List<String> myStrings = (List<String>) someThings;
This means that List and List<String> are compatible as types, at least at some level.
Java achieves this compatibility by type erasure. This means that generic type parameters are only visible at compile time: they are stripped out by javac and are not reflected in the bytecode.1
Warning
The nongeneric type List is usually called a raw type. It is still perfectly legal Java to work with the raw form of types, even for types that are now generic. This is almost always a sign of poor-quality code, however.
The mechanism of type erasure gives rise to a difference in the type system seen by javac and that seen by the JVM; we will discuss this fully in “Compile and Runtime Typing”.
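One simple way to observe erasure is to compare the runtime classes of two lists that have different type parameters. The ErasureDemo class below is a hypothetical sketch written for this purpose.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Compares the runtime classes of two differently parameterized lists
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // javac treats these as different types, but at runtime
        // both objects are plain ArrayList instances
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());                  // true
        System.out.println(new ArrayList<String>().getClass());  // class java.util.ArrayList
    }
}
```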
Type erasure also prohibits some other definitions, which would otherwise seem legal. In this code, we want to count the orders as represented in two slightly different data structures:
// Won't compile
interface OrderCounter {
    // Name maps to list of order numbers
    int totalOrders(Map<String, List<String>> orders);
    // Name maps to total orders made so far
    int totalOrders(Map<String, Integer> orders);
}
This seems like perfectly legal Java code, but it will not compile. The issue is that although the two methods seem like normal overloads, after type erasure, the signature of both methods becomes:
int totalOrders(Map);
All that is left after type erasure is the raw type of the container, in this case Map. The runtime would be unable to distinguish between the methods by signature, and so the language specification makes this pair of declarations illegal.
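One common way around the clash is simply to give the two methods different names, so that their erased signatures no longer collide. The following is a sketch under that assumption; the method names totalFromOrderLists and totalFromCounts are hypothetical:

```java
import java.util.List;
import java.util.Map;

final class OrderCounterDemo {
    // Name maps to list of order numbers
    static int totalFromOrderLists(Map<String, List<String>> orders) {
        int total = 0;
        for (List<String> orderNumbers : orders.values()) {
            total += orderNumbers.size();
        }
        return total;
    }

    // Name maps to total orders made so far
    static int totalFromCounts(Map<String, Integer> orders) {
        int total = 0;
        for (int count : orders.values()) {
            total += count;
        }
        return total;
    }
}
```

After erasure both methods still take a raw Map, but because the names differ the runtime can tell them apart.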
Bounded Type Parameters
Consider a simple generic box:
public class Box<T> {
    protected T value;

    public void box(T t) {
        value = t;
    }

    public T unbox() {
        T t = value;
        value = null;
        return t;
    }
}
This is a useful abstraction, but suppose we want to have a restricted form of box that holds only numbers. Java allows us to achieve this by using a bound on the type parameter. This is the ability to restrict the types that can be used as the value of a type parameter, for example:
public class NumberBox<T extends Number> extends Box<T> {
    public int intValue() {
        return value.intValue();
    }
}
The type bound T extends Number ensures that T can only be substituted with a type that is compatible with the type Number. As a result of this, the compiler knows that value will definitely have a method intValue() available on it.
Note
Notice that because the value field has protected access, it can be accessed directly in the subclass.
If we attempt to instantiate NumberBox with an invalid value for the type parameter, the result will be a compilation error:
NumberBox<Integer> ni = new NumberBox<>(); // This compiles fine
NumberBox<Object> no = new NumberBox<>(); // Won't compile
Beginning Java programmers should avoid using raw types altogether. Even experienced Java programmers can run into problems when using them. For example, using a raw type sidesteps a type bound, but doing so leaves the code vulnerable to a runtime exception:
// Compiles
NumberBox n = new NumberBox();
// This is very dangerous
n.box(new Object());
// Runtime error
System.out.println(n.intValue());
The call to intValue() fails with a java.lang.ClassCastException, as javac has inserted an unconditional cast of value to Number before calling the method.
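The failure can be reproduced in a self-contained form. The sketch below redeclares simplified versions of Box and NumberBox as nested classes (an assumption made so that the example stands alone) and catches the exception so that the outcome can be inspected; RawTypeDemo and rawBoxFailsAtRuntime are hypothetical names:

```java
final class RawTypeDemo {
    static class Box<T> {
        protected T value;

        public void box(T t) {
            value = t;
        }
    }

    static class NumberBox<T extends Number> extends Box<T> {
        public int intValue() {
            // javac compiles this with a cast of value to Number
            return value.intValue();
        }
    }

    // Returns true if the raw-type misuse is caught only at runtime
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawBoxFailsAtRuntime() {
        NumberBox n = new NumberBox();   // raw type: the bound is not checked
        n.box(new Object());             // compiles, with an unchecked warning
        try {
            n.intValue();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

The compiler accepts every line here; the type bound only protects code that uses the parameterized form of NumberBox.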
In general, type bounds can be used to write better generic code and libraries. With practice, some fairly complex constructions can be built, for example:
public class ComparingBox<T extends Comparable<T>> extends Box<T>
        implements Comparable<ComparingBox<T>> {
    @Override
    public int compareTo(ComparingBox<T> o) {
        if (value == null)
            return o.value == null ? 0 : -1;
        return value.compareTo(o.value);
    }
}
The definition might seem daunting, but the ComparingBox is really just a Box that contains a Comparable value. The type also extends the comparison operation to the ComparingBox type itself, just by comparing the contents of the two boxes.
Introducing Covariance
The design of Java’s generics contains the solution to an old problem. In the earliest versions of Java, before the collections libraries were even introduced, the language had been forced to confront a deep-seated type system design issue.
Put simply, the question is this:
Should an array of strings be compatible with a variable of type array-of-object?
In other words, should this code be legal?
String[] words = {"Hello World!"};
Object[] objects = words;
Without this, even simple methods like Arrays::sort would have been very difficult to write in a useful way, as this would not work as expected:
Arrays.sort(Object[] a);
The method declaration would work only for the type Object[] and not for any other array type. As a result of these complications, the very first version of the Java Language Specification determined that:
If a value of type C can be assigned to a variable of type P, then a value of type C[] can be assigned to a variable of type P[].
That is, the assignment compatibility of arrays varies with the base type that they hold; in other words, arrays are covariant.
This design decision is rather unfortunate, as it leads to immediate negative consequences:
String[] words = {"Hello", "World!"};
Object[] objects = words;

// Oh, dear, runtime error
objects[0] = new Integer(42);
The assignment to objects[0] attempts to store an Integer into a piece of storage that is expecting to hold a String. This obviously will not work and will throw an ArrayStoreException.
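The behavior is easy to confirm. This minimal sketch (ArrayCovarianceDemo and storeFails are hypothetical names) performs the same assignment and reports whether the JVM rejected it:

```java
final class ArrayCovarianceDemo {
    // Returns true if storing an Integer into a String[] fails at runtime
    static boolean storeFails() {
        String[] words = {"Hello", "World!"};
        Object[] objects = words; // legal because arrays are covariant
        try {
            objects[0] = Integer.valueOf(42); // checked by the JVM on every store
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }
}
```

Note that the compiler raises no objection; the error is detected only because every array remembers its element type at runtime.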
Warning
The usefulness of covariant arrays led to them being seen as a necessary evil in the very early days of the platform, despite the hole in the static type system that the feature exposes.
However, more recent research on modern open-source codebases indicates that array covariance is extremely rarely used and is a language misfeature.2 You should avoid it when writing new code.
When considering the behavior of generics in the Java platform, a very similar question can be asked: “Is List<String> a subtype of List<Object>?” That is, can we write this:
// Is this legal?
List<Object> objects = new ArrayList<String>();
At first glance, this seems entirely reasonable: String is a subclass of Object, so we know that any String element in our collection is also a valid Object.
However, consider the following code (which is just the array covariance code translated to use List):
// Is this legal?
List<Object> objects = new ArrayList<String>();

// What do we do about this?
objects.add(new Object());
As the type of objects was declared to be List<Object>, it should be legal to add an Object instance to it. However, as the actual instance holds strings, trying to add an Object would not be compatible, and so this would fail at runtime.
This would have changed nothing from the case of arrays, and so the resolution is to realize that although this is legal:
Object o = new String("X");
that does not mean that the corresponding statement for generic container types is also true, and as a result:
// Won't compile
List<Object> objects = new ArrayList<String>();
Another way of saying this is that List<String> is not a subtype of List<Object> or that generic types are invariant, not covariant. We will have more to say about this when we discuss bounded wildcards.
Wildcards
A parameterized type, such as ArrayList<T>, is not instantiable; we cannot create instances of it. This is because <T> is just a type parameter, merely a placeholder for a genuine type. It is only when we provide a concrete value for the type parameter (e.g., ArrayList<String>) that the type becomes fully formed and we can create objects of that type.
This poses a problem if the type that we want to work with is unknown at compile time.
Fortunately, the Java type system is able to accommodate this concept.
It does so by having an explicit concept of the unknown type, which is represented as <?>. This is the simplest example of Java’s wildcard types.
We can write expressions that involve the unknown type:
ArrayList<?> mysteryList = unknownList();
Object o = mysteryList.get(0);
This is perfectly valid Java: ArrayList<?> is a complete type that a variable can have, unlike ArrayList<T>. We don’t know anything about the payload type of mysteryList, but that may not be a problem for our code.
For example, when we get an item out of mysteryList, it has a completely unknown type. However, we can be sure that the object is assignable to Object, because all valid values of a generic type parameter are reference types and all reference values can be assigned to a variable of type Object.
On the other hand, when we’re working with the unknown type, there are some limitations on its use in user code. For example, this code will not compile:
// Won't compile
mysteryList.add(new Object());
The reason for this is simple: we don’t know what the payload type of mysteryList is! For example, if mysteryList was really an instance of ArrayList<String>, then we wouldn’t expect to be able to put an Object into it.
The only value that we know we can always insert into a container is null, as we know that null is a possible value for any reference type. This isn’t that useful, and for this reason, the Java language spec also rules out instantiating a container object with the unknown type as payload, for example:
// Won't compile
List<?> unknowns = new ArrayList<?>();
The unknown type may seem to be of limited utility, but one very important use for it is as a starting point for resolving the covariance question. We can use the unknown type if we want to have a subtyping relationship for containers, like this:
// Perfectly legal
List<?> objects = new ArrayList<String>();
This means that List<String> is a subtype of List<?>, although when we use an assignment like the preceding one, we have lost some type information. For example, the return type of objects.get() is now effectively Object.
Note
For any value of the type parameter T, List<?> is not a subtype of the type List<T>.
The unknown type sometimes confuses developers, provoking questions like, “Why wouldn’t you just use Object instead of the unknown type?” However, as we’ve seen, the need to have subtyping relationships between generic types essentially requires us to have a notion of the unknown type.
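To make the subtyping point concrete, here is a small sketch of a method that accepts a list with any payload type. The names WildcardDemo and countNonNull are hypothetical; the method only reads elements as Object, which is all the unknown type permits:

```java
import java.util.List;

final class WildcardDemo {
    // Works for List<String>, List<Integer>, or any other List<X>
    static int countNonNull(List<?> items) {
        int count = 0;
        for (Object item : items) { // elements are only known to be Objects
            if (item != null) {
                count++;
            }
        }
        return count;
    }
}
```

Had the parameter been declared as List<Object>, a caller holding a List<String> could not pass it in, because generic types are invariant.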
Bounded wildcards
In fact, Java’s wildcard types extend beyond just the unknown type, with the concept of bounded wildcards.
These are used to describe the inheritance hierarchy of a mostly unknown type, effectively making statements like, for example, “I don’t know anything about this type, except that it must implement List.” This would be written as ? extends List in the type parameter. This provides a useful lifeline to programmers. Instead of being restricted to the totally unknown type, they know that at least the capabilities of the type bound are available.
Warning
The extends keyword is always used, regardless of whether the constraining type is a class or interface type.
This is an example of a concept called type variance, which is the general theory of how inheritance between container types relates to the inheritance of their payload types.
- Type covariance
-
This means that the container types have the same relationship to each other as the payload types do. This is expressed using the extends keyword.
- Type contravariance
-
This means that the container types have the inverse relationship to each other as the payload types. This is expressed using the super keyword.
These ideas tend to appear when discussing container types.
For example, if Cat extends Pet, then List<Cat> is a subtype of List<? extends Pet>, and so:
List<Cat> cats = new ArrayList<Cat>();
List<? extends Pet> pets = cats;
However, this differs from the array case, because type safety is maintained in the following way:
pets.add(new Cat()); // won't compile
pets.add(new Pet()); // won't compile
cats.add(new Cat());
The compiler cannot prove that the storage pointed at by pets is capable of storing a Cat, and so it rejects the call to add(). However, as cats definitely points at a list of Cat objects, it must be acceptable to add a new one to the list.
As a result, it is very commonplace to see these types of generic constructions with types that act as producers or consumers of payload types.
For example, when the List is acting as a producer of Pet objects, then the appropriate keyword is extends.
Pet p = pets.get(0);
Note that for the producer case, the payload type appears as the return type of the producer method.
For a container type that is acting purely as a consumer of instances of a type, we would use the super keyword, and we would expect to see the payload type as the type of a method argument.
Note
This is codified in the Producer Extends, Consumer Super (PECS) principle coined by Joshua Bloch.
As we will discuss in Chapter 8, both covariance and contravariance appear throughout the Java Collections. They largely exist to ensure that the generics just “do the right thing” and behave in a manner that should not surprise the developer.
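The producer and consumer roles can be seen side by side in a simple copy method. This is a sketch only; PecsDemo, copyAll, and demo are hypothetical names rather than methods from the Collections library:

```java
import java.util.ArrayList;
import java.util.List;

final class PecsDemo {
    // source produces T values (extends); sink consumes them (super)
    static <T> void copyAll(List<? extends T> source, List<? super T> sink) {
        for (T item : source) {
            sink.add(item);
        }
    }

    static List<Number> demo() {
        List<Integer> ints = List.of(1, 2, 3);
        List<Number> numbers = new ArrayList<>();
        copyAll(ints, numbers); // Integers flow into a list of Numbers
        return numbers;
    }
}
```

Without the two wildcards, the call in demo() would be rejected, because List<Integer> and List<Number> are unrelated types under invariance.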
Generic Methods
A generic method is a method that declares its own type parameters, and so is able to take instances of any reference type.
Let’s look at an example.
In Java, the comma is used to allow multiple declarations in a single line (usually referred to as a compound declaration).
Other languages, such as JavaScript or C, have a comma operator that is much more general.
The JS comma operator (,) evaluates both expressions provided to it (from left to right) and returns the value of the last expression.
The aim is to create a compound expression in which multiple expressions are evaluated, with the compound expression’s value being the value of the rightmost of its member expressions.
Note that any side effects from evaluating the operands of the comma operator are always triggered, unlike in a short-circuiting logic operator.
Java’s comma is much more restrictive than this, by design. This is because the comma in other languages can lead to some very hard-to-understand code and can be a fantastic source of bugs. However, if we did want to emulate the behavior of the comma operator from other languages, we could do so by creating a generic method:
// Note that this class is not generic
public class Utils {
    public static <T> T comma(T a, T b) {
        return b;
    }
}
Calling the method Utils.comma() will cause the values of the expressions a and b to be computed, and any side effects to be triggered, before the method call, which is the behavior we want.
However, notice that even though a type parameter is used in the definition of the method, the class it is defined in (Utils) is not generic.
Instead, we see that a new syntax is used to indicate that the method can be used freely, and that the return type is the same as the argument.
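The evaluation order can be checked with a small experiment. In this sketch, CommaDemo, next, and demo are hypothetical names, and comma() is a copy of the method shown above so that the example stands alone:

```java
final class CommaDemo {
    static int calls = 0;

    static <T> T comma(T a, T b) {
        return b;
    }

    // A method with a visible side effect: it counts its invocations
    static int next() {
        calls++;
        return calls * 10;
    }

    static int demo() {
        calls = 0;
        // Both arguments are evaluated, left to right, before comma() runs
        return comma(next(), next());
    }
}
```

The result of demo() is the value of the second call to next(), and the counter shows that both calls were made.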
Let’s look at another example, from the Java Collections library.
In the ArrayList class we can find a method to create a new array object from an arraylist instance:
@SuppressWarnings("unchecked")
public <T> T[] toArray(T[] a) {
    if (a.length < size)
        // Make a new array of a's runtime type, but my contents:
        return (T[]) Arrays.copyOf(elementData, size, a.getClass());
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}
This method uses the low-level arraycopy() method to do the actual work.
Note
If we look at the class definition for ArrayList we can see that it is a generic class, but the type parameter is <E>, not <T>, and the type parameter <E> does not appear at all in the definition of toArray().
The toArray() method provides one half of a bridge API between the collections and Java’s original arrays.
The other half of the API—moving from arrays to collections—involves a few additional subtleties, as we will discuss in Chapter 8.
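From the caller's side, the generic method is typically used by passing an array of the desired element type. A minimal sketch, with ToArrayDemo and namesAsArray as hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;

final class ToArrayDemo {
    static String[] namesAsArray() {
        List<String> names = new ArrayList<>(List.of("Ada", "Grace"));
        // T is inferred as String from the argument; the empty array is
        // too short, so a new String[] of the right size is allocated
        return names.toArray(new String[0]);
    }
}
```

Because the argument's runtime type drives the allocation, the returned array really is a String[] and not an Object[].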
Compile and Runtime Typing
Consider an example piece of code:
List<String> l = new ArrayList<>();
System.out.println(l);
We can ask the following question: what is the type of l? The answer to that question depends on whether we consider l at compile time (i.e., the type seen by javac) or at runtime (as seen by the JVM).
javac will see the type of l as List-of-String and will use that type information to carefully check for type errors, such as an attempted add() of an illegal type.
Conversely, the JVM will see l as an object of type ArrayList, as we could confirm by printing l.getClass(). The runtime type of l is a raw type due to type erasure.
The compile-time and runtime types are therefore slightly different from each other. The slightly strange thing is that in some ways, the runtime type is both more and less specific than the compile-time type.
The runtime type is less specific than the compile-time type, because the type information about the payload type is gone—it has been erased, and the resulting runtime type is just a raw type.
The compile-time type is less specific than the runtime type, because we don’t know exactly what concrete type l will be; all we know is that it will be of a type compatible with List.
The differences between compile-time and runtime typing sometimes confuse new Java programmers, but the distinction quickly comes to be seen as a normal part of working in the language.
Using and Designing Generic Types
When working with Java’s generics, it can be helpful to think in terms of two different levels of understanding:
- Practitioner
-
A practitioner needs to use existing generic libraries and to build some fairly simple generic classes. At this level, the developer should also understand the basics of type erasure, as several Java syntax features are confusing without at least an awareness of the runtime handling of generics.
- Designer
-
The designer of new libraries that use generics needs to understand much more of the capabilities of generics. There are some nastier parts of the spec, including a full understanding of wildcards, and advanced topics such as “capture-of” error messages.
Java generics are one of the most complex parts of the language specification with a lot of potential corner cases. Not every developer needs to fully understand this part of the language, at least not on their first encounter with this part of Java’s type system.
Enums and Annotations
We have already met records, but Java has additional specialized forms of classes and interfaces used to fulfill specific roles in the type system. They are known as enumerated types and annotation types, or normally just enums and annotations.
Enums
Enums are a variation of classes that have limited functionality and the specific semantic meaning that the type has only a small number of possible permitted values.
For example, suppose we want to define a type to represent the primary colors of red, green, and blue, and we want these to be the only possible values of the type.
We can do this by using the enum keyword:
public enum PrimaryColor {
    // The ; is not required at the end of the list of instances
    RED, GREEN, BLUE
}
The only available instances of the type PrimaryColor can then be referenced as static fields: PrimaryColor.RED, PrimaryColor.GREEN, and PrimaryColor.BLUE.
Note
In other languages, such as C++, the role of enum types is fulfilled by using constant integers, but Java’s approach provides better type safety and more flexibility.
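A short sketch of how such an enum might be used follows. The enum is redeclared as a nested type so that the example stands alone, and the hex() method and its color strings are hypothetical additions for illustration:

```java
final class EnumDemo {
    enum PrimaryColor { RED, GREEN, BLUE }

    // Maps each constant to a hypothetical display string
    static String hex(PrimaryColor color) {
        switch (color) {
            case RED:
                return "#FF0000";
            case GREEN:
                return "#00FF00";
            case BLUE:
                return "#0000FF";
            default:
                throw new IllegalArgumentException("Unknown: " + color);
        }
    }
}
```

The compiler supplies values() and valueOf() for every enum, so PrimaryColor.values() yields the three constants in declaration order and PrimaryColor.valueOf("GREEN") returns the GREEN instance.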
As enums are specialized classes, enums can have member fields and methods. If they do have a body (consisting of fields or methods), then the semicolon at the end of the list of instances is required, and the list of enum constants must precede the methods and fields.
For example, suppose that we want to have an enum that encompasses the suits of standard playing cards. We can achieve this by using an enum that takes a value as a parameter, like this:
public enum Suit {
    // ; at the end of list required for enums with parameters
    HEART('♥'),
    CLUB('♣'),
    DIAMOND('♦'),
    SPADE('♠');

    private char symbol;
    private char letter;

    public char getSymbol() {
        return symbol;
    }

    public char getLetter() {
        return letter;
    }

    private Suit(char symbol) {
        this.symbol = symbol;
        this.letter = switch (symbol) {
            case '♥' -> 'H';
            case '♣' -> 'C';
            case '♦' -> 'D';
            case '♠' -> 'S';
            default -> throw new RuntimeException("Illegal:" + symbol);
        };
    }
}
The parameters (only one of them in this example) are passed to the constructor to create the individual enum instances. As the enum instances are created by the Java runtime, and can’t be instantiated from outside, the constructor is declared as private.
Enums have some special properties:
Annotations
Annotations are a specialized kind of interface that, as the name suggests, annotate some part of a Java program.
For example, consider the @Override annotation. You may have seen it on some methods in some of the earlier examples and may have asked the following question: what does it do?
The short, and perhaps surprising, answer is that it does nothing at all.
The less short (and flippant) answer is that, like all annotations, it has no direct effect but instead acts as additional information about the method that it annotates; in this case, it denotes that a method overrides a superclass method.
This acts as a useful hint to compilers and integrated development environments (IDEs): if a developer has misspelled the name of a method intended to be an override of a superclass method, then the presence of the @Override annotation on the misspelled method (which does not override anything) alerts the compiler to the fact that something is not right.
Annotations, as originally conceived, were not supposed to alter program semantics; instead, they were to provide optional metadata. In its strictest sense, this means that they should not affect program execution and instead should only provide information for compilers and other pre-execution phases.
In practice, modern Java applications make heavy use of annotations, and this now includes many use cases that essentially render the annotated classes useless without additional runtime support.
For example, classes bearing annotations such as @Inject, @Test, or @Autowired cannot realistically be used outside of a suitable container.
As a result, it is difficult to argue that such annotations do not violate the “no semantic meaning” rule.
The platform defines a small number of basic annotations in java.lang. The original set were @Override, @Deprecated, and @SuppressWarnings, which were used to indicate that a method overrides a supertype method, that it is deprecated, or that it generates some compiler warnings that should be suppressed.
These were augmented by @SafeVarargs in Java 7 (which provides extended warning suppression for varargs methods) and @FunctionalInterface in Java 8.
This last annotation indicates an interface can be used as a target for a lambda expression; it is a useful marker annotation although not mandatory, as we will see.
Annotations have some special properties, compared to regular interfaces:
- All (implicitly) extend java.lang.annotation.Annotation
- May not be generic
- May not extend any other interface
- May only define zero-arg methods
- May not define methods that throw exceptions
- Have restrictions on the return types of methods
- Can have a default return value for methods
In practice, annotations do not typically have a great deal of functionality and instead are a fairly simple language concept.
Defining Custom Annotations
Defining custom annotation types for use in your own code is not that hard.
The @interface keyword allows the developer to define a new annotation type, in much the same way that class or interface is used.
Note
The key to writing custom annotations is the use of “meta-annotations.” These are special annotations that appear on the definition of new (custom) annotation types.
The meta-annotations are defined in java.lang.annotation and allow the developer to specify policy for where the new annotation type is to be used and how it will be treated by the compiler and runtime.
There are two primary meta-annotations that you will almost always want to supply when creating a new annotation type: @Target and @Retention. Neither is strictly required by javac, which applies defaults if they are omitted. These both take values that are represented as enums.
The @Target meta-annotation indicates where the new custom annotation can be legally placed within Java source code. The enum ElementType includes the values TYPE, FIELD, METHOD, PARAMETER, CONSTRUCTOR, LOCAL_VARIABLE, ANNOTATION_TYPE, PACKAGE, TYPE_PARAMETER, and TYPE_USE, and annotations can indicate that they intend to be used at one or more of these locations.
The other meta-annotation is @Retention, which indicates how javac and the Java runtime should process the custom annotation type. It can have one of three values, which are represented by the enum RetentionPolicy:
- SOURCE
-
Annotations with this retention policy are discarded by javac during compilation.
- CLASS
-
This means that the annotation will be present in the class file but will not necessarily be accessible at runtime by the JVM. This is rarely used but is sometimes seen in tools that do offline analysis of JVM bytecode.
- RUNTIME
-
This indicates that the annotation will be available for user code to access at runtime (by using reflection).
Let’s take a look at an example, a simple annotation called @Nickname, which allows the developer to define a nickname for a method, which can then be used to find the method reflectively at runtime:
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface Nickname {
    String[] value() default {};
}
This is all that’s required to define the annotation—a syntax element where the annotation can appear, a retention policy, and the name of the element. As we need to be able to supply the nickname we’re assigning to the method, we also need to define a method on the annotation. Despite this, defining new custom annotations is a remarkably compact undertaking.
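The reflective lookup mentioned above might be sketched as follows. The annotation is redeclared as a nested type so that the example stands alone; the Greeter class and the callByNickname helper are hypothetical names introduced purely for illustration:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;

final class NicknameDemo {
    @Target(ElementType.METHOD)
    @Retention(RetentionPolicy.RUNTIME)
    @interface Nickname {
        String[] value() default {};
    }

    // Hypothetical class with one nicknamed method
    static class Greeter {
        @Nickname("hi")
        public String greet() {
            return "Hello";
        }
    }

    // Finds a zero-arg method carrying the given nickname and invokes it
    static Object callByNickname(Object target, String nickname) {
        try {
            for (Method m : target.getClass().getDeclaredMethods()) {
                Nickname n = m.getAnnotation(Nickname.class);
                if (n != null && Arrays.asList(n.value()).contains(nickname)) {
                    return m.invoke(target);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        throw new IllegalArgumentException("No method nicknamed " + nickname);
    }
}
```

The lookup works only because the retention policy is RUNTIME; with SOURCE or CLASS retention, getAnnotation() would return null.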
In addition to the two primary meta-annotations, there are also the @Inherited and @Documented meta-annotations. These are much less frequently encountered in practice, and details on them can be found in the platform documentation.
Type Annotations
With the release of Java 8, two new values for ElementType were added: TYPE_PARAMETER and TYPE_USE. These new values allow the use of annotations in places where they were previously not legal, such as at any site where a type is used. This enables the developer to write code such as:
@NotNull String safeString = getMyString();
The extra type information conveyed by the @NotNull can then be used by a special type checker to detect problems (a possible NullPointerException, in this example) and to perform additional static analysis.
The Java 8 distribution does not itself ship a type-checking framework, but type annotations give developers and library authors the hooks they need to write (or download) pluggable type checkers.
In this section, we’ve met Java’s enum and annotation types. Let’s move on to consider the next important part of Java’s type system: lambda expressions.
Lambda Expressions
One of the most eagerly anticipated features of Java 8 was the introduction of lambda expressions (frequently referred to as just lambdas).
This major upgrade to the Java platform was driven by five goals, in roughly descending order of priority:
- More expressive programming
- Better libraries
- Concise code
- Improved programming safety
- Potentially increased data parallelism
Lambdas have three key aspects that help define the essential nature of the feature:
- They allow small bits of code to be written inline as literals in a program.
- They relax the strict grammar of Java code by using type inference.
- They facilitate a more functional style of programming Java.
As we saw in Chapter 2, the syntax for a lambda expression is to take a list of parameters (the types of which are typically inferred), and to attach that to a method body, like this:
(p, q) -> { /* method body */ }
This can provide a very compact way to represent what is effectively a single method. It is also a major departure from earlier versions of Java—until now, we always required a class declaration and then a complete method declaration, all of which add to the verboseness of the code.
In fact, before the arrival of lambdas, the only way to approximate this coding style was to use anonymous classes, which we will discuss later in this chapter. However, since Java 8, lambdas have proved to be very popular with Java programmers and now have mostly taken over the role of anonymous classes.
Note
Despite the similarities between lambda expressions and anonymous classes, lambdas are not simply syntactic sugar over anonymous classes.
In fact, lambdas are implemented using method handles (which we will meet in Chapter 11) and a special JVM bytecode called invokedynamic.
Lambda expressions represent the creation of an object of a specific type. The type of the instance that is created is known as the target type of the lambda.
Only certain types are eligible to be the target of a lambda.
Target types are also called functional interfaces and they must:
- Be interfaces
- Have only one nondefault method (but may have other methods that are default)
Some developers also like to use the single abstract method (or SAM) type to refer to the interface type that the lambda is converted into. This draws attention to the fact that to be usable by the lambda expression mechanism, an interface must have only a single nondefault method.
Note
A lambda expression has almost all of the component parts of a method, with the obvious exception that a lambda doesn’t have a name. In fact, many developers like to think of lambdas as “anonymous methods.”
As a result, this means that the single line of code:
Runnable r = () -> System.out.println("Hello");
does not result in the execution of the println() but instead creates an object, which is assigned to a variable r, of type Runnable. This object, r, will execute the println() statement, but only when r.run() is called, and not until then.
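The deferred execution can be made visible by recording events in order. In this sketch, DeferredLambdaDemo and runAndRecord are hypothetical names, and a list is used in place of println() so that the ordering can be inspected:

```java
import java.util.ArrayList;
import java.util.List;

final class DeferredLambdaDemo {
    static List<String> runAndRecord() {
        List<String> events = new ArrayList<>();
        // Creating the lambda does not execute its body
        Runnable r = () -> events.add("lambda body ran");
        events.add("lambda created");
        // The body runs only now, when run() is invoked
        r.run();
        return events;
    }
}
```

The returned list contains "lambda created" first and "lambda body ran" second, even though the lambda appears earlier in the source.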
Lambda Expression Conversion
When javac encounters a lambda expression, it interprets it as the body of a method with a specific signature, but which method?
To resolve this question, javac looks at the surrounding code. To be legal Java code, the lambda expression must satisfy the following properties:
- The lambda must appear where an instance of an interface type is expected.
- The expected interface type should have exactly one mandatory method.
- The expected interface method should have a signature that exactly matches that of the lambda expression.
If this is the case, then an instance is created of a type that implements the expected interface and uses the lambda body as the implementation for the mandatory method.
This slightly complex conversion approach comes from the desire to keep Java’s type system as purely nominative (based on names). The lambda expression is said to be converted to an instance of the correct interface type.
From this discussion, we can see that although Java 8 has added lambda expressions, they have been specifically designed to fit into Java’s existing type system—which has a very strong emphasis on nominal types (rather than the other possible sorts of types that exist in some other programming languages).
Let’s consider an example of lambda conversion: the list() method of the java.io.File class.
This method lists the files in a directory. Before it returns the list, though, it passes the name of each file to a FilenameFilter object that the programmer must supply.
This FilenameFilter object accepts or rejects each file and is a SAM type defined in the java.io package:
@FunctionalInterface
public interface FilenameFilter {
    boolean accept(File dir, String name);
}
The type FilenameFilter carries the @FunctionalInterface to indicate that it is a suitable type to be used as the target type for a lambda.
However, this annotation is not required, and any type that meets the requirements (by being an interface and a SAM type) can be used as a target type.
This is because the JDK and the existing corpus of Java code already had a huge number of SAM types available before Java 8 was released. To require potential target types to carry the annotation would have prevented lambdas from being retrofitted to existing code for no real benefit.
Tip
In code that you write, you should always try to indicate when your types are usable as target types, which you can do by adding the @FunctionalInterface to them. This aids readability and can help some automated tools as well.
Here’s how we can define a FilenameFilter to list only those files whose names end with .java, using a lambda:
File dir = new File("/src"); // The directory to list
String[] filelist = dir.list((d, fName) -> fName.endsWith(".java"));
For each file in the list, the block of code in the lambda expression is evaluated. If the method returns true (which happens if the filename ends in .java), then the file is included in the output, which ends up in the array filelist.
This pattern, where a block of code is used to test if an element of a container matches a condition, and to return only the elements that pass the condition, is called a filter idiom. It is one of the standard techniques of functional programming, which we will discuss in more depth presently.
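The same idiom can be written without touching the filesystem. The sketch below is an illustration only: FilterDemo, filter, and javaFiles are hypothetical names, and the standard Predicate interface plays the role that FilenameFilter played above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class FilterDemo {
    // Returns only those elements for which the test passes
    static <T> List<T> filter(List<T> items, Predicate<? super T> test) {
        List<T> matches = new ArrayList<>();
        for (T item : items) {
            if (test.test(item)) {
                matches.add(item);
            }
        }
        return matches;
    }

    static List<String> javaFiles(List<String> names) {
        // The lambda is converted to a Predicate<String>
        return filter(names, name -> name.endsWith(".java"));
    }
}
```

The filtering logic is written once, and each caller supplies only the condition as a lambda.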
Method References
Recall that we can think of lambda expressions as objects representing methods that don’t have names. Now, consider this lambda expression:
// In real code this would probably be
// shorter because of type inference
(MyObject myObj) -> myObj.toString()
This will be autoconverted to an implementation of a functional interface type whose single abstract method takes a single MyObject and returns a String: specifically, the string obtained by calling toString() on the instance of MyObject.
However, this seems like excessive boilerplate, and so Java 8 provides a syntax for making this easier to read and write:
MyObject::toString
This shorthand, known as a method reference, uses an existing method as a lambda expression. The method reference syntax is completely equivalent to the previous form expressed as a lambda. It can be thought of as using an existing method but ignoring the name of the method, so it can be used as a lambda and then autoconverted in the usual way. Java defines four types of method reference, which are equivalent to four slightly different lambda expression forms (see Table 4-1).
Name | Method reference | Equivalent lambda |
---|---|---|
Unbound | Trade::getPrice | trade -> trade.getPrice() |
Bound | System.out::println | s -> System.out.println(s) |
Static | | |
Constructor | | |
The form we originally introduced can be seen to be an unbound method reference.
When we use an unbound method reference, it is equivalent to a lambda that is expecting an instance of the type that contains the method reference; in Table 4-1 that is a Trade object.
It is called an unbound method reference because the receiver object needs to be supplied (as the first argument to the lambda) when the method reference is used.
That is, we are going to call getPrice() on some Trade object, but the supplier of the method reference has not defined which one. That is left up to the user of the reference.
By contrast, a bound method reference always includes the receiver as part of the instantiation of the method reference.
In Table 4-1, the receiver is System.out so, when the reference is used, the println() method will always be called on System.out, and all the parameters of the lambda will be used as method parameters to println().
We will discuss use cases for method references versus lambda expressions in more detail in the next chapter.
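The four kinds of method reference can be sketched side by side. In the code below, Trade is a hypothetical class written only for this illustration, and the choice of Integer::parseInt for the static case is our own assumption rather than the example used in the table.

```java
import java.util.function.Consumer;
import java.util.function.Function;

// Trade is a hypothetical class, defined only for this sketch.
class Trade {
    private final double price;

    Trade(double price) { this.price = price; }

    double getPrice() { return price; }
}

class MethodRefDemo {
    // Unbound: the receiver is supplied as the lambda's first argument.
    static final Function<Trade, Double> UNBOUND = Trade::getPrice;      // t -> t.getPrice()

    // Bound: the receiver (System.out) is fixed when the reference is created.
    static final Consumer<String> BOUND = System.out::println;           // s -> System.out.println(s)

    // Static: the argument is passed to a static method.
    static final Function<String, Integer> STATIC = Integer::parseInt;   // s -> Integer.parseInt(s)

    // Constructor: the argument is passed to a constructor.
    static final Function<Double, Trade> CTOR = Trade::new;              // p -> new Trade(p)

    public static void main(String[] args) {
        Trade t = CTOR.apply(42.5);
        BOUND.accept("price = " + UNBOUND.apply(t));
        BOUND.accept("parsed = " + STATIC.apply("7"));
    }
}
```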
Functional Programming
Java is fundamentally an object-oriented language. However, with the arrival of lambda expressions, it becomes much easier to write code that is closer to the functional approach.
Note
There’s no single definition of exactly what constitutes a functional language—but there is at least consensus that it should at a minimum contain the ability to represent a function as a value that can be put into a variable.
Java has always (since version 1.1) been able to represent functions via inner classes (see next section), but the syntax was complex and lacking in clarity. Lambda expressions greatly simplify that syntax, and so it is only natural that more developers will be seeking to use aspects of functional programming in their Java code.
The first taste of functional programming that Java developers are likely to encounter consists of three basic idioms that are remarkably useful:
map()
-
The map idiom is used with lists and list-like containers. The idea is that a function is passed in that is applied to each element in the collection, and a new collection is created that consists of the results of applying the function to each element in turn. This means that a map idiom converts a collection of one type to a collection of potentially a different type.
filter()
-
We have already met an example of the filter idiom, when we discussed how to replace an anonymous implementation of FilenameFilter with a lambda. The filter idiom is used for producing a new subset of a collection, based on some selection criteria. Note that in functional programming, it is normal to produce a new collection rather than modifying an existing one in place.
reduce()
-
The reduce idiom has several different guises. It is an aggregation operation, which can be called fold, accumulate, or aggregate as well as reduce. The basic idea is to take an initial value and an aggregation (or reduction) function, and apply the reduction function to each element in turn, building up a final result for the whole collection by making a series of intermediate results—similar to a “running total”—as the reduce operation traverses the collection.
Java has full support for these key functional idioms (and several others). The implementation is explained in some depth in Chapter 8, where we discuss Java’s data structures and collections, and in particular the stream abstraction, which makes all of this possible.
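As a brief preview, the three idioms can be sketched with the standard java.util.stream API. The class and method names below (IdiomsDemo, lengths, javaFiles, total) are illustrative assumptions; the stream calls themselves are standard JDK methods.

```java
import java.util.List;
import java.util.stream.Collectors;

class IdiomsDemo {
    // map: convert each String to its length, giving a list of a different type.
    static List<Integer> lengths(List<String> words) {
        return words.stream()
                    .map(String::length)
                    .collect(Collectors.toList());
    }

    // filter: keep only the elements that pass the test, in a new list.
    static List<String> javaFiles(List<String> names) {
        return names.stream()
                    .filter(n -> n.endsWith(".java"))
                    .collect(Collectors.toList());
    }

    // reduce: fold the elements into one value, starting from an initial value of 0.
    static int total(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<String> names = List.of("A.java", "notes.txt", "Main.java");
        System.out.println(javaFiles(names));
        System.out.println(lengths(names));
        System.out.println(total(lengths(names)));
    }
}
```

Note that none of the three methods modifies its input list; each produces a new result, in keeping with the functional style described above.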
Let’s conclude this introduction with some words of caution. It’s worth noting that Java is best regarded as having support for “slightly functional programming.” It is not an especially functional language, nor does it try to be. Some particular aspects of Java that militate against any claims to being a functional language include:
-
Java has no structural types, which means no “true” function types. Every lambda is automatically converted to the appropriate target type.
-
Type erasure causes problems for functional programming—type safety can be lost for higher-order functions.
-
Java is inherently mutable (as we’ll discuss in Chapter 6)—mutability is often regarded as highly undesirable for functional languages.
-
The Java collections are imperative, not functional. Collections must be converted to streams to use functional style.
Despite this, easy access to the basics of functional programming (and especially idioms such as map, filter, and reduce) is a huge step forward for the Java community. These idioms are so useful that a large majority of Java developers will never need or miss the more advanced capabilities provided by languages with a more thoroughbred functional pedigree.
In truth, many of these techniques were possible using nested types (see next section for details), via patterns like callbacks and handlers, but the syntax was always quite cumbersome, especially given that you had to explicitly define a completely new type even when you needed to express only a single line of code in the callback.
Lexical Scoping and Local Variables
A local variable is defined within a block of code that defines its scope and, outside of that scope, a local variable cannot be accessed and ceases to exist. Only code within the curly braces that define the boundaries of a block can use local variables defined in that block. This type of scoping is known as lexical scoping, and it just defines a section of source code within which a variable can be used.
It is common for programmers to think of such a scope as temporal instead—that is, to think of a local variable as existing from the time the JVM begins executing the block until the time control exits the block. This is usually a reasonable way to think about local variables and their scope. However, lambda expressions (and anonymous and local classes, which we will meet later) have the ability to bend or break this intuition.
This can cause effects that some developers initially find surprising. Because lambdas can use local variables, they can contain copies of values from lexical scopes that no longer exist. This can be seen in the following code:
public interface IntHolder {
    public int getValue();
}

public class Weird {
    public static void main(String[] args) {
        IntHolder[] holders = new IntHolder[10];
        for (int i = 0; i < 10; i++) {
            final int fi = i;
            holders[i] = () -> {
                return fi;
            };
        }
        // The lambda is now out of scope, but we have 10 valid instances
        // of the class the lambda has been converted to in our array.
        // The local variable fi is not in our scope here, but is still
        // in scope for the getValue() method of each of those 10 objects.
        // So call getValue() for each object and print it out.
        // This prints the digits 0 to 9.
        for (int i = 0; i < 10; i++) {
            System.out.println(holders[i].getValue());
        }
    }
}
Each instance of a lambda has an automatically created private copy of each of the final local variables it uses, so, in effect, it has its own private copy of the scope that existed when it was created. This is sometimes referred to as a captured variable.
Lambdas that capture variables like this are referred to as closures, and the variables are said to have been closed over.
Warning
Other programming languages may have a slightly different definition of a closure. In fact, some theorists would dispute that Java’s mechanism counts as a closure because, technically, it is the contents of the variable (a value) and not the variable itself that is captured.
In practice, the preceding closure example is more verbose than it needs to be in two separate ways:
-
The lambda has an explicit scope {} and return statement.
-
The variable fi is explicitly declared final.
The compiler javac helps with both of these.
Lambdas that return the value of only a single expression need not include a scope or return; instead, the body of the lambda is just the expression without the need for curly braces.
In our example, we have explicitly included the braces and return statement to spell out that the lambda is defining its own scope.
In early versions of Java, there were two hard requirements when closing over a variable:
-
The captures must not be modified after they have been captured (e.g., after the lambda)
-
The captured variables must be declared final
However, in recent Java versions, javac can analyze the code and detect whether the programmer attempts to modify the captured variable after the scope of the lambda.
If not, then the final qualifier on the captured variable can be omitted (such a variable is said to be effectively final).
If the final qualifier is omitted, then it is a compile-time error to attempt to modify a captured variable after the lambda’s scope.
The reason for this is that Java implements closures by copying the bit pattern of the contents of the variable into the scope created by the closure. Further changes to the contents of the closed-over variable would not be reflected in the copy contained in closure scope, so the design decision was made to make such changes illegal and a compile-time error.
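The copy semantics can be seen in a short sketch. The names below (CaptureDemo, capture, captureArray) are assumptions made for illustration. The second method shows a consequence worth keeping in mind: when the captured variable holds a reference, it is the reference that is copied, so changes to the object it points to remain visible through the lambda.

```java
import java.util.function.IntSupplier;

class CaptureDemo {
    static IntSupplier capture(int start) {
        int base = start * 2;        // effectively final: assigned once, never modified
        IntSupplier s = () -> base;  // the value of base is copied into the lambda
        // base++;                   // uncommenting this makes the capture a compile-time error
        return s;
    }

    // A captured reference is also copied, but the object it refers to can still change.
    static IntSupplier captureArray(int[] cell) {
        return () -> cell[0];
    }

    public static void main(String[] args) {
        System.out.println(capture(21).getAsInt());
        int[] cell = {1};
        IntSupplier s = captureArray(cell);
        cell[0] = 99;                // mutates the array, not the captured variable
        System.out.println(s.getAsInt());
    }
}
```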
These assists from javac mean that we can rewrite the first loop of the preceding example in a very compact form:
for (int i = 0; i < 10; i++) {
    int fi = i;
    holders[i] = () -> fi;
}
Closures are very useful in some styles of programming, and different programming languages define and implement closures in different ways. Java implements closures as lambda expressions, but local classes and anonymous classes can also capture state—and in fact this is how Java implemented closures before lambdas were available.
Nested Types
The classes, interfaces, and enum types we have seen so far in this book have all been defined as top-level types. This means that they are direct members of packages, defined independently of other types. However, type definitions can also be nested within other type definitions. These nested types, commonly known as “inner classes,” are a powerful feature of the Java language.
In general, nested types are used for two separate purposes, both related to encapsulation. First, a type may be nested because it needs especially intimate access to the internals of another type. By being a nested type, it has access in the same way that member variables and methods do. This means that nested types have privileged access and can be thought of as “slightly bending the rules of encapsulation.”
Another way of thinking about this use case of nested types is that they are types that are somehow tied together with another type. This means that they don’t really have a completely independent existence as an entity and only coexist.
Alternatively, a type may be only required for a very specific reason and in a very small section of code. This means that it should be tightly localized, as it is really part of the implementation detail.
In older versions of Java, the only way to do this was with a nested type, such as an anonymous implementation of an interface. In practice, with the advent of Java 8, this use case has substantially been taken over by lambda expressions. The use of anonymous types as closely localized types has dramatically declined as a result, although it still persists for some cases.
Types can be nested within another type in four different ways:
Static member types
-
A static member type is any type defined as a static member of another type. Nested interfaces, enums, and annotations are always static (even if you don’t use the keyword).
Nonstatic member classes
-
A “nonstatic member type” is simply a member type that is not declared static. Only classes can be nonstatic member types.
Local classes
-
A local class is a class that is defined and only visible within a block of Java code. Interfaces, enums, and annotations may not be defined locally.
Anonymous classes
-
An anonymous class is a kind of local class that has no meaningful name that is useful to humans; it is merely an arbitrary name assigned by the compiler, which programmers should not use directly. Interfaces, enums, and annotations cannot be defined anonymously.
The term “nested types,” while correct and precise, is not widely used by developers. Instead, most Java programmers use the much vaguer term “inner class.” Depending on the situation, this can refer to a nonstatic member class, local class, or anonymous class, but not a static member type, with no real way to distinguish between them.
Fortunately, although the terminology for describing nested types is not always clear, the syntax for working with them is, and it is usually apparent from context which kind of nested type is being discussed.
Note
Until Java 11, nested types were implemented using a compiler trick and were mostly syntactic sugar. Experienced Java programmers should note that this detail changed in Java 11, and it is no longer done in quite the same way as it used to be.
Let’s move on to describe each of the four kinds of nested types in greater detail. Each section describes the features of the nested type, the restrictions on its use, and any special Java syntax used with the type.
Static Member Types
A static member type is much like a regular top-level type. For convenience, however, it is nested within the namespace of another type. Static member types have the following basic properties:
-
A static member type is like the other static members of a class: static fields and static methods.
-
A static member type is not associated with any instance of the containing class (i.e., there is no this object).
-
A static member type can access (only) the static members of the class that contains it.
-
A static member type has access to all the static members (including any other static member types) of its containing type.
-
Nested interfaces, enums, and annotations are implicitly static, whether or not the static keyword appears.
-
Any type nested within an interface or annotation is also implicitly static.
-
Static member types may be defined within top-level types or nested to any depth within other static member types.
-
A static member type may not be defined within any other kind of nested type.
Let’s look at a quick example of the syntax for static member types.
Example 4-1 shows a helper interface defined as a static member of a containing interface, in this case Java’s Map.
Example 4-1. Defining and using a static member interface
public interface Map<K, V> {
    // ...
    Set<Map.Entry<K, V>> entrySet();

    // All nested interfaces are automatically static
    interface Entry<K, V> {
        K getKey();
        V getValue();
        V setValue(V value);
        // other members elided
    }
    // other members elided
}
When used by an external class, Entry will be referred to by its hierarchical name Map.Entry.
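A short usage sketch shows the hierarchical name in client code. The class EntryDemo and its describe() method are hypothetical names chosen for this example; Map, TreeMap, and Map.Entry are the standard JDK types.

```java
import java.util.Map;
import java.util.TreeMap;

class EntryDemo {
    static String describe(Map<String, Integer> counts) {
        StringBuilder sb = new StringBuilder();
        // The nested interface is named through its containing type: Map.Entry
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();  // TreeMap iterates in key order
        counts.put("b", 2);
        counts.put("a", 1);
        System.out.println(describe(counts));
    }
}
```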
Features of static member types
A static member type has access to all static members of its containing type, including private members. The reverse is true as well: the methods of the containing type have access to all members of a static member type, including the private members. A static member type even has access to all the members of any other static member types, including the private members of those types. A static member type can use any other static member without qualifying its name with the name of the containing type.
Top-level types can be declared as either public or package-private (if they’re declared without the public keyword). But declaring top-level types as private and protected wouldn’t make a great deal of sense: protected would just mean the same as package-private, and a private top-level class would be unable to be accessed by any other type.
Static member types, on the other hand, are members and so can use any access control modifiers that other members of the containing type can. These modifiers have the same meanings for static member types as they do for other members of a type.
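The two-way access to private members described above can be sketched as follows. Registry and Ticket are hypothetical names invented for this illustration.

```java
class Registry {
    private static int created = 0;        // private static member of the containing type

    static class Ticket {                  // static member type
        private final int id;              // private member of the nested type

        private Ticket() {
            id = ++created;                // nested type updates Registry's private static field
        }
    }

    static int issue() {
        Ticket t = new Ticket();           // containing type calls the nested type's private constructor
        return t.id;                       // and reads its private field
    }

    public static void main(String[] args) {
        System.out.println(issue());
        System.out.println(issue());
    }
}
```

Neither access would compile if Ticket were a separate top-level class, which is what makes nesting useful for tightly coupled helper types.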
Under most circumstances, the Outer.Inner syntax for class names provides a helpful reminder that the inner class is interconnected with its containing type.
However, the Java language does permit you to use the import directive to directly import a static member type:
import java.util.Map.Entry;
You can then reference the nested type without including the name of its enclosing type (e.g., just as Entry).
Note
You can also use the import static directive to import a static member type.
See “Packages and the Java Namespace” in Chapter 2 for details on import and import static.
However, importing a nested type obscures the fact that that type is closely associated with its containing type—which is usually important information—and as a result it is not commonly done.
Nonstatic Member Classes
A nonstatic member class is a class that is declared as a member of a containing class or enumerated type without the static keyword:
-
If a static member type is analogous to a class field or class method, a nonstatic member class is analogous to an instance field or instance method.
-
Only classes can be nonstatic member types.
-
An instance of a nonstatic member class is always associated with an instance of the enclosing type.
-
The code of a nonstatic member class has access to all the fields and methods (both static and non-static) of its enclosing type.
-
Several Java syntax features exist specifically to work with the enclosing instance of a nonstatic member class.
Example 4-2 shows how a member class can be defined and used.
This example implements a LinkedStack class: it defines a nested interface that describes the nodes of the linked list underlying the stack and a nested class to allow enumeration of the elements on the stack.
The member class defines an implementation of the java.util.Iterator interface.
Example 4-2. An iterator implemented as a member class
import java.util.Iterator;

public class LinkedStack {

    // Our static member interface
    public interface Linkable {
        public Linkable getNext();
        public void setNext(Linkable node);
    }

    // The head of the list
    private Linkable head;

    // Method bodies omitted here
    public void push(Linkable node) { ... }
    public Linkable pop() { ... }

    // This method returns an Iterator object for this LinkedStack
    public Iterator<Linkable> iterator() { return new LinkedIterator(); }

    // Here is the implementation of the Iterator interface,
    // defined as a nonstatic member class.
    protected class LinkedIterator implements Iterator<Linkable> {
        Linkable current;

        // The constructor uses a private field of the containing class
        public LinkedIterator() { current = head; }

        // The following three methods are defined
        // by the Iterator interface
        public boolean hasNext() { return current != null; }

        public Linkable next() {
            if (current == null)
                throw new java.util.NoSuchElementException();
            Linkable value = current;
            current = current.getNext();
            return value;
        }

        public void remove() { throw new UnsupportedOperationException(); }
    }
}
Notice how the LinkedIterator class is nested within the LinkedStack class. LinkedIterator is a helper class used only within LinkedStack, so having it defined close to where it is used by the containing class makes for a clean design.
Features of member classes
Like instance fields and instance methods, every instance of a nonstatic member class is associated with an instance of the class in which it is defined. This means that the code of a member class has access to all the instance fields and instance methods (as well as the static members) of the containing instance, including any that are declared private.
This crucial feature was already illustrated in Example 4-2.
Here is the LinkedStack.LinkedIterator() constructor again:
public LinkedIterator() { current = head; }
This single line of code sets the current field of the inner class to the value of the head field of the containing class. The code works as shown, even though head is declared as a private field in the containing class.
A nonstatic member class, like any member of a class, can be assigned one of the standard access control modifiers. In Example 4-2, the LinkedIterator class is declared protected, so it is inaccessible to code (in a different package) that uses the LinkedStack class but is accessible to any class that subclasses LinkedStack.
Member classes have two important restrictions:
-
A nonstatic member class cannot have the same name as any containing class or package. This is an important rule, one that is not shared by fields and methods.
-
Nonstatic member classes cannot contain any static fields, methods, or types, except for constant fields declared both static and final.
Syntax for member classes
The most important feature of a member class is that it can access the instance fields and methods in its containing object.
If we want to use explicit references, and make use of this, then we have to use a special syntax for explicitly referring to the containing instance of the this object. For example, if we want to be explicit in our constructor, we can use the following syntax:
public LinkedIterator() { this.current = LinkedStack.this.head; }
The general syntax is classname.this, where classname is the name of a containing class. Note that member classes can themselves contain member classes, nested to any depth.
However, no member class can have the same name as any containing class, so the use of the enclosing class name prepended to this is a perfectly general way to refer to any containing instance.
Another way of saying this is that the syntax construction EnclosingClass.this is an unambiguous way of referring to the containing instance as an uplevel reference.
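The explicit form matters most when a field of the member class shadows a field of the containing class. The following sketch uses the hypothetical classes Tally and Step, which are not part of the LinkedStack example, to show how the two are told apart.

```java
class Tally {
    private int value = 10;

    class Step {                      // nonstatic member class
        private int value = 1;        // shadows Tally.value inside Step

        int apply() {
            // this.value is Step's field; Tally.this.value belongs to the enclosing instance
            Tally.this.value += this.value;
            return Tally.this.value;
        }
    }

    int stepOnce() {
        return new Step().apply();    // the new Step is tied to this Tally instance
    }

    public static void main(String[] args) {
        Tally t = new Tally();
        System.out.println(t.stepOnce());
        System.out.println(t.stepOnce());
    }
}
```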
Local Classes
A local class is declared locally within a block of Java code rather than as a member of a class. Only classes may be defined locally: interfaces, enumerated types, and annotation types must be top-level or static member types. Typically, a local class is defined within a method, but it can also be defined within a static initializer or instance initializer of a class.
Just as all blocks of Java code appear within class definitions, all local classes are nested within containing blocks. For this reason, although local classes share many of the features of member classes, it is usually more appropriate to think of them as an entirely separate kind of nested type.
Note
See Chapter 5 for details as to when it’s appropriate to choose a local class versus a lambda expression.
The defining characteristic of a local class is that it is local to a block of code. Like a local variable, a local class is valid only within the scope defined by its enclosing block.
Example 4-3 illustrates how we can modify the iterator() method of the LinkedStack class so it defines LinkedIterator as a local class instead of a member class.
By doing this, we move the definition of the class even closer to where it is used and hopefully improve the clarity of the code even further.
For brevity, Example 4-3 shows only the iterator() method, not the entire LinkedStack class that contains it.
Example 4-3. Defining and using a local class
// This method returns an Iterator object for this LinkedStack
public Iterator<Linkable> iterator() {
    // Here's the definition of LinkedIterator as a local class
    class LinkedIterator implements Iterator<Linkable> {
        Linkable current;

        // The constructor uses a private field of the containing class
        public LinkedIterator() { current = head; }

        // The following three methods are defined
        // by the Iterator interface
        public boolean hasNext() { return current != null; }

        public Linkable next() {
            if (current == null)
                throw new java.util.NoSuchElementException();
            Linkable value = current;
            current = current.getNext();
            return value;
        }

        public void remove() { throw new UnsupportedOperationException(); }
    }

    // Create and return an instance of the class we just defined
    return new LinkedIterator();
}
Features of local classes
Local classes have the following interesting features:
-
Like member classes, local classes are associated with a containing instance and can access any members, including private members, of the containing class.
-
In addition to accessing fields defined by the containing class, local classes can access any local variables, method parameters, or exception parameters that are in the scope of the local method definition and are declared final or are effectively final.
Local classes are subject to the following restrictions:
-
The name of a local class is defined only within the block that defines it; it can never be used outside that block. (Note, however, that instances of a local class created within the scope of the class can continue to exist outside of that scope. This situation is described in more detail later in this section.)
-
Local classes cannot be declared public, protected, private, or static.
-
Like member classes, and for the same reasons, local classes cannot contain static fields, methods, or classes. The only exception is for constants that are declared both static and final.
-
Interfaces, enumerated types, and annotation types cannot be defined locally.
-
A local class, like a member class, cannot have the same name as any of its enclosing classes.
-
As noted earlier, a local class can close over the local variables, method parameters, and even exception parameters that are in its scope but only if those variables or parameters are effectively final.
Scope of a local class
In discussing nonstatic member classes, we saw that a member class can access any members inherited from superclasses and any members defined by their containing classes.
The same is true for local classes, but local classes can also behave like lambdas and access effectively final local variables and parameters.
Example 4-4 illustrates the different kinds of fields and variables that may be accessible to a local class (or a lambda, for that matter).
Example 4-4. Fields and variables available to a local class
class A { protected char a = 'a'; }
class B { protected char b = 'b'; }

public class C extends A {
    private char c = 'c';         // Private fields visible to local class
    public static char d = 'd';

    public void createLocalObject(final char e) {
        final char f = 'f';
        int i = 0;
        i++;                      // i is reassigned, so it is not effectively final;
                                  // not usable by local class
        class Local extends B {
            char g = 'g';
            public void printVars() {
                // All of these fields and variables are accessible to this class
                System.out.println(g);  // (this.g) g is a field of this class
                System.out.println(f);  // f is a final local variable
                System.out.println(e);  // e is a final local parameter
                System.out.println(d);  // (C.this.d) d field of containing class
                System.out.println(c);  // (C.this.c) c field of containing class
                System.out.println(b);  // b is inherited by this class
                System.out.println(a);  // a is inherited by the containing class
            }
        }
        Local l = new Local();          // Create an instance of the local class
        l.printVars();                  // and call its printVars() method.
    }
}
Local classes have quite a complex scoping structure, therefore. To see why, notice that instances of a local class can have a lifetime that extends past the time that the JVM exits the block where the local class is defined.
Note
In other words, if you create an instance of a local class, that instance does not automatically go away when the JVM finishes executing the block that defines the class. So, even though the definition of the class was local, instances of that class can escape the place they were defined.
Local classes, therefore, behave like lambdas in many regards, although the use case of local classes is more general than that of lambdas. However, in practice, the extra generality is rarely required, and lambdas are preferred wherever possible.
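The way an instance of a local class can escape its defining block can be sketched as follows. The names LocalEscapeDemo, makeCounterFrom, and Ticker are hypothetical and chosen for this illustration; the example also hints at the extra generality mentioned above, since the local class carries its own mutable field.

```java
import java.util.function.IntSupplier;

class LocalEscapeDemo {
    static IntSupplier makeCounterFrom(int start) {
        // A local class: its name is visible only inside this method.
        class Ticker implements IntSupplier {
            private int next = start;   // start is effectively final and is captured by value

            @Override
            public int getAsInt() {
                return next++;          // the instance keeps its own mutable state
            }
        }
        return new Ticker();            // the instance outlives the method call
    }

    public static void main(String[] args) {
        IntSupplier counter = makeCounterFrom(5);
        System.out.println(counter.getAsInt());
        System.out.println(counter.getAsInt());
    }
}
```

The caller sees only the IntSupplier interface; the name Ticker cannot be used outside makeCounterFrom(), even though the object it describes is still alive.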
Anonymous Classes
An anonymous class is a local class without a name. It is defined and instantiated in a single expression using the new operator.
While a local class definition is a statement in a block of Java code, an anonymous class definition is an expression, which means that it can be included as part of a larger expression, such as a method call.
Note
For the sake of completeness, we cover anonymous classes here, but for most use cases, lambda expressions (see “Lambda Expressions”) have replaced anonymous classes.
Consider Example 4-5, which shows the LinkedIterator class implemented as an anonymous class within the iterator() method of the LinkedStack class.
Compare it with Example 4-3, which shows the same class implemented as a local class.
Example 4-5. An iterator implemented with an anonymous class
public Iterator<Linkable> iterator() {
    // The anonymous class is defined as part of the return statement
    return new Iterator<Linkable>() {
        Linkable current;

        // Replace constructor with an instance initializer
        { current = head; }

        // The following three methods are defined
        // by the Iterator interface
        public boolean hasNext() { return current != null; }

        public Linkable next() {
            if (current == null)
                throw new java.util.NoSuchElementException();
            Linkable value = current;
            current = current.getNext();
            return value;
        }

        public void remove() { throw new UnsupportedOperationException(); }
    }; // Note the required semicolon. It terminates the return statement
}
As you can see, the syntax for defining an anonymous class and creating an instance of that class uses the new keyword, followed by the name of a type and a class body definition in curly braces.
If the name following the new keyword is the name of a class, the anonymous class is a subclass of the named class.
If the name following new specifies an interface, as in Example 4-5, the anonymous class implements that interface and extends Object.
Note
The syntax for anonymous classes deliberately does not include any way to specify an extends clause, an implements clause, or a name for the class.
Because an anonymous class has no name, it is not possible to define a constructor for it within the class body. This is one of the basic restrictions on anonymous classes. Any arguments you specify between the parentheses following the superclass name in an anonymous class definition are implicitly passed to the superclass constructor. Anonymous classes are commonly used to subclass simple classes that do not take any constructor arguments, so the parentheses in the anonymous class definition syntax are often empty.
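Both points, constructor arguments being forwarded to the superclass and an instance initializer standing in for a constructor, can be seen in this small sketch. AnonDemo and preloaded() are hypothetical names; ArrayList and its ArrayList(int) constructor are standard JDK.

```java
import java.util.ArrayList;
import java.util.List;

class AnonDemo {
    static List<String> preloaded() {
        // The argument 16 is passed to the ArrayList(int) superclass constructor.
        return new ArrayList<String>(16) {
            // An instance initializer stands in for the constructor we cannot declare.
            {
                add("first");
                add("second");
            }

            @Override
            public String toString() {
                return "preloaded" + super.toString();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(preloaded());
    }
}
```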
Because an anonymous class is just a type of local class, anonymous classes and local classes share the same restrictions. An anonymous class cannot define any static fields, methods, or classes, except for static final constants. Interfaces, enumerated types, and annotation types cannot be defined anonymously. Also, like local classes, anonymous classes cannot be public, private, protected, or static.
The syntax for defining an anonymous class combines definition with instantiation, similar to a lambda expression. Using an anonymous class instead of a local class is not appropriate if you need to create more than a single instance of the class each time the containing block is executed.
Describing the Java Type System
At this point, we have met all of the major aspects of the Java type system, and so it is possible for us to describe and characterize it.
The most important and obvious characteristics of Java’s type system are that it is:
- Static
- Not single-rooted
- Nominal
Static typing, which is the most widely recognized of the three aspects, means that in Java, every piece of data storage (a variable, a field, and so on) has a type, and that type is fixed when the storage is first introduced. It is a compile-time error to try to store a value in a location whose type is incompatible with it.
That Java’s type system is not single-rooted is also immediately apparent. Java has both primitive types and reference types.
Every object in Java belongs to a class, and every class, except Object, has a single parent.
This means that the set of classes in any Java program forms a tree structure with Object at the root.
However, there is no inheritance relationship between any of the primitive types and Object.
As a result, the overall graph of Java classes consists of a large tree of reference types and eight disjoint, isolated points that correspond to the primitives.
This leads to the need to use wrapper types, such as Integer, to represent primitive values as objects where necessary (such as in the Java Collections).
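A small sketch of what this means in practice: a collection cannot hold int values directly, so the compiler converts between int and Integer on our behalf. The class name BoxingDemo and its methods are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    static int sum() {
        // List<int> is not legal Java: type arguments must be reference types.
        List<Integer> values = new ArrayList<>();
        values.add(1);            // autoboxing: int -> Integer
        values.add(2);
        int total = 0;
        for (int v : values) {    // unboxing: Integer -> int
            total += v;
        }
        return total;
    }

    static boolean boxedIsInteger() {
        Object boxed = 42;        // an int can only become an Object via its wrapper
        return boxed instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(sum());
        System.out.println(boxedIsInteger());
    }
}
```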
The final aspect, though, requires a bit more of a detailed discussion.
Nominal Typing
In Java, each type has a name. In the normal course of Java programming, this will be a simple string of letters (and sometimes numbers) that has some semantic meaning that reflects the purpose of the type. This approach is known as nominal typing.
Not all languages have purely nominal typing. For example, some languages can express the idea that “this type has a method with a certain signature” without needing to explicitly refer to the name of the type. A type described in this way is sometimes known as a structural type.
For example, in Python, you can call len() on any object that defines a __len__() method.
Of course, Python is a dynamically typed language and so will throw a runtime exception if the call to len() cannot be made.
However, it is also possible to express a similar idea in statically typed languages, such as Scala.
Java, on the other hand, has no way to express this idea without using an interface, which, of course, has a name. Java also maintains type compatibility based strictly on inheritance and implementation. Let’s look at an example:
@FunctionalInterface
public interface MyRunnable {
    void run();
}
The interface MyRunnable has a single method that exactly matches that of Runnable.
However, the two interfaces have no inheritance or other relationship to each other and so code like this:
MyRunnable myR = () -> System.out.println("Hello");
Runnable r = (Runnable) myR;
r.run();
will compile cleanly but will fail with a ClassCastException at runtime.
The fact that a run() method with an identical signature exists on both interfaces is not considered, and in fact the program never even makes it to the point where run() would be called: it fails on the previous line where the cast is attempted.
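The failure can be reproduced in a self-contained form, as sketched below. The wrapper class NominalCastDemo and its method names are hypothetical, and MyRunnable is nested here only to keep the example in one file. The second method shows one possible way to bridge the two interfaces, by creating a new Runnable from a method reference rather than casting.

```java
class NominalCastDemo {
    @FunctionalInterface
    interface MyRunnable {
        void run();
    }

    // Returns true if the cast fails at runtime, as nominal typing predicts.
    static boolean castFails() {
        MyRunnable myR = () -> { };
        try {
            Runnable r = (Runnable) myR;   // compiles, but the types are unrelated
            r.run();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // A method reference creates a new object that really does implement Runnable.
    static String adapt() {
        StringBuilder out = new StringBuilder();
        MyRunnable myR = () -> out.append("Hello");
        Runnable r = myR::run;
        r.run();
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(castFails());
        System.out.println(adapt());
    }
}
```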
Another important point is that the entire construction of Java’s lambda expressions, and especially the use of target typing to a functional interface, is designed to ensure that lambdas fit into the nominal typing approach. For example, consider an interface such as:
@FunctionalInterface
public interface MyIntProvider {
    int run() throws InterruptedException;
}
With this interface, a lambda expression that yields a constant, e.g., () -> 42, can be used in a number of different ways:
MyIntProvider prov = () -> 42;
Supplier<Integer> sup = () -> 42;
Callable<Integer> callMe = () -> 42;
From this, we can see that the expression () -> 42 is, by itself, incomplete.
Java lambdas rely upon type inference, and so we need to see the expression in context with its target type for it to be meaningful.
When combined with a target type, the lambda’s class type is “an unknown-at-compile-time implementation of the target interface,” and the programmer must use the interface type as the type of the lambda.
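To make the role of the target type concrete, here is a runnable sketch that gives the same lambda text three different target types. The wrapper class TargetTypingDemo and its method names are hypothetical, and MyIntProvider is nested only to keep the example in one file.

```java
import java.util.concurrent.Callable;
import java.util.function.Supplier;

class TargetTypingDemo {
    @FunctionalInterface
    interface MyIntProvider {
        int run() throws InterruptedException;
    }

    static int total() {
        MyIntProvider prov = () -> 42;
        Supplier<Integer> sup = () -> 42;
        Callable<Integer> callMe = () -> 42;
        // Object o = () -> 42;   // does not compile: Object is not a functional interface
        try {
            return prov.run() + sup.get() + callMe.call();
        } catch (Exception e) {
            // run() and call() declare checked exceptions, so they must be handled.
            throw new IllegalStateException(e);
        }
    }

    // The same lambda text does not make a Supplier usable as a Callable.
    static boolean supplierIsCallable() {
        Supplier<Integer> sup = () -> 42;
        return sup instanceof Callable<?>;
    }

    public static void main(String[] args) {
        System.out.println(total());
        System.out.println(supplierIsCallable());
    }
}
```

Each variable is called through the single method of its own interface (run(), get(), or call()), which is the only view of the lambda that the programmer has.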
Beyond lambdas, there are some corner cases of nominal typing in Java. One example is anonymous classes, but even here the types still have names. However, the type names of anonymous types are automatically generated by the compiler and are specially chosen so as to be usable by the JVM but not accepted by the Java source code compiler.
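The generated name can be inspected at runtime, as in the following sketch. The class AnonNameDemo is a hypothetical name for this illustration, and the exact form of the generated name is a detail of the compiler. With javac it is typically the enclosing class name followed by a dollar sign and a number.

```java
class AnonNameDemo {
    static Object make() {
        return new Object() { };   // an anonymous subclass of Object
    }

    // The binary name that the JVM uses for the class.
    static String binaryName() {
        return make().getClass().getName();
    }

    // Anonymous classes have an empty simple name.
    static String simpleName() {
        return make().getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(binaryName());
        System.out.println("[" + simpleName() + "]");
    }
}
```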
There is one other corner case that we should consider, and it relates to the enhanced type inference introduced in recent Java versions.
Nondenotable Types and var
From Java 11 onwards (actually introduced in the Java 10 non-LTS release), Java developers can make use of a new language feature, Local Variable Type Inference (LVTI), otherwise known as var.
This is an enhancement to Java’s type inference capabilities that may prove to be more significant than it first appears.
In the simplest case, it allows code such as:
var ls = new ArrayList<String>();
which moves the inference from the type of values to the type of variables.
The implementation achieves this by making var a reserved type name rather than a keyword.
This means that code can still use var as a variable, method, or package name without being affected by the new syntax.
However, code that has previously used var as the name of a type will no longer compile, and the type will have to be renamed.
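A short sketch of the distinction between a reserved type name and a keyword follows. It assumes Java 10 or later, and the class name VarNameDemo is hypothetical.

```java
class VarNameDemo {
    // A method and a parameter can both still be called var.
    static int var(int var) {
        return var + 1;
    }

    static int demo() {
        // The first var is the reserved type name; the second is an ordinary variable name.
        var var = 41;
        return var(var);
    }

    // class var { }   // would not compile: var is not allowed as a type name

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```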
This simple case is designed to reduce verbosity and to make programmers coming to Java from other languages (especially Scala, .NET, and JavaScript) feel more comfortable. However, it does carry the risk that overuse will potentially obscure the intent of the code being written, so it should be used sparingly.
As well as the simple cases, var actually permits programming constructs that were not possible before.
To see the differences, let’s consider that javac has always permitted a very limited form of type inference:
public class Test {
    public static void main(String[] args) {
        (new Object() {
            public void bar() {
                System.out.println("bar!");
            }
        }).bar();
    }
}
The code will compile and run, printing out "bar!".
This slightly counterintuitive result occurs because javac preserves enough type information about the anonymous class (i.e., that it has a bar() method) for just long enough that the compiler can conclude that the call to bar() is valid.
In fact, this edge case has been known in the Java community since at least 2009, long before the arrival of Java 7.
The problem with this form of type inference is that it has no real practical applications: the type of “Object-with-a-bar-method” exists within the compiler, but the type is impossible to express as the type of a variable. In other words, it is not a denotable type. This means that before Java 10, the existence of this type was restricted to a single expression and could not be used in a larger scope.
With the arrival of LVTI, however, the type of variables does not always need to be made explicit.
Instead, we can use var to allow us to preserve the static type information by avoiding denoting the type.
This means we can now modify our example and write:
var o = new Object() {
    public void bar() {
        System.out.println("bar!");
    }
};
o.bar();
This has allowed us to preserve the true type of o beyond a single expression.
The type of o cannot be denoted, and so it cannot appear as the type of a method parameter or as a return type.
This means the type is still limited to only a single method, but it is still useful to express some constructions that would be awkward or impossible otherwise.
This use of var as a “magic type” allows the programmer to preserve type information for each distinct usage of var, in a way that is somewhat reminiscent of bounded wildcards from Java’s generics.
More advanced usages of var with nondenotable types are possible.
While the feature is not able to satisfy every criticism of Java’s type system, it does represent a definite (if cautious) step forward.
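One such usage, offered here as an illustrative sketch rather than a recommended idiom, is to use an anonymous class as a small bundle of mutable state that a lambda can update. The class VarStateDemo and the field names are hypothetical, and the example assumes Java 10 or later.

```java
import java.util.List;

class VarStateDemo {
    static String summarize(List<String> words) {
        // The type of stats is "Object with count and chars fields", which
        // cannot be written down but is preserved by var.
        var stats = new Object() {
            int count = 0;
            int chars = 0;
        };
        // stats itself is effectively final, so the lambda may capture it
        // and still mutate its fields.
        words.forEach(w -> {
            stats.count++;
            stats.chars += w.length();
        });
        return stats.count + " words, " + stats.chars + " chars";
    }

    public static void main(String[] args) {
        System.out.println(summarize(List.of("static", "nominal")));
    }
}
```

Without var, stats would have to be declared as Object, and the fields count and chars would not be accessible outside the declaring expression.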
Summary
By examining Java’s type system, we have been able to build up a clear picture of the worldview that the Java platform has about data types. Java’s type system can be characterized as:
- Static: All Java variables have types that are known at compile time.
- Nominal: The name of a Java type is of paramount importance. Java does not permit structural types and has only limited support for nondenotable types.
- Object/imperative: Java code is object-oriented, and all code must live inside methods, which must live inside classes. However, Java’s primitive types prevent full adoption of the “everything is an object” worldview.
- Slightly functional: Java provides support for some of the more common functional idioms but more as a convenience to programmers than anything else.
- Type-inferred: Java is optimized for readability (even by novice programmers) and prefers to be explicit but uses type inference to reduce boilerplate where it does not impact the legibility of the code.
- Strongly backward compatible: Java is primarily a business-focused language, and backward compatibility and protection of existing codebases are very high priorities.
- Type erased: Java permits parameterized types, but this information is not available at runtime.
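The last of these characteristics, type erasure, can be observed directly. In the sketch below (the class name ErasureDemo is hypothetical), two lists with different type arguments turn out to share a single runtime class.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        // The type arguments exist only at compile time, so both objects
        // are instances of the same class, java.util.ArrayList.
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```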
Java’s type system has evolved (albeit slowly and cautiously) over the years—and is now on par with the type systems of other mainstream programming languages. Lambda expressions, along with default methods, represent the greatest transformation since the advent of Java 5 and the introduction of generics, annotations, and related innovations.
Default methods represent a major shift in Java’s approach to object-oriented programming—perhaps the biggest since the language’s inception. From Java 8 onward, interfaces can contain implementation code. This fundamentally changes Java’s nature. Previously a single-inherited language, Java is now multiply inherited (but only for behavior—there is still no multiple inheritance of state).
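As a brief sketch of multiple inheritance of behavior, the class below implements two interfaces that each supply a default method with the same signature. All of the names (DefaultMethodDemo, Walker, Swimmer, Duck) are hypothetical and chosen only for this illustration.

```java
class DefaultMethodDemo {
    interface Walker {
        default String move() { return "walk"; }
    }

    interface Swimmer {
        default String move() { return "swim"; }
    }

    // Inheriting two conflicting defaults is a compile-time error unless
    // the class overrides the method and resolves the conflict itself.
    static class Duck implements Walker, Swimmer {
        @Override
        public String move() {
            return Walker.super.move() + " and " + Swimmer.super.move();
        }
    }

    static String demo() {
        return new Duck().move();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that neither interface can declare instance fields, which is why only behavior, and not state, is inherited from more than one parent.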
Despite all of these innovations, Java’s type system is not (and is not intended to be) equipped with the power of the type systems of languages such as Scala or Haskell. Instead, Java’s type system is strongly biased in favor of simplicity, readability, and a simple learning curve for newcomers.
Java has also benefited enormously from the approaches to types developed in other languages over the last 10 years. Scala’s example of a statically typed language that nevertheless achieves much of the feel of a dynamically typed language through the use of type inference has been a good source of ideas for features to add to Java, even though the languages have quite different design philosophies.
One remaining question is whether the modest support for functional idioms that lambda expressions provide in Java is sufficient for the majority of Java programmers.
Note
The long-term direction of Java’s type system is being explored in OpenJDK projects such as Amber, which has delivered records (data classes), pattern matching, and sealed classes, and Valhalla, which is investigating value types.
It remains to be seen whether the majority of ordinary Java programmers require the added power—and attendant complexity—that comes from an advanced (and much less nominal) type system such as Scala’s, or whether the “slightly functional programming” introduced in Java 8 (e.g., map, filter, reduce, and their peers) will suffice for most developers’ needs.