The type system of a programming language describes how its data elements (variables and constants) are associated with storage in memory and how they are related to one another. In a statically typed language, such as C or C++, the type of a data element is a simple, unchanging attribute that often corresponds directly to some underlying hardware phenomenon, such as a register or a pointer value. In a more dynamic language such as Smalltalk or Lisp, variables can be assigned arbitrary elements and can effectively change their type throughout their lifetime. A considerable amount of overhead goes into validating what happens in these languages at runtime. Scripting languages such as Perl achieve ease of use by providing drastically simplified type systems in which only certain data elements can be stored in variables, and values are unified into a common representation, such as strings.
Java combines many of the best features of both statically and dynamically typed languages. As in a statically typed language, every variable and programming element in Java has a type that is known at compile time, so the runtime system doesn’t normally have to check the validity of assignments between types while the code is executing. Unlike traditional C or C++, Java also maintains runtime information about objects and uses this to allow truly dynamic behavior. Java code may load new types at runtime and use them in fully object-oriented ways, allowing casting and full polymorphism (extending of types). Java code may also “reflect” upon or examine its own types at runtime, allowing advanced kinds of application behavior such as interpreters that can interact with compiled programs dynamically.
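As a small illustration of the runtime type information described above, the following sketch asks an object for its class at runtime. The class and method names (RuntimeTypeDemo, runtimeTypeName) are made up for this example; getClass(), getName(), and instanceof are standard Java.

```java
// Illustrative sketch only: RuntimeTypeDemo and runtimeTypeName are hypothetical names.
class RuntimeTypeDemo {
    // Returns the name of the class the object was actually created from.
    static String runtimeTypeName(Object obj) {
        return obj.getClass().getName();
    }

    public static void main(String[] args) {
        Object value = "hello";  // compile-time type of the variable is Object
        System.out.println(runtimeTypeName(value));    // java.lang.String
        System.out.println(value instanceof String);   // true
    }
}
```

The variable's declared type is fixed at compile time, but the object it refers to still carries its own type, which the program can examine while running.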
Java data types fall into two categories. Primitive types represent simple values that have built-in functionality in the language; they are fixed elements, such as literal constants and numbers. Reference types (or class types) include objects and arrays; they are called reference types because they “refer to” a large data type that is passed “by reference,” as we’ll explain shortly. Generic types are really just a kind of composition (combination) of class types and are therefore reference types as well.
Numbers, characters, and Boolean values are fundamental elements in Java. Unlike some other (perhaps more pure) object-oriented languages, they are not objects. For those situations where it’s desirable to treat a primitive value as an object, Java provides “wrapper” classes. The major advantage of treating primitive values as special is that the Java compiler and runtime can more readily optimize their implementation. Primitive values and computations can still be mapped down to hardware as they always have been in lower-level languages. Later we’ll see how Java can automatically convert between primitive values and their object wrappers as needed to partially mask the difference between the two. We’ll explain what that means in more detail in the next chapter when we discuss boxing and unboxing of primitive values.
An important portability feature of Java is that primitive types are precisely defined. For example, you never have to worry about the size of an int on a particular platform; it’s always a 32-bit, signed, two’s complement number. Table 4-2 summarizes Java’s primitive types.
Table 4-2. Java primitive data types
Type | Definition |
---|---|
boolean | true or false |
char | 16-bit, Unicode character |
byte | 8-bit, signed, two’s complement integer |
short | 16-bit, signed, two’s complement integer |
int | 32-bit, signed, two’s complement integer |
long | 64-bit, signed, two’s complement integer |
float | 32-bit, IEEE 754, floating-point value |
double | 64-bit, IEEE 754, floating-point value |
Note
Those of you with a C background may notice that the primitive types look like an idealization of C scalar types on a 32-bit machine, and you’re absolutely right. That’s how they’re supposed to look. The 16-bit characters were forced by Unicode, and ad hoc pointers were deleted for other reasons. But overall, the syntax and semantics of Java primitive types derive from C.
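One way to confirm the fixed sizes listed in Table 4-2 is to read the SIZE constants that the standard wrapper classes expose. This is only a sketch; the class and method names (PrimitiveSizesDemo, sizesInBits) are invented for the example.

```java
// Illustrative sketch only: PrimitiveSizesDemo and sizesInBits are hypothetical names.
class PrimitiveSizesDemo {
    // Sizes in bits, in the order: char, byte, short, int, long, float, double.
    static int[] sizesInBits() {
        return new int[] {
            Character.SIZE, Byte.SIZE, Short.SIZE, Integer.SIZE,
            Long.SIZE, Float.SIZE, Double.SIZE
        };
    }

    public static void main(String[] args) {
        String[] names = { "char", "byte", "short", "int", "long", "float", "double" };
        int[] sizes = sizesInBits();
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + sizes[i] + " bits");
        }
        // The range of int is the same on every platform.
        System.out.println(Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
    }
}
```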
Floating-point operations in Java follow the IEEE 754 international specification, which means that the result of floating-point calculations is normally the same on different Java platforms. However, Java allows for extended precision on platforms that support it. This can introduce extremely small-valued and arcane differences in the results of high-precision operations. Most applications would never notice this, but if you want to ensure that your application produces exactly the same results on different platforms, you can use the special keyword strictfp as a class modifier on the class containing the floating-point manipulation (we cover classes in the next chapter). The compiler then prohibits these platform-specific optimizations. (As of Java 17, all floating-point arithmetic is evaluated strictly by default, so the modifier is no longer needed on those releases.)
Variables are declared inside of methods and classes with a type name followed by one or more comma-separated variable names. For example:
    int foo;
    double d1, d2;
    boolean isFun;
Variables can optionally be initialized with an expression of the appropriate type when they are declared:
    int foo = 42;
    double d1 = 3.14, d2 = 2 * 3.14;
    boolean isFun = true;
Variables that are declared as members of a class are set to default values if they aren’t initialized (see Chapter 5). In this case, numeric types default to the appropriate flavor of zero, characters are set to the null character (\0), and Boolean variables have the value false. Local variables, which are declared inside a method and live only for the duration of a method call, on the other hand, must be explicitly initialized before they can be used. As we’ll see, the compiler enforces this rule so there is no danger of forgetting.
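The difference between member defaults and local variables can be sketched as follows. The class and field names (DefaultValuesDemo, count, ratio, letter, ready) are hypothetical and exist only for this illustration.

```java
// Illustrative sketch only: all names here are hypothetical.
class DefaultValuesDemo {
    // Members of a class receive default values when not initialized.
    int count;
    double ratio;
    char letter;
    boolean ready;

    public static void main(String[] args) {
        DefaultValuesDemo d = new DefaultValuesDemo();
        System.out.println(d.count);           // 0
        System.out.println(d.ratio);           // 0.0
        System.out.println(d.letter == '\0');  // true
        System.out.println(d.ready);           // false

        int local;
        // System.out.println(local);  // would not compile: local has not been initialized
        local = 7;
        System.out.println(local);             // 7
    }
}
```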
Integer literals can be specified in octal (base 8), decimal (base 10), or hexadecimal (base 16). A decimal integer is specified by a sequence of digits beginning with one of the characters 1–9:
    int i = 1230;
Octal numbers are distinguished from decimal numbers by a leading zero:
    int i = 01230;   // i = 664 decimal
A hexadecimal number is denoted by the leading characters 0x or 0X (zero “x”), followed by a combination of digits and the characters a–f or A–F, which represent the decimal values 10–15:
    int i = 0xFFFF;   // i = 65535 decimal
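To see that the three notations are just different spellings of ordinary int values, here is a short sketch that compares and prints them. The class and constant names are made up for the example; Integer.toOctalString and Integer.toHexString are standard library methods.

```java
// Illustrative sketch only: IntegerBasesDemo and its constants are hypothetical names.
class IntegerBasesDemo {
    static final int DECIMAL = 1230;
    static final int OCTAL = 01230;   // leading zero: octal
    static final int HEX = 0xFFFF;    // leading 0x: hexadecimal

    public static void main(String[] args) {
        System.out.println(DECIMAL);                       // 1230
        System.out.println(OCTAL);                         // 664
        System.out.println(HEX);                           // 65535
        // Converting back to the original notations:
        System.out.println(Integer.toOctalString(OCTAL));  // 1230
        System.out.println(Integer.toHexString(HEX));      // ffff
    }
}
```

The leading zero is easy to add by accident when lining up columns of numbers, so it is worth remembering that 01230 and 1230 are different values.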
Integer literals are of type int unless they are suffixed with an L, denoting that they are to be produced as a long value:
    long l = 13L;
    long l = 13;   // equivalent: 13 is converted from type int
(The lowercase letter l is also acceptable but should be avoided because it often looks like the number 1.)
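The suffix matters most when a calculation would not fit in an int. The following sketch, with hypothetical names (LongLiteralDemo, intProduct, longProduct), contrasts a multiplication carried out in int arithmetic with one carried out in long arithmetic.

```java
// Illustrative sketch only: LongLiteralDemo and its methods are hypothetical names.
class LongLiteralDemo {
    // Both operands are int, so the multiplication overflows before
    // the result is converted to long.
    static long intProduct() {
        return 1000000 * 1000000;
    }

    // One operand is a long literal, so the multiplication is done in long arithmetic.
    static long longProduct() {
        return 1000000L * 1000000;
    }

    public static void main(String[] args) {
        System.out.println(longProduct());                  // 1000000000000
        System.out.println(intProduct() == longProduct());  // false
        // long tooBig = 3000000000;   // would not compile: int literal out of range
        long big = 3000000000L;        // fine with the L suffix
        System.out.println(big);       // 3000000000
    }
}
```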
When a numeric type is used in an assignment or an expression involving a “larger” type with a greater range, it can be promoted to the bigger type. In the second line of the previous example, the number 13 has the default type of int, but it’s promoted to type long for assignment to the long variable. Certain other numeric and comparison operations also cause this kind of arithmetic promotion, as do mathematical expressions involving more than one type. For example, when multiplying a byte value by an int value, the compiler promotes the byte to an int first:
    byte b = 42;
    int i = 43;
    int result = b * i;   // b is promoted to int before multiplication
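A slightly longer sketch of arithmetic promotion follows. The names (PromotionDemo, multiply) are invented for the example. It shows that even two byte operands produce an int result, and that mixing in a long or a double widens the result further.

```java
// Illustrative sketch only: PromotionDemo and multiply are hypothetical names.
class PromotionDemo {
    // The byte argument is promoted to int before the multiplication.
    static int multiply(byte b, int i) {
        return b * i;
    }

    public static void main(String[] args) {
        System.out.println(multiply((byte) 42, 43));  // 1806

        byte x = 100, y = 100;
        int product = x * y;          // both bytes are promoted to int
        // byte bad = x * y;          // would not compile: the result is an int
        long total = product + 5L;    // int promoted to long
        double half = total / 2.0;    // long promoted to double
        System.out.println(product);  // 10000
        System.out.println(total);    // 10005
        System.out.println(half);     // 5002.5
    }
}
```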
A numeric value can never go the other way and be assigned to a type with a smaller range without an explicit cast, however:
    int i = 13;
    byte b = i;          // Compile-time error, explicit cast needed
    byte b = (byte) i;   // OK
Conversions from floating-point to integer types always require an explicit cast because of the potential loss of precision.
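A cast tells the compiler that you accept the possible loss of information, which can be seen in a short sketch. The names (NarrowingCastDemo, toByte, truncate) are hypothetical.

```java
// Illustrative sketch only: NarrowingCastDemo and its methods are hypothetical names.
class NarrowingCastDemo {
    // Keeps only the low 8 bits of the int.
    static byte toByte(int i) {
        return (byte) i;
    }

    // Discards the fractional part (rounds toward zero).
    static int truncate(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        System.out.println(toByte(13));      // 13: fits in a byte, value preserved
        System.out.println(toByte(300));     // 44: high-order bits are discarded
        System.out.println(truncate(3.99));  // 3
        System.out.println(truncate(-3.99)); // -3
    }
}
```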
Finally, we should note that if you are using Java 7 or later, you can add a bit of formatting to your numeric literals by utilizing the “_” underscore character between digits. So if you have particularly large strings of digits, you can break them up as in the following examples:
    int RICHARD_NIXONS_SSN = 567_68_0515;
    int for_no_reason = 1___2___3;
    int JAVA_ID = 0xCAFE_BABE;
Underscores may only appear between digits, not at the beginning or end of a number or next to the “L” long integer signifier.
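The underscores are purely visual: the compiler ignores them, so a literal with underscores has exactly the same value as one without. A brief sketch, using made-up names (UnderscoreLiteralDemo and its constants):

```java
// Illustrative sketch only: UnderscoreLiteralDemo and its constants are hypothetical names.
class UnderscoreLiteralDemo {
    static final int MILLION = 1_000_000;
    static final long SIXTEEN_DIGITS = 1234_5678_9012_3456L;
    static final int MASK = 0xFF_FF;

    public static void main(String[] args) {
        System.out.println(MILLION == 1000000);                    // true
        System.out.println(SIXTEEN_DIGITS == 1234567890123456L);   // true
        System.out.println(MASK);                                  // 65535
        // int bad1 = 1000_;    // would not compile: underscore at the end
        // long bad2 = 1000_L;  // would not compile: underscore next to the L suffix
    }
}
```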
Floating-point values can be specified in decimal or scientific notation. Floating-point literals are of type double unless they are suffixed with an f or F, denoting that they are to be produced as a float value. And just as with integer literals, in Java 7 and later you may use “_” underscore characters to format floating-point numbers, but only between digits, not at the beginning, end, or next to the decimal point or “F” signifier of the number.
    double d = 8.31;
    double e = 3.00e+8;
    float f = 8.31F;
    float g = 3.00e+8F;
    float pi = 3.14_159_265_358F;   // F suffix required: an unsuffixed literal is a double
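The following sketch shows why the suffix is more than a formality: a float and a double hold different approximations of the same decimal number. The class and constant names are hypothetical.

```java
// Illustrative sketch only: FloatLiteralDemo and its constants are hypothetical names.
class FloatLiteralDemo {
    static final double SPEED_OF_LIGHT = 3.00e+8;  // scientific notation, type double
    static final float AS_FLOAT = 8.31F;           // 32-bit approximation of 8.31
    static final double AS_DOUBLE = 8.31;          // 64-bit approximation of 8.31

    public static void main(String[] args) {
        System.out.println(SPEED_OF_LIGHT == 300000000.0);      // true
        // The two approximations of 8.31 are not the same value.
        System.out.println((double) AS_FLOAT == AS_DOUBLE);     // false
        // float bad = 8.31;   // would not compile: a double literal needs F or a cast
    }
}
```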
A new feature of Java 7 is the introduction of binary literal values. This allows you to write out binary values directly by prefixing the number with a “0b” or “0B” (zero B).
    byte one = (byte) 0b00000001;
    byte two = (byte) 0b00000010;
    byte four = (byte) 0b00000100;
    byte sixteen = (byte) 0b00010000;
    int cafebabe = 0b11001010111111101011101010111110;
    long lots_o_ones = (long) 0b11111111111111111111111111111111111111111111111L;
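Binary literals are most useful where individual bits carry meaning, such as flag sets. The sketch below uses invented names (BinaryLiteralDemo, READ, WRITE, EXECUTE, has) to show one common pattern; it is an illustration rather than anything prescribed by the language.

```java
// Illustrative sketch only: the flag names and the has() helper are hypothetical.
class BinaryLiteralDemo {
    static final int READ    = 0b001;
    static final int WRITE   = 0b010;
    static final int EXECUTE = 0b100;

    // True if the given flag bit is set in flags.
    static boolean has(int flags, int flag) {
        return (flags & flag) != 0;
    }

    public static void main(String[] args) {
        int permissions = READ | WRITE;                          // 0b011
        System.out.println(has(permissions, WRITE));             // true
        System.out.println(has(permissions, EXECUTE));           // false
        System.out.println(Integer.toBinaryString(permissions)); // 11
        System.out.println(0b0001_0000);                         // 16
    }
}
```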
In an object-oriented language like Java, you create new, complex data types from simple primitives by creating a class. Each class then serves as a new type in the language. For example, if we create a new class called Foo in Java, we are also implicitly creating a new type called Foo. The type of an item governs how it’s used and where it can be assigned. As with primitives, an item of type Foo can, in general, be assigned to a variable of type Foo or passed as an argument to a method that accepts a Foo value.
A type is not just a simple attribute. Classes can have relationships with other classes and so do the types that they represent. All classes in Java exist in a parent-child hierarchy, where a child class or subclass is a specialized kind of its parent class. The corresponding types have the same relationship, where the type of the child class is considered a subtype of the parent class. Because child classes inherit all of the functionality of their parent classes, an object of the child’s type is in some sense equivalent to or an extension of the parent type. An object of the child type can be used in place of an object of the parent’s type. For example, if you create a new class, Cat, that extends Animal, the new type, Cat, is considered a subtype of Animal. Objects of type Cat can then be used anywhere an object of type Animal can be used; an object of type Cat is said to be assignable to a variable of type Animal. This is called subtype polymorphism and is one of the primary features of an object-oriented language. We’ll look more closely at classes and objects in Chapter 5.
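The Cat and Animal relationship can be sketched in a few lines. The describe() methods and the SubtypeDemo class are assumptions added for the illustration; the text specifies only that Cat extends Animal.

```java
// Illustrative sketch only: describe() and SubtypeDemo are hypothetical additions.
class SubtypeDemo {
    // Accepts any Animal, including objects of any subtype.
    static String describe(Animal animal) {
        return animal.describe();
    }

    public static void main(String[] args) {
        Animal pet = new Cat();   // a Cat is assignable to an Animal variable
        System.out.println(describe(pet));           // a cat
        System.out.println(describe(new Animal()));  // some animal
        // Cat cat = new Animal();   // would not compile: an Animal is not necessarily a Cat
    }
}

class Animal {
    String describe() {
        return "some animal";
    }
}

class Cat extends Animal {
    @Override
    String describe() {
        return "a cat";
    }
}
```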
Primitive types in Java are used and passed “by value.” In other words, when a primitive value like an int is assigned to a variable or passed as an argument to a method, it’s simply copied. Reference types (class types), on the other hand, are always accessed “by reference.” A reference is simply a handle or a name for an object. What a variable of a reference type holds is a “pointer” to an object of its type (or of a subtype, as described earlier). When the reference is assigned to a variable or passed to a method, only the reference is copied, not the object to which it’s pointing. A reference is like a pointer in C or C++, except that its type is strictly enforced. The reference value itself can’t be explicitly created or changed. A variable acquires a reference value only through assignment to an appropriate object.
Let’s run through an example. We declare a variable of type Foo, called myFoo, and assign it an appropriate object:
    Foo myFoo = new Foo();
    Foo anotherFoo = myFoo;
myFoo is a reference-type variable that holds a reference to the newly constructed Foo object. (For now, don’t worry about the details of creating an object; we’ll cover that in Chapter 5.) We declare a second Foo type variable, anotherFoo, and assign it to the same object. There are now two identical references: myFoo and anotherFoo, but only one actual Foo object instance. If we change things in the state of the Foo object itself, we see the same effect by looking at it with either reference.
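To make the shared object visible, the sketch below gives Foo a single field. The text does not define any members for Foo, so the value field and the SharedReferenceDemo wrapper are assumptions for this example.

```java
// Illustrative sketch only: the value field and SharedReferenceDemo are hypothetical.
class SharedReferenceDemo {
    static class Foo {
        int value;
    }

    // Changes the object through one reference and reads it through the other.
    static int changeThroughSecondReference() {
        Foo myFoo = new Foo();
        Foo anotherFoo = myFoo;   // copies the reference, not the object
        anotherFoo.value = 42;
        return myFoo.value;
    }

    public static void main(String[] args) {
        System.out.println(changeThroughSecondReference());  // 42

        Foo myFoo = new Foo();
        Foo anotherFoo = myFoo;
        System.out.println(myFoo == anotherFoo);   // true: same object
        System.out.println(myFoo == new Foo());    // false: a different object
    }
}
```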
Object references are passed to methods in the same way. In this case, either myFoo or anotherFoo would serve as equivalent arguments:
    myMethod(myFoo);
An important, but sometimes confusing, distinction to make at this point is that the reference itself is a value and that value is copied when it is assigned to a variable or passed in a method call. Given our previous example, the argument passed to a method (a local variable from the method’s point of view) is actually a third reference to the Foo object, in addition to myFoo and anotherFoo. The method can alter the state of the Foo object through that reference (calling its methods or altering its variables), but it can’t change the caller’s notion of the reference to myFoo: that is, the method can’t change the caller’s myFoo to point to a different Foo object; it can change only its own reference. This will be more obvious when we talk about methods later. Java differs from C++ in this respect. If you need to change a caller’s reference to an object in Java, you need an additional level of indirection. The caller would have to wrap the reference in another object so that both could share the reference to it.
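The three cases described above (altering the object, reassigning the parameter, and using a wrapper for indirection) can be sketched as follows. Everything beyond the name Foo is an assumption for the illustration: the label field, the FooHolder wrapper, and the method names.

```java
// Illustrative sketch only: label, FooHolder, and all method names are hypothetical.
class ReferencePassingDemo {
    static class Foo {
        String label = "original";
    }

    // A wrapper object that provides the extra level of indirection.
    static class FooHolder {
        Foo foo = new Foo();
    }

    // Alters the state of the caller's object through the copied reference.
    static void mutate(Foo f) {
        f.label = "changed";
    }

    // Reassigns only the method's own copy of the reference.
    static void reassign(Foo f) {
        f = new Foo();
        f.label = "replacement";
    }

    // Replaces the reference stored inside the shared holder.
    static void replace(FooHolder holder) {
        holder.foo = new Foo();
        holder.foo.label = "replacement";
    }

    static String labelAfterReassign() {
        Foo myFoo = new Foo();
        reassign(myFoo);
        return myFoo.label;
    }

    static String labelAfterMutate() {
        Foo myFoo = new Foo();
        mutate(myFoo);
        return myFoo.label;
    }

    static String labelAfterReplace() {
        FooHolder holder = new FooHolder();
        replace(holder);
        return holder.foo.label;
    }

    public static void main(String[] args) {
        System.out.println(labelAfterReassign());  // original
        System.out.println(labelAfterMutate());    // changed
        System.out.println(labelAfterReplace());   // replacement
    }
}
```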
Reference types always point to objects, and objects are always defined by classes. However, two special kinds of reference types—arrays and interfaces—specify the type of object they point to in a slightly different way.
Arrays in Java have a special place in the type system. They are a special kind of object automatically created to hold a collection of some other type of object, known as the base type. Declaring an array type reference implicitly creates the new class type designed as a container for its base type, as you’ll see in the next chapter.
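As a quick sketch of arrays being objects with their own automatically created class types, the following asks two arrays for their runtime class names. The class and method names (ArrayTypeDemo, intArrayClassName) are made up for the example.

```java
// Illustrative sketch only: ArrayTypeDemo and intArrayClassName are hypothetical names.
class ArrayTypeDemo {
    // Returns the JVM's internal name for the int[] class.
    static String intArrayClassName() {
        int[] numbers = new int[3];
        return numbers.getClass().getName();
    }

    public static void main(String[] args) {
        int[] numbers = new int[3];
        Object asObject = numbers;                       // an array is an object
        System.out.println(asObject instanceof int[]);   // true
        System.out.println(numbers.length);              // 3
        System.out.println(intArrayClassName());         // [I
        System.out.println(new String[0].getClass().getName());  // [Ljava.lang.String;
    }
}
```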
Interfaces are a bit sneakier. An interface defines a set of methods and gives it a corresponding type. An object that implements the methods of the interface can be referred to by that interface type, as well as its own type. Variables and method arguments can be declared to be of interface types, just like other class types, and any object that implements the interface can be assigned to them. This adds flexibility in the type system and allows Java to cross the lines of the class hierarchy and make objects that effectively have many types. We’ll cover interfaces in the next chapter as well.
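A minimal sketch of the idea: two otherwise unrelated classes implement the same interface and can both be handled through the interface type. The Speaker interface, the Dog and Robot classes, and the announce method are all hypothetical names chosen for this example.

```java
// Illustrative sketch only: Speaker, Dog, Robot, and announce are hypothetical names.
class InterfaceTypeDemo {
    interface Speaker {
        String speak();
    }

    static class Dog implements Speaker {
        public String speak() {
            return "Woof";
        }
    }

    static class Robot implements Speaker {
        public String speak() {
            return "Beep";
        }
    }

    // Works with any object whose class implements Speaker.
    static String announce(Speaker speaker) {
        return speaker.speak();
    }

    public static void main(String[] args) {
        Speaker first = new Dog();     // referred to by the interface type
        Speaker second = new Robot();
        System.out.println(announce(first));   // Woof
        System.out.println(announce(second));  // Beep
    }
}
```

Dog and Robot share no parent class other than Object, yet both have the type Speaker in addition to their own class type.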
Generic types or parameterized types, as we mentioned earlier, are an extension of the Java class syntax that allows for additional abstraction in the way classes work with other Java types. Generics allow for specialization of classes by the user without changing any of the original class’s code. We cover generics in detail in Chapter 8.
Strings in Java are objects; they are therefore a reference type. String objects do, however, have some special help from the Java compiler that makes them look more like primitive types. Literal string values in Java source code are turned into String objects by the compiler. They can be used directly, passed as arguments to methods, or assigned to String type variables:
    System.out.println("Hello, World...");
    String s = "I am the walrus...";
    String t = "John said: \"I am the walrus...\"";
The + symbol in Java is “overloaded” to perform string concatenation as well as regular numeric addition. Along with its sister +=, this is the only overloaded operator in Java:
    String quote = "Four score and " + "seven years ago,";
    String more = quote + " our" + " fathers" + " brought...";
Java builds a single String object from the concatenated strings and provides it as the result of the expression. We discuss the String class and all things text-related in great detail in Chapter 10.
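Because + means both addition and concatenation, the result of a mixed expression depends on the order of the operands. The following sketch, with hypothetical names (StringConcatDemo, label), shows the left-to-right evaluation.

```java
// Illustrative sketch only: StringConcatDemo and label are hypothetical names.
class StringConcatDemo {
    // Once the left operand is a String, each + concatenates.
    static String label(int a, int b) {
        return "Total: " + a + b;
    }

    public static void main(String[] args) {
        System.out.println(label(1, 2));          // Total: 12
        System.out.println(1 + 2 + " total");     // 3 total (numeric addition happens first)
        System.out.println("Sum: " + (1 + 2));    // Sum: 3 (parentheses force addition)

        String quote = "Four score";
        quote += " and seven";                    // += appends to the string
        System.out.println(quote);                // Four score and seven
    }
}
```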