Types

The type system of a programming language describes how its data elements (variables and constants) are associated with storage in memory and how they are related to one another. In a statically typed language, such as C or C++, the type of a data element is a simple, unchanging attribute that often corresponds directly to some underlying hardware phenomenon, such as a register or a pointer value. In a more dynamic language such as Smalltalk or Lisp, variables can be assigned arbitrary elements and can effectively change their type throughout their lifetime. A considerable amount of overhead goes into validating what happens in these languages at runtime. Scripting languages such as Perl achieve ease of use by providing drastically simplified type systems in which only certain data elements can be stored in variables, and values are unified into a common representation, such as strings.

Java combines many of the best features of both statically and dynamically typed languages. As in a statically typed language, every variable and programming element in Java has a type that is known at compile time, so the runtime system doesn’t normally have to check the validity of assignments between types while the code is executing. Unlike traditional C or C++, Java also maintains runtime information about objects and uses this to allow truly dynamic behavior. Java code may load new types at runtime and use them in fully object-oriented ways, allowing casting and full polymorphism (extending of types). Java code may also “reflect” upon or examine its own types at runtime, allowing advanced kinds of application behavior such as interpreters that can interact with compiled programs dynamically.
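To make the idea of runtime type information a bit more concrete, here is a minimal sketch. The class name RuntimeTypeDemo and its helper methods are our own illustrative inventions; getClass(), getMethod(), and invoke() are standard parts of the Java reflection API.

```java
import java.lang.reflect.Method;

class RuntimeTypeDemo {
    // Returns the runtime class name of an object, whatever the static type of the variable.
    static String runtimeTypeName(Object o) {
        return o.getClass().getName();
    }

    // Looks up a public no-argument method by name at runtime and invokes it.
    static Object invokeNoArg(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object o = "hello";                             // static type Object, runtime type String
        System.out.println(runtimeTypeName(o));         // java.lang.String
        System.out.println(invokeNoArg(o, "length"));   // 5
    }
}
```

The variable o is declared as an Object, yet the runtime still knows that it refers to a String and can find and call that class's methods by name.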

Java data types fall into two categories. Primitive types represent simple values that have built-in functionality in the language; they are fixed elements, such as literal constants and numbers. Reference types (or class types) include objects and arrays; they are called reference types because they “refer to” a large data type that is passed “by reference,” as we’ll explain shortly. Generic types are really just a kind of composition (combination) of class types and are therefore reference types as well.

Primitive Types

Numbers, characters, and Boolean values are fundamental elements in Java. Unlike some other (perhaps more pure) object-oriented languages, they are not objects. For those situations where it’s desirable to treat a primitive value as an object, Java provides “wrapper” classes. The major advantage of treating primitive values as special is that the Java compiler and runtime can more readily optimize their implementation. Primitive values and computations can still be mapped down to hardware as they always have been in lower-level languages. Java can also convert automatically between primitive values and their object wrappers as needed, which partially masks the difference between the two; we’ll explain this in more detail in the next chapter when we discuss boxing and unboxing of primitive values.

An important portability feature of Java is that primitive types are precisely defined. For example, you never have to worry about the size of an int on a particular platform; it’s always a 32-bit, signed, two’s complement number. Table 4-2 summarizes Java’s primitive types.

Table 4-2. Java primitive data types

Type      Definition
boolean   true or false
char      16-bit, Unicode character
byte      8-bit, signed, two’s complement integer
short     16-bit, signed, two’s complement integer
int       32-bit, signed, two’s complement integer
long      64-bit, signed, two’s complement integer
float     32-bit, IEEE 754, floating-point value
double    64-bit, IEEE 754, floating-point value
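If you'd like to check these sizes for yourself, the following sketch reads them from the SIZE constants on the standard wrapper classes. The class name PrimitiveSizes and its methods are only for illustration.

```java
import java.util.Arrays;

class PrimitiveSizes {
    // Width in bits of byte, short, char, int, long, float, and double, in that order.
    static int[] widths() {
        return new int[] {
            Byte.SIZE, Short.SIZE, Character.SIZE, Integer.SIZE,
            Long.SIZE, Float.SIZE, Double.SIZE
        };
    }

    // Two's complement integer arithmetic wraps around silently on overflow.
    static int maxPlusOne() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(widths()));           // [8, 16, 16, 32, 64, 32, 64]
        System.out.println(maxPlusOne() == Integer.MIN_VALUE);   // true
    }
}
```

These values are the same on every Java platform, which is exactly the portability guarantee described above.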

Note

Those of you with a C background may notice that the primitive types look like an idealization of C scalar types on a 32-bit machine, and you’re absolutely right. That’s how they’re supposed to look. The 16-bit characters were forced by Unicode, and ad hoc pointers were deleted for other reasons. But overall, the syntax and semantics of Java primitive types derive from C.

Floating-point precision

Floating-point operations in Java follow the IEEE 754 international specification, which means that the result of floating-point calculations is normally the same on different Java platforms. However, Java allows for extended precision on platforms that support it. This can introduce extremely small and arcane differences in the results of high-precision operations. Most applications would never notice this, but if you want to ensure that your application produces exactly the same results on different platforms, you can use the special keyword strictfp as a class modifier on the class containing the floating-point manipulation (we cover classes in the next chapter). The compiler then prohibits these platform-specific optimizations. (As of Java 17, all floating-point operations are strict by default, so the modifier is no longer needed.)

Variable declaration and initialization

Variables are declared inside of methods and classes with a type name followed by one or more comma-separated variable names. For example:

    int foo;
    double d1, d2;
    boolean isFun;

Variables can optionally be initialized with an expression of the appropriate type when they are declared:

    int foo = 42;
    double d1 = 3.14, d2 = 2 * 3.14;
    boolean isFun = true;

Variables that are declared as members of a class are set to default values if they aren’t initialized (see Chapter 5). In this case, numeric types default to the appropriate flavor of zero, characters are set to the null character (\0), and Boolean variables have the value false. Local variables, which are declared inside a method and live only for the duration of a method call, on the other hand, must be explicitly initialized before they can be used. As we’ll see, the compiler enforces this rule so there is no danger of forgetting.
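Here is a small sketch of those rules in action. The class DefaultValues and its field names are made up for this example.

```java
class DefaultValues {
    // Instance variables deliberately left uninitialized; Java supplies defaults.
    int count;
    double ratio;
    char letter;
    boolean flag;

    public static void main(String[] args) {
        DefaultValues d = new DefaultValues();
        System.out.println(d.count);            // 0
        System.out.println(d.ratio);            // 0.0
        System.out.println(d.letter == '\0');   // true
        System.out.println(d.flag);             // false

        int local;
        // System.out.println(local);   // would not compile: local has not been initialized
        local = 1;
        System.out.println(local);              // 1
    }
}
```

If you uncomment the line that reads local before it is assigned, the compiler rejects the program rather than letting it run with an undefined value.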

Integer literals

Integer literals can be specified in octal (base 8), decimal (base 10), or hexadecimal (base 16). A decimal integer is specified by a sequence of digits beginning with one of the characters 1–9:

    int i = 1230;

Octal numbers are distinguished from decimal numbers by a leading zero:

    int i = 01230;             // i = 664 decimal

A hexadecimal number is denoted by the leading characters 0x or 0X (zero “x”), followed by a combination of digits and the characters a–f or A–F, which represent the decimal values 10–15:

    int i = 0xFFFF;            // i = 65535 decimal

Integer literals are of type int unless they are suffixed with an L, denoting that they are to be produced as a long value:

    long l = 13L;
    long m = 13;       // equivalent: 13 is converted from type int

(The lowercase letter l is also acceptable but should be avoided because it often looks like the number 1.)

When a numeric type is used in an assignment or an expression involving a “larger” type with a greater range, it can be promoted to the bigger type. In the second line of the previous example, the number 13 has the default type of int, but it’s promoted to type long for assignment to the long variable. Certain other numeric and comparison operations also cause this kind of arithmetic promotion, as do mathematical expressions involving more than one type. For example, when multiplying a byte value by an int value, the compiler promotes the byte to an int first:

    byte b = 42;
    int i = 43;
    int result = b * i;  // b is promoted to int before multiplication

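Going the other way, a numeric value can’t be assigned to a type with a smaller range without an explicit cast. (The one exception is an integer constant that fits in the smaller type, which is why byte b = 42 compiles in the previous example.) For a variable, the cast is required: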

    int i = 13;
    byte b = i;          // Compile-time error, explicit cast needed
    byte b = (byte) i;   // OK

Conversions from floating-point to integer types always require an explicit cast because of the potential loss of precision.
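The following sketch pulls these conversion rules together and shows what an explicit cast actually does to a value that doesn't fit. The class and method names are ours, chosen only for illustration.

```java
class NumericConversions {
    // The byte operand is promoted to int before the multiplication takes place.
    static int multiply(byte b, int i) {
        return b * i;
    }

    // A narrowing cast keeps only the low-order 8 bits of the int.
    static byte narrow(int i) {
        return (byte) i;
    }

    // Casting a floating-point value to int discards the fractional part.
    static int truncate(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        System.out.println(multiply((byte) 42, 43));  // 1806
        System.out.println(narrow(13));               // 13
        System.out.println(narrow(300));              // 44
        System.out.println(truncate(3.99));           // 3
        System.out.println(truncate(-3.99));          // -3
    }
}
```

Note that the cast to byte doesn't complain when the value is out of range; 300 simply loses its high-order bits and becomes 44. That silent loss is the reason the compiler makes you ask for the conversion explicitly.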

Finally, we should note that if you are using Java 7 or later, you can add a bit of formatting to your numeric literals by utilizing the “_” underscore character between digits. So if you have particularly large strings of digits, you can break them up as in the following examples:

    int RICHARD_NIXONS_SSN = 567_68_0515;
    int for_no_reason = 1___2___3;
    int JAVA_ID = 0xCAFE_BABE;

Underscores may only appear between digits, not at the beginning or end of a number or next to the “L” long integer signifier.
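It may help to see that all of these notations are just different spellings of the same int values. In this sketch, the class IntegerLiterals and its constant names are hypothetical.

```java
class IntegerLiterals {
    // The same value written in three different bases.
    static final int DECIMAL = 664;
    static final int OCTAL = 01230;
    static final int HEX = 0x298;

    // Underscores affect only readability, never the value.
    static final int MILLION = 1_000_000;
    static final int MAGIC = 0xCAFE_BABE;

    public static void main(String[] args) {
        System.out.println(DECIMAL == OCTAL && OCTAL == HEX);  // true
        System.out.println(MILLION == 1000000);                // true
        System.out.println(MAGIC == 0xCAFEBABE);               // true
        System.out.println(MAGIC < 0);                         // true
    }
}
```

The last line is a reminder that int is signed: a hexadecimal literal that sets the highest of its 32 bits, such as 0xCAFEBABE, denotes a negative number.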

Floating-point literals

Floating-point values can be specified in decimal or scientific notation. Floating-point literals are of type double unless they are suffixed with an f or F denoting that they are to be produced as a float value. And just as with integer literals, in Java 7 and later you may use “_” underscore characters to format floating-point numbers, but only between digits: not at the beginning or end of the number, and not next to the decimal point or the “F” signifier.

    double d = 8.31;
    double e = 3.00e+8;
    float f = 8.31F;
    float g = 3.00e+8F;
    double pi = 3.14_159_265_358;
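The float and double versions of a literal are not always the same number, because a float keeps fewer bits of precision. Here is a small sketch of the difference; the class FloatLiterals and its method are invented for this example.

```java
class FloatLiterals {
    // Compares a float with a double; the float is promoted to double first.
    static boolean sameValue(float f, double d) {
        return f == d;
    }

    public static void main(String[] args) {
        // 8.31 has no exact binary representation, so the two approximations differ.
        System.out.println(sameValue(8.31F, 8.31));   // false
        // 0.5 is a power of two and is exact in both types.
        System.out.println(sameValue(0.5F, 0.5));     // true

        // float f = 8.31;   // would not compile: 8.31 is a double literal
        float f = (float) 8.31;                       // OK with an explicit cast
        System.out.println(f == 8.31F);               // true
    }
}
```

This is also why a literal without the F suffix can't be assigned directly to a float variable: the compiler treats the conversion from double as a narrowing that needs a cast.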

Binary literals

Java 7 introduced binary literal values. These allow you to write out binary values directly by prefixing the number with a “0b” or “0B” (zero B).

    byte one =     (byte)0b00000001;
    byte two =     (byte)0b00000010;
    byte four =    (byte)0b00000100;
    byte sixteen = (byte)0b00010000;
    int cafebabe = 0b11001010111111101011101010111110;
    long lots_o_ones = (long)0b11111111111111111111111111111111111111111111111L;
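Binary literals are most useful when you care about individual bits, and they combine nicely with underscores. The following is a sketch under our own assumptions: the class BinaryLiterals and the READ, WRITE, and EXECUTE flags are hypothetical names, not part of any Java API.

```java
class BinaryLiterals {
    // One bit per flag; the binary notation makes the layout obvious.
    static final int READ    = 0b001;
    static final int WRITE   = 0b010;
    static final int EXECUTE = 0b100;

    static final int SIXTEEN = 0b0001_0000;
    static final int CAFEBABE = 0b1100_1010_1111_1110_1011_1010_1011_1110;

    // Tests whether every bit of 'flag' is set in 'flags'.
    static boolean isSet(int flags, int flag) {
        return (flags & flag) == flag;
    }

    public static void main(String[] args) {
        int permissions = READ | WRITE;
        System.out.println(isSet(permissions, WRITE));    // true
        System.out.println(isSet(permissions, EXECUTE));  // false
        System.out.println(SIXTEEN);                      // 16
        System.out.println(CAFEBABE == 0xCAFEBABE);       // true
    }
}
```

Grouping the bits in fours with underscores lines each group up with one hexadecimal digit, which makes long binary values much easier to check by eye.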

Character literals

A literal character value can be specified either as a single-quoted character or as an escaped ASCII or Unicode sequence:

    char a = 'a';
    char newline = '\n';
    char smiley = '\u263a';
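Because a char is really a 16-bit unsigned number, you can convert it to and from an int and even do arithmetic on it. This sketch shows the idea; the class CharLiterals and its method are our own names.

```java
class CharLiterals {
    // Returns the character that follows c in Unicode order.
    static char next(char c) {
        return (char) (c + 1);   // c is promoted to int, so we cast the sum back to char
    }

    public static void main(String[] args) {
        System.out.println((int) 'a');            // 97
        System.out.println('\u0041' == 'A');      // true
        System.out.println((int) '\n');           // 10
        System.out.println((int) '\u263a');       // 9786
        System.out.println(next('a'));            // b
    }
}
```

The cast in next() is needed because adding an int to a char produces an int, and narrowing that result back to char requires an explicit cast.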

Reference Types

In an object-oriented language like Java, you create new, complex data types from simple primitives by creating a class. Each class then serves as a new type in the language. For example, if we create a new class called Foo in Java, we are also implicitly creating a new type called Foo. The type of an item governs how it’s used and where it can be assigned. As with primitives, an item of type Foo can, in general, be assigned to a variable of type Foo or passed as an argument to a method that accepts a Foo value.

A type is not just a simple attribute. Classes can have relationships with other classes and so do the types that they represent. All classes in Java exist in a parent-child hierarchy, where a child class or subclass is a specialized kind of its parent class. The corresponding types have the same relationship, where the type of the child class is considered a subtype of the parent class. Because child classes inherit all of the functionality of their parent classes, an object of the child’s type is in some sense equivalent to or an extension of the parent type. An object of the child type can be used in place of an object of the parent’s type. For example, if you create a new class, Cat, that extends Animal, the new type, Cat, is considered a subtype of Animal. Objects of type Cat can then be used anywhere an object of type Animal can be used; an object of type Cat is said to be assignable to a variable of type Animal. This is called subtype polymorphism and is one of the primary features of an object-oriented language. We’ll look more closely at classes and objects in Chapter 5.
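Here is a minimal sketch of the Animal and Cat relationship just described. The methods sound() and describe() are hypothetical; we've added them only so that there is some behavior to observe.

```java
class SubtypeDemo {
    static class Animal {
        String sound() { return "some generic noise"; }
    }

    // Cat is a subclass of Animal, so the type Cat is a subtype of Animal.
    static class Cat extends Animal {
        @Override
        String sound() { return "Meow"; }
    }

    // Accepts any Animal, including objects of any subtype.
    static String describe(Animal a) {
        return "It says: " + a.sound();
    }

    public static void main(String[] args) {
        Animal pet = new Cat();                      // a Cat is assignable to an Animal variable
        System.out.println(describe(pet));           // It says: Meow
        System.out.println(describe(new Animal()));  // It says: some generic noise
        System.out.println(pet instanceof Cat);      // true
    }
}
```

The describe() method knows only about Animal, yet it gets the Cat behavior when handed a Cat, because the object carries its real type with it at runtime.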

Primitive types in Java are used and passed “by value.” In other words, when a primitive value like an int is assigned to a variable or passed as an argument to a method, it’s simply copied. Reference types (class types), on the other hand, are always accessed “by reference.” A reference is simply a handle or a name for an object. What a variable of a reference type holds is a “pointer” to an object of its type (or of a subtype, as described earlier). When the reference is assigned to a variable or passed to a method, only the reference is copied, not the object to which it’s pointing. A reference is like a pointer in C or C++, except that its type is strictly enforced. The reference value itself can’t be explicitly created or changed. A variable acquires a reference value only through assignment to an appropriate object.

Let’s run through an example. We declare a variable of type Foo, called myFoo, and assign it an appropriate object:[7]

    Foo myFoo = new Foo();
    Foo anotherFoo = myFoo;

myFoo is a reference-type variable that holds a reference to the newly constructed Foo object. (For now, don’t worry about the details of creating an object; we’ll cover that in Chapter 5.) We declare a second Foo type variable, anotherFoo, and assign it to the same object. There are now two identical references, myFoo and anotherFoo, but only one actual Foo object instance. If we change things in the state of the Foo object itself, we see the same effect by looking at it with either reference.
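To watch this happen, we need a Foo with some state to change. In the sketch below, the value field is an assumption of ours; the text doesn't say what a Foo actually contains.

```java
class AliasDemo {
    // A stand-in for Foo with one piece of state (hypothetical).
    static class Foo {
        int value;
    }

    // Changes the object through one reference and reads it back through the other.
    static int changeThroughAlias() {
        Foo myFoo = new Foo();
        Foo anotherFoo = myFoo;   // copies the reference, not the object
        anotherFoo.value = 42;
        return myFoo.value;
    }

    public static void main(String[] args) {
        System.out.println(changeThroughAlias());   // 42

        Foo a = new Foo();
        Foo b = a;
        System.out.println(a == b);                 // true: both refer to the same object
        System.out.println(a == new Foo());         // false: a different object
    }
}
```

For reference types, the == operator compares the references themselves, so it tells you whether two variables point to the very same object.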

Object references are passed to methods in the same way. In this case, either myFoo or anotherFoo would serve as equivalent arguments:

    myMethod( myFoo );

An important, but sometimes confusing, distinction to make at this point is that the reference itself is a value and that value is copied when it is assigned to a variable or passed in a method call. Given our previous example, the argument passed to a method (a local variable from the method’s point of view) is actually a third reference to the Foo object, in addition to myFoo and anotherFoo. The method can alter the state of the Foo object through that reference (calling its methods or altering its variables), but it can’t change the caller’s notion of the reference to myFoo: that is, the method can’t change the caller’s myFoo to point to a different Foo object; it can change only its own reference. This will be more obvious when we talk about methods later. Java differs from C++ in this respect. If you need to change a caller’s reference to an object in Java, you need an additional level of indirection. The caller would have to wrap the reference in another object so that both could share the reference to it.
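The following sketch illustrates all three cases: changing the object, trying to change the caller's reference, and using a wrapper object for the extra level of indirection. The Foo field and the FooHolder class are hypothetical names introduced for this example.

```java
class ReferencePassing {
    static class Foo {
        int value;
    }

    // A simple wrapper that lets caller and method share a reference (hypothetical).
    static class FooHolder {
        Foo foo;
    }

    // Alters the state of the caller's object through the copied reference.
    static void mutate(Foo f) {
        f.value = 99;
    }

    // Reassigns only the method's own copy of the reference; the caller is unaffected.
    static void reassign(Foo f) {
        f = new Foo();
        f.value = -1;
    }

    // Replaces the Foo inside the holder, which the caller can see.
    static void replace(FooHolder holder) {
        holder.foo = new Foo();
        holder.foo.value = 7;
    }

    public static void main(String[] args) {
        Foo myFoo = new Foo();
        mutate(myFoo);
        System.out.println(myFoo.value);                // 99
        reassign(myFoo);
        System.out.println(myFoo.value);                // 99 (still the original object)

        FooHolder holder = new FooHolder();
        holder.foo = myFoo;
        replace(holder);
        System.out.println(holder.foo.value);           // 7
        System.out.println(holder.foo == myFoo);        // false
    }
}
```

After the call to replace(), the holder refers to a brand-new Foo while myFoo still refers to the original one, which is as close as Java gets to letting a method change a caller's reference.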

Reference types always point to objects, and objects are always defined by classes. However, two special kinds of reference types—arrays and interfaces—specify the type of object they point to in a slightly different way.

Arrays in Java have a special place in the type system. They are a special kind of object automatically created to hold a collection of some other type of object, known as the base type. Declaring an array type reference implicitly creates the new class type designed as a container for its base type, as you’ll see in the next chapter.
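As a quick preview, this sketch shows that an array really is an object with a class of its own. The class name ArrayTypes and its method are ours; getClass(), isArray(), getComponentType(), and getSimpleName() are standard methods.

```java
class ArrayTypes {
    // Writes through one array reference and reads through another.
    static int shareArray() {
        int[] numbers = new int[3];
        int[] alias = numbers;     // copies the reference; there is still only one array
        alias[0] = 7;
        return numbers[0];
    }

    public static void main(String[] args) {
        int[] numbers = new int[3];
        Object o = numbers;        // an array can be assigned to a variable of type Object

        System.out.println(o.getClass().isArray());                         // true
        System.out.println(o.getClass().getSimpleName());                   // int[]
        System.out.println(o.getClass().getComponentType() == int.class);   // true
        System.out.println(shareArray());                                   // 7
    }
}
```

Like any other reference type, an array variable holds a reference, so assigning it to another variable does not copy the elements.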

Interfaces are a bit sneakier. An interface defines a set of methods and gives it a corresponding type. An object that implements the methods of the interface can be referred to by that interface type, as well as its own type. Variables and method arguments can be declared to be of interface types, just like other class types, and any object that implements the interface can be assigned to them. This adds flexibility in the type system and allows Java to cross the lines of the class hierarchy and make objects that effectively have many types. We’ll cover interfaces in the next chapter as well.
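Here is a minimal sketch of two unrelated classes sharing an interface type. Speaker, Dog, and Robot are hypothetical names that we've made up for the example.

```java
class InterfaceDemo {
    // The interface defines a set of methods and a type to go with it.
    interface Speaker {
        String speak();
    }

    // Dog and Robot have no common parent other than Object, but both are Speakers.
    static class Dog implements Speaker {
        public String speak() { return "Woof"; }
    }

    static class Robot implements Speaker {
        public String speak() { return "Beep"; }
    }

    // Accepts any object whose class implements Speaker.
    static String announce(Speaker s) {
        return "It says: " + s.speak();
    }

    public static void main(String[] args) {
        Speaker s = new Dog();                       // referred to by the interface type
        System.out.println(announce(s));             // It says: Woof
        System.out.println(announce(new Robot()));   // It says: Beep
    }
}
```

A Dog object has the types Dog, Speaker, and Object all at once, which is what we mean by crossing the lines of the class hierarchy.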

Generic types or parameterized types, as we mentioned earlier, are an extension of the Java class syntax that allows for additional abstraction in the way classes work with other Java types. Generics allow for specialization of classes by the user without changing any of the original class’s code. We cover generics in detail in Chapter 8.
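As a small taste of what that means, the sketch below defines a hypothetical Box class once and then uses it with two different types. Box is our own invention, not a class from the Java API.

```java
class GenericsDemo {
    // A container written once, with the element type left as a parameter T.
    static class Box<T> {
        private final T item;

        Box(T item) {
            this.item = item;
        }

        T get() {
            return item;
        }
    }

    public static void main(String[] args) {
        // The user of Box chooses the type; the Box code itself doesn't change.
        Box<String> words = new Box<>("hello");
        Box<Integer> numbers = new Box<>(42);

        String s = words.get();          // no cast needed
        int n = numbers.get();           // unboxed from Integer automatically
        System.out.println(s.length());  // 5
        System.out.println(n + 1);       // 43

        // Box<String> wrong = new Box<>(42);   // would not compile: 42 is not a String
    }
}
```

The compiler checks each use against the chosen type, so mistakes such as putting a number into a Box of String are caught before the program ever runs.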

A Word About Strings

Strings in Java are objects; they are therefore a reference type. String objects do, however, have some special help from the Java compiler that makes them look more like primitive types. Literal string values in Java source code are turned into String objects by the compiler. They can be used directly, passed as arguments to methods, or assigned to String type variables:

    System.out.println( "Hello, World..." );
    String s = "I am the walrus...";
    String t = "John said: \"I am the walrus...\"";

The + symbol in Java is “overloaded” to perform string concatenation as well as regular numeric addition. Along with its sister +=, this is the only overloaded operator in Java:

    String quote = "Four score and " + "seven years ago,";
    String more = quote + " our" + " fathers" +  " brought...";

Java builds a single String object from the concatenated strings and provides it as the result of the expression. We discuss the String class and all things text-related in great detail in Chapter 10.
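One thing to watch for is that + is evaluated from left to right, so mixing numbers and strings can give results that depend on the order of the operands. This sketch shows the effect; the class StringConcat and its method are illustrative names only.

```java
class StringConcat {
    // Builds a string with +=, which creates a new String and reassigns the variable.
    static String build() {
        String s = "Four score";
        s += " and seven";
        return s;
    }

    public static void main(String[] args) {
        System.out.println("Total: " + 1 + 2);     // Total: 12
        System.out.println(1 + 2 + " total");      // 3 total
        System.out.println("Total: " + (1 + 2));   // Total: 3
        System.out.println(build());               // Four score and seven
    }
}
```

In the first line, "Total: " + 1 is evaluated first and produces a string, so the 2 is appended as text rather than added. Parentheses, as in the third line, force the numeric addition to happen first.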



[7] The comparable code in C++ would be:

    Foo& myFoo = *(new Foo());
    Foo& anotherFoo = myFoo;
