Chapter 4. Generics

In Chapter 3, I showed how to write types and described the various kinds of members they can contain. However, there’s an extra dimension to classes, structs, interfaces, and methods that I did not show. They can define type parameters, which are placeholders that let you plug in different types at compile time. This lets you write just one type and then produce multiple versions of it. A type written this way is called a generic type. For example, the class library defines a generic class called List<T> that acts as a variable-length array. T is a type parameter here, and you can use any type as an argument, so List<int> is a list of integers, List<string> is a list of strings, and so on. You can also write a generic method, which is a method that has its own type parameters, independently of whether its containing type is generic.

Generic types and methods are visually distinctive because angle brackets (< and >) follow the name, both in the declaration and wherever you supply type arguments explicitly. These contain a comma-separated list of parameters or arguments. The same parameter/argument distinction applies here as with methods: the declaration specifies a list of parameters, and then when you come to use the method or type, you supply arguments for those parameters. So List<T> defines a single type parameter, T, and List<int> supplies a type argument, int, for that parameter.

Type parameters can be called whatever you like, within the usual constraints for identifiers in C#. There’s a common but not universal convention of using T when there’s only one parameter. For multiparameter generics, you tend to see slightly more descriptive names. For example, the class library defines the Dictionary<TKey, TValue> collection class. Sometimes you will see a descriptive name like that even when there’s just one parameter, but in any case, you will tend to see a T prefix, so that the type parameters stand out when you use them in your code.
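
To make the parameter/argument distinction concrete, here is a minimal sketch that supplies type arguments for both List<T> and Dictionary<TKey, TValue>. The NamingDemo class and its CountWords method are names I have invented purely for illustration.

```csharp
using System.Collections.Generic;

public static class NamingDemo
{
    public static int CountWords()
    {
        // List<T> declares one type parameter; string is the type argument.
        var words = new List<string> { "alpha", "beta", "alpha" };

        // Dictionary<TKey, TValue> declares two; we supply string and int.
        var counts = new Dictionary<string, int>();
        foreach (string word in words)
        {
            int current;
            counts.TryGetValue(word, out current);  // current is 0 if absent
            counts[word] = current + 1;
        }

        return counts["alpha"];
    }
}
```

The descriptive names TKey and TValue make it obvious which argument plays which role when you read Dictionary<string, int>.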

Generic Types

Classes, structs, and interfaces can all be generic, as can delegates, which we’ll be looking at in Chapter 9. Example 4-1 shows how to define a generic class. The syntax for structs and interfaces is much the same—the type name is followed immediately by a type parameter list.

Example 4-1. Defining a generic class

public class NamedContainer<T>
{
    public NamedContainer(T item, string name)
    {
        Item = item;
        Name = name;
    }

    public T Item { get; private set; }
    public string Name { get; private set; }
}

Inside the body of the class, you can use T anywhere you would normally use a type name. In this case, I’ve used it as the type of a constructor parameter, and also as the type of the Item property. I could define fields of type T too. (In fact I have, albeit not explicitly. The automatic property syntax generates hidden fields, so my Item property will have an associated hidden field of type T.) You can also define local variables of type T. And you’re free to use type parameters as arguments for other generic types. My NamedContainer<T> could declare a variable of List<T>, for example.

The class that Example 4-1 defines is, like any generic type, not a complete type. A generic type declaration is unbound, meaning that there are type parameters that must be filled in to provide a complete type. Basic questions such as how much memory a NamedContainer<T> instance will require cannot be answered without knowing what T is—the hidden field for the Item property would need 4 bytes if T were an int, but 16 bytes if it were a decimal. The CLR cannot produce executable code for a type if it does not even know how the contents will be arranged in memory. So to use this, or any other generic type, we must provide type arguments. Example 4-2 shows how. When type arguments are supplied, the result is sometimes called a constructed type. (Slightly confusingly, this has nothing to do with constructors, the special kind of member we looked at in Chapter 3. In fact, Example 4-2 uses those too—it invokes the constructors of a couple of constructed types.)

Example 4-2. Using a generic class

var a = new NamedContainer<int>(42, "The answer");
var b = new NamedContainer<int>(99, "Number of red balloons");
var c = new NamedContainer<string>("Programming C#", "Book title");

You can use a constructed generic type anywhere you would use a normal type. For example, you can use them as the types for method parameters and return values, properties, or fields. You can even use one as a type argument for another generic type, as Example 4-3 shows.

Example 4-3. Constructed generic types as type arguments

// ...where a and b come from Example 4-2.
var namedInts = new List<NamedContainer<int>>() { a, b };
var namedNamedItem = new NamedContainer<NamedContainer<int>>(a, "Wrapped");

Each distinct combination of type arguments forms a distinct type. (Or, in the case of a generic type with just one parameter, each different type you supply as an argument constructs a distinct type.) This means that NamedContainer<int> is a different type than NamedContainer<string>. That’s why there’s no conflict in using NamedContainer<int> as the type argument for another NamedContainer as the final line of Example 4-3 does—there’s no infinite recursion here.
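
One way to see that each set of type arguments forms a distinct type is to compare the corresponding Type objects. The following is only a sketch: it uses List<T> rather than NamedContainer<T> so that it stands alone, DistinctTypesDemo is an invented name, and it relies on a little of the reflection API.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctTypesDemo
{
    public static bool DifferentConstructedTypes()
    {
        // Different type arguments produce different constructed types.
        return typeof(List<int>) != typeof(List<string>);
    }

    public static bool SharedDefinition()
    {
        // Both are constructed from the same unbound generic definition,
        // which C# lets you name with empty angle brackets.
        Type fromInt = typeof(List<int>).GetGenericTypeDefinition();
        Type fromString = typeof(List<string>).GetGenericTypeDefinition();
        return fromInt == typeof(List<>) && fromString == typeof(List<>);
    }
}
```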

Because each different set of type arguments produces a distinct type, there is no implied compatibility between different forms of the same generic type. You cannot assign a NamedContainer<int> into a variable of type NamedContainer<string> or vice versa. It makes sense that those two types are incompatible, because int and string are quite different types. But what if we used object as a type argument? As Chapter 2 described, you can put almost anything in an object variable. If you write a method with a parameter of type object, it’s OK to pass a string, so you might expect a method that takes a NamedContainer<object> to be happy with a NamedContainer<string>. By default, that won’t work, but some generic types (specifically, interfaces and delegates) can declare that they want this kind of compatibility relationship. The mechanisms that support this (called covariance and contravariance) are closely related to the type system’s inheritance mechanisms. Chapter 6 is all about inheritance and type compatibility, so I will discuss how that works with generic types in that chapter.
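
As a brief preview, the sketch below contrasts a generic class, which offers no such compatibility, with IEnumerable<T>, one of the class library interfaces that does declare it. CompatibilityDemo and its members are invented names.

```csharp
using System.Collections.Generic;

public static class CompatibilityDemo
{
    public static int CountItems(IEnumerable<object> items)
    {
        int count = 0;
        foreach (object item in items)
        {
            count += 1;
        }
        return count;
    }

    public static int Run()
    {
        var strings = new List<string> { "a", "b", "c" };

        // Will not compile: List<T> is a class, so there is no conversion
        // from List<string> to List<object>.
        // List<object> objects = strings;

        // Compiles: IEnumerable<T> is an interface that opts in to this
        // kind of compatibility for reference type arguments.
        IEnumerable<object> asObjects = strings;
        return CountItems(asObjects);
    }
}
```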

The number of type parameters forms part of a generic type’s identity. This makes it possible to introduce multiple types with the same name as long as they have different numbers of type parameters. So you could define a generic class called, say, Operation<T>, and then another class, Operation<T1, T2>, and also Operation<T1, T2, T3>, and so on, all in the same namespace, without introducing any ambiguity. When you are using these types, it’s clear from the number of arguments which type was meant—Operation<int> clearly uses the first, while Operation<string, double> uses the second, for example. And for the same reason, you can also have a nongeneric type with the same name as a generic type. So an Operation class would be distinct from generic types of the same name.
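
Here is a minimal sketch of that idea, with three types that share the name Operation and differ only in their number of type parameters. The Describe methods are there only so that you can tell which type you ended up with.

```csharp
// Three distinct types with one name, told apart by arity.
public class Operation
{
    public string Describe()
    {
        return "no type parameters";
    }
}

public class Operation<T>
{
    public string Describe()
    {
        return "one: " + typeof(T).Name;
    }
}

public class Operation<T1, T2>
{
    public string Describe()
    {
        return "two: " + typeof(T1).Name + ", " + typeof(T2).Name;
    }
}
```

Writing new Operation<string, double>() selects the third class, and there is nothing ambiguous about it because no other Operation type takes two type arguments.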

My NamedContainer<T> example doesn’t do anything to instances of its type argument, T—it never invokes any methods, or uses any properties or other members of T. All it does is accept a T as a constructor argument, which it stores away for later retrieval. This is also true of the generic types I’ve pointed out in the .NET Framework class library—I’ve mentioned some collection classes, which are all variations on the same theme of containing data for later retrieval. There’s a reason for this: a generic class can find itself working with any type, so it can presume very little about its type arguments. However, if you want to be able to presume certain things about your type arguments, you can specify constraints.

Constraints

C# allows you to state that a type argument must fulfill certain requirements. For example, suppose you want to be able to create new instances of the type on demand. Example 4-4 shows a simple class that provides deferred construction—it makes an instance available through a static property, but does not attempt to construct that instance until the first time you read the property.

Example 4-4. Creating a new instance of a parameterized type

// For illustration only. Consider using Lazy<T> in a real program.
public static class Deferred<T>
    where T : new()
{
    private static T _instance;

    public static T Instance
    {
        get
        {
            if (_instance == null)
            {
                _instance = new T();
            }
            return _instance;
        }
    }
}

Warning

You wouldn’t write a class like this in practice, because the class library offers Lazy<T>, which does the same job but with more flexibility. Lazy<T> can work correctly in multithreaded code, which Example 4-4 will not. Example 4-4 is just to illustrate how constraints work. Don’t use it!

For this class to do its job, it needs to be able to construct an instance of whatever type is supplied as the argument for T. The get accessor uses the new keyword, and since it passes no arguments, it clearly requires T to provide a parameterless constructor. But not all types do, so what happens if we try to use a type without a suitable constructor as the argument for Deferred<T>? The compiler will reject it, because it violates a constraint that this generic type has declared for T. Constraints appear just before the class’s opening brace, and they begin with the where keyword. The constraint in Example 4-4 states that T is required to supply a zero-argument constructor.

If that constraint had not been present, the class in Example 4-4 would not compile—you would get an error on the line that attempts to construct a new T. A generic type (or method) is allowed to use only features that it has specified through constraints, or that are defined by the base object type. (The object type defines a ToString method, for example, so you can invoke that on any instance without needing to specify a constraint.)
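
The following sketch shows what an unconstrained type parameter does and does not permit. Describer is an invented name, and the commented-out line is there only to show the kind of code the compiler would reject.

```csharp
public static class Describer
{
    // No constraint on T, so only the members of object are available.
    public static string Describe<T>(T item)
    {
        // Will not compile: nothing tells the compiler that T has a Length.
        // int length = item.Length;

        // ToString is defined by object, so this compiles for any T.
        return "[" + item.ToString() + "]";
    }
}
```

Note that this sketch assumes item is not null. With a reference type argument, passing null would cause the ToString call to fail at runtime.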

C# offers only a very limited suite of constraints. You cannot demand a constructor that takes arguments, for example. In fact, C# supports only four kinds of constraints on a type argument: a type constraint, a reference type constraint, a value type constraint, and the new() constraint. We just saw that last one, so let’s look at the rest.

Type Constraints

You can constrain the argument for a type parameter to be compatible with a particular type. For example, you could use this to demand that the argument type implements a particular interface. Example 4-5 shows the syntax.

Example 4-5. Using a type constraint

using System;
using System.Collections.Generic;

public class GenericComparer<T> : IComparer<T>
    where T : IComparable<T>
{
    public int Compare(T x, T y)
    {
        return x.CompareTo(y);
    }
}

Before describing how this example takes advantage of a type constraint, I’ll explain its purpose. This class provides a bridge between two styles of value comparison that you’ll find in .NET. Some data types provide their own comparison logic, but at times, it can be more useful for comparison to be a separate function implemented in its own class. These two styles are represented by the IComparable<T> and IComparer<T> interfaces, which are both part of the class library. (They are in the System and System.Collections.Generic namespaces, respectively.) I showed IComparer<T> in Chapter 3: an implementation of this interface can compare two objects or values of type T. The interface defines a single Compare method that takes two arguments and returns a negative number, 0, or a positive number if the first argument is, respectively, less than, equal to, or greater than the second. IComparable<T> is very similar, but its CompareTo method takes just a single argument, because with this interface, you are asking an instance to compare itself to some other instance.

Some of the .NET class library’s collection classes require you to provide an IComparer<T> to support ordering operations such as sorting. They use the model in which a separate object performs the comparison, because this offers two advantages over the IComparable<T> model. First, it enables you to use data types that don’t implement IComparable<T>. Second, it allows you to plug in different sorting orders. (For example, suppose you want to sort some strings with a case-insensitive order. The string type implements IComparable<string>, but that provides a case-sensitive order.) So IComparer<T> is the more flexible model. However, what if you are using a data type that implements IComparable<T>, and you’re perfectly happy with the order that provides? What would you do if you’re working with an API that demands an IComparer<T>?

Actually, the answer is that you’d probably just use the .NET Framework class library feature designed for this very scenario: Comparer<T>.Default. If T implements IComparable<T>, that property will return an IComparer<T> that does precisely what you want. So, in practice, you wouldn’t need to write the code in Example 4-5, because the .NET Framework has already written it for you. However, it’s instructive to see how you’d write your own version, because it illustrates how to use a type constraint.
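
For comparison, this is roughly what using the built-in feature looks like. It is a sketch with an invented class name, DefaultComparerDemo, that passes Comparer<string>.Default to the Sort method of List<T>.

```csharp
using System.Collections.Generic;

public static class DefaultComparerDemo
{
    public static string[] SortedNames()
    {
        var names = new List<string> { "pear", "apple", "fig" };

        // string implements IComparable<string>, so Comparer<string>.Default
        // returns an IComparer<string> that defers to string's own ordering.
        IComparer<string> comparer = Comparer<string>.Default;
        names.Sort(comparer);

        return names.ToArray();
    }
}
```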

The line starting with the where keyword states that this generic class requires the argument for its type parameter T to implement IComparable<T>. Without this, the Compare method would not compile—it invokes the CompareTo method on an argument of type T. That method is not present on all objects, and the C# compiler allows this only because we’ve constrained T to be an implementation of an interface that does offer such a method.

Interface constraints are relatively rare. If a method needs a particular argument to implement a particular interface, you wouldn’t normally need a generic type constraint. You can just use that interface as the argument’s type. However, Example 4-5 can’t do this. You can demonstrate this by trying Example 4-6. It won’t compile.

Example 4-6. Will not compile: interface not implemented

public class GenericComparer<T> : IComparer<T>
{
    public int Compare(IComparable<T> x, T y)
    {
        return x.CompareTo(y);
    }
}

The compiler will complain that I’ve not implemented the IComparer<T> interface’s Compare method. Example 4-6 has a Compare method, but its signature is wrong—that first argument should be a T. I could also try the correct signature without specifying the constraint, as Example 4-7 shows.

Example 4-7. Will not compile: missing constraint

public class GenericComparer<T> : IComparer<T>
{
    public int Compare(T x, T y)
    {
        return x.CompareTo(y);
    }
}

That will also fail to compile, because the compiler can’t find that CompareTo method I’m trying to use. It’s the constraint for T in Example 4-5 that enables the compiler to know what that method really is.

Type constraints don’t have to be interfaces, by the way. You can use any type. For example, you can constrain a particular argument always to derive from a particular base class. More subtly, you can also define one parameter’s constraint in terms of another type parameter. Example 4-8 requires the first type argument to derive from the second, for example.

Example 4-8. Constraining one argument to derive from another

public class Foo<T1, T2>
    where T1 : T2
...
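
To show what such a constraint buys you, here is a small complete class of my own (Widener is an invented name, not a continuation of the Foo example above). Because the first type argument must derive from, or be the same as, the second, the compiler allows an implicit conversion from one to the other.

```csharp
public class Widener<TDerived, TBase>
    where TDerived : TBase
{
    // The constraint guarantees that every TDerived is also a TBase,
    // so this conversion needs no cast.
    public TBase Widen(TDerived item)
    {
        return item;
    }
}
```

So new Widener<string, object>() is fine, whereas Widener<object, string> would be rejected by the compiler because object does not derive from string.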

Type constraints are fairly specific—they require either a particular inheritance relationship, or the implementation of specific interfaces. However, you can define slightly less specific constraints.

Reference Type Constraints

You can constrain a type argument to be a reference type. As Example 4-9 shows, this looks similar to a type constraint. You just put the keyword class instead of a type name.

Example 4-9. Constraint requiring a reference type

public class Bar<T>
    where T : class
...

This constraint prevents the use of value types such as int, double, or any struct as the type argument. Its presence enables your code to do three things that would not otherwise be possible. First, it means that you can write code that works with null for variables of the relevant type: you can assign null to them, and a test for null is always meaningful. (The compiler accepts a comparison with null even for an unconstrained type parameter, as Example 4-4 does, but you cannot assign null in that case.) If you’ve not constrained the type to be a reference type, there’s always a possibility that it’s a value type, and those can’t have null values. The second capability is that you can use the as operator, which we’ll look at in Chapter 6. This is really just a variation on the first feature: the as keyword requires a reference type because it can produce a null result.

Note

You cannot use a nullable type such as int? (or Nullable<int>, as the CLR calls it) as the argument for a parameter with a class constraint. Although you can test an int? for null and use it with the as operator, the compiler generates quite different code for nullable types for both operations than it would for a reference type. It cannot compile a single method that can cope with both reference types and nullable types if you use these features.
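
The first two capabilities can be sketched as follows. ReferenceHelpers and its two methods are invented for illustration; neither method would compile if you removed its class constraint.

```csharp
using System.Collections.Generic;

public static class ReferenceHelpers
{
    public static T FirstNonNull<T>(IEnumerable<T> items)
        where T : class
    {
        foreach (T item in items)
        {
            if (item != null)
            {
                return item;
            }
        }
        return null;   // legal only because T is known to be a reference type
    }

    public static T TryCast<T>(object value)
        where T : class
    {
        return value as T;   // produces null if value is not a T
    }
}
```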

The third feature that a reference type constraint enables is the ability to use certain other generic types. It’s often convenient for generic code to use one of its type arguments as an argument for another generic type, and if that other type specifies a constraint, you’ll need to put the same constraint on your own type parameter. So if some other type specifies a class constraint, this might require you to constrain your own argument in the same way.

Of course, this does raise the question of why the type you’re using needs the constraint in the first place. It might be that it simply wants to test for null or use the as operator, but there’s another reason for applying this constraint. Sometimes, you just need a type argument to be a reference type—there are situations in which a generic method might be able to compile without a class constraint, but it will not work correctly if used with a value type. To illustrate this, I’ll describe the scenario in which I most often find myself needing to use this kind of constraint.

I regularly write tests that create an instance of the class I’m testing, and that also need one or more fake objects to stand in for real objects with which the object under test wants to interact. Using these stand-ins reduces the amount of code any single test has to exercise, and can make it easier to verify the behavior of the object being tested. For example, my test might need to verify that my code sends messages to a server at the right moment, but I don’t want to have to run a real server during a unit test, so I provide an object that implements the same interface as the class that would transmit the message, but which won’t really send the message. This combination of an object under test plus a fake is such a common pattern that it might be useful to put the code into a reusable base class. Using generics means that the class can work for any combination of the type being tested and the type being faked. Example 4-10 shows a simplified version of a kind of helper class I sometimes write in these situations.

Example 4-10. Constrained by another constraint

using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

public class TestBase<TSubject, TFake>
    where TSubject : new()
    where TFake : class
{
    public TSubject Subject { get; private set; }
    public Mock<TFake> Fake { get; private set; }

    [TestInitialize]
    public void Initialize()
    {
        Subject = new TSubject();
        Fake = new Mock<TFake>();
    }
}

There are various ways to build fake objects for test purposes. You could just write new classes that implement the same interface as your real objects. Some editions of Visual Studio 2012 include a feature called Fakes that can create these for you. There are also various third-party libraries that can generate them. One such library is called Moq (an open source project available for free from http://code.google.com/p/moq/), and that’s where the Mock<T> class in Example 4-10 comes from. It’s capable of generating a fake implementation of any interface or of any nonsealed class. It will provide empty implementations of all members by default, and you can configure more interesting behaviors if necessary. You can also verify whether the code under test used the fake object in the way you expected.

How is that relevant to constraints? The Mock<T> class specifies a reference type constraint on its own type argument, T. This is due to the way in which it creates dynamic implementations of types at runtime; it’s a technique that can work only for reference types. Moq generates a type at runtime, and if T is an interface, that generated type will implement it, whereas if T is a class, the generated type will derive from it.[21] There’s nothing useful it can do if T is a struct, because you cannot derive from a value type. That means that when I use Mock<T> in Example 4-10, I need to make sure that whatever type argument I pass is either an interface or a class (i.e., a reference type). But the type argument I’m using is one of my class’s type parameters: TFake. So I don’t know what type that will be—that’ll be up to whoever is using my class.

For my class to compile without error, I have to ensure that I have met the constraints of any generic types that I use. I have to guarantee that Mock<TFake> is valid, and the only way to do that is to add a constraint on my own type that requires TFake to be a reference type. And that’s what I’ve done on the third line of the class definition in Example 4-10. Without that, the compiler would report errors on the two lines that refer to Mock<TFake>.

To put it more generally, if you want to use one of your own type parameters as the type argument for a generic that specifies a constraint, you’ll need to specify the same constraint on your own type parameter.
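
The same rule applies to any constrained generic, not just Mock<T>. As a sketch that needs no third-party library, the class below wraps the class library’s WeakReference<T>, which also imposes a class constraint. LastItemCache is an invented name, and this is an illustration of constraint propagation rather than a recommended caching design.

```csharp
using System;

public class LastItemCache<TItem>
    where TItem : class   // needed because WeakReference<T> demands it
{
    private WeakReference<TItem> _last;

    public void Remember(TItem item)
    {
        _last = new WeakReference<TItem>(item);
    }

    public bool TryRecall(out TItem item)
    {
        if (_last == null)
        {
            item = null;
            return false;
        }

        // Reports false if the remembered object has been collected.
        return _last.TryGetTarget(out item);
    }
}
```

If you delete the where clause, the compiler reports errors wherever the class mentions WeakReference<TItem>, for the same reason that Example 4-10 needs its constraint on TFake.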

Value Type Constraints

Just as you can constrain a type argument to be a reference type, you can also constrain it to be a value type. As shown in Example 4-11, the syntax is similar to that for a reference type constraint, but with the struct keyword.

Example 4-11. Constraint requiring a value type

public class Quux<T>
    where T : struct
...

Before now, we’ve seen the struct keyword only in the context of custom value types, but despite how it looks, this constraint permits any of the built-in numeric types such as int, as well as custom structs. That’s because they all derive from the same System.ValueType base class.

The .NET Framework’s Nullable<T> type imposes this constraint. Recall from Chapter 3 that Nullable<T> provides a wrapper for value types that allows a variable to hold either a value, or no value. (We normally use the special syntax C# provides, so we’d write, say, int? instead of Nullable<int>.) The only reason this type exists is to provide nullability for types that would not otherwise be able to hold a null value. So it only makes sense to use this with a value type—reference type variables can already be set to null without needing this wrapper. The value type constraint prevents you from using Nullable<T> with types for which it is unnecessary.
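
Generic code that wants to use Nullable<T> with one of its own type parameters has to impose the same constraint. The following sketch shows this; NullableHelpers and FirstOrNothing are invented names.

```csharp
public static class NullableHelpers
{
    // T? means Nullable<T>, which requires a value type argument, so this
    // method must constrain its own type parameter in the same way.
    public static T? FirstOrNothing<T>(T[] items)
        where T : struct
    {
        if (items.Length == 0)
        {
            return null;     // the "no value" state of Nullable<T>
        }
        return items[0];     // implicitly wrapped in a Nullable<T>
    }
}
```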

Multiple Constraints

If you’d like to impose multiple constraints for a single type argument, you can just put them in a list, as Example 4-12 shows. There are a couple of ordering restrictions: if you have a reference or value type constraint, the class or struct keyword must come first in the list. If the new() constraint is present, it must be last.

Example 4-12. Multiple constraints

public class Spong<T>
    where T : IEnumerable<T>, IDisposable, new()
...

When your type has multiple type parameters, you write one where clause for each type parameter you wish to constrain. In fact, we saw this earlier—Example 4-10 defines constraints for both of its parameters.
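
Here is a sketch that combines these rules in one declaration: two where clauses, with the struct and class keywords first in their respective lists and new() last. ResourceTable is an invented class, and each member is annotated with the constraint that makes it legal.

```csharp
using System;
using System.Collections.Generic;

public class ResourceTable<TKey, TResource>
    where TKey : struct, IEquatable<TKey>
    where TResource : class, IDisposable, new()
{
    private readonly Dictionary<TKey, TResource> _items =
        new Dictionary<TKey, TResource>();

    public TResource GetOrCreate(TKey key)
    {
        TResource resource;
        if (!_items.TryGetValue(key, out resource))
        {
            resource = new TResource();   // permitted by new()
            _items.Add(key, resource);
        }
        return resource;
    }

    public void DisposeAll()
    {
        foreach (TResource resource in _items.Values)
        {
            resource.Dispose();           // permitted by IDisposable
        }
        _items.Clear();
    }
}
```

You could use this with, say, ResourceTable<int, System.IO.MemoryStream>, because int is a value type that implements IEquatable<int>, and MemoryStream is a disposable class with a public parameterless constructor.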

Zero-Like Values

There are a few features that all types support, and which therefore do not require a constraint. This includes the set of methods defined by the object base class, which I’ll show in Chapter 6. But there’s a more basic feature that can sometimes come in useful in generic code.

Variables of any type can be initialized to a default value. As you have seen in the preceding chapters, there are some situations in which the CLR does this for us. For example, all the fields in a newly constructed object will have a known value even if we don’t write field initializers and don’t supply values in the constructor. Likewise, a new array of any type will have all of its elements initialized to a known value. The CLR does this by filling the relevant memory with zeros. The exact interpretation depends on the data type. For any of the built-in numeric types, the value will quite literally be the number 0, but for nonnumeric types, it’s something else. For bool, the default is false, and for a reference type, it is null.

Sometimes, it can be useful for generic code to be able to reset a variable back to this initial default zero-like value. But you cannot use a literal expression to do this in most situations. You cannot assign null into a variable whose type is specified by a type parameter unless that parameter has been constrained to be a reference type. And you cannot assign the literal 0 into any such variable, because there is no way to constrain a type argument to be a numeric type.

Instead, you can request the zero-like value for any type, using the default keyword. (This is the same keyword we saw inside a switch statement in Chapter 2, but used in a completely different way. C# keeps up the C-family tradition of defining multiple, unrelated meanings for each keyword.) If you write default(SomeType), where SomeType is either a type or a type parameter, you will get the default initial value for that type: 0 if it is a numeric type, and the equivalent for any other type. For example, the expression default(int) has the value 0, default(bool) is false, and default(string) is null. You can use this with a generic type parameter to get the default value for the corresponding type argument, as Example 4-13 shows.

Example 4-13. Getting the default (zero-like) value of a type argument

static void PrintDefault<T>()
{
    Console.WriteLine(default(T));
}
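
Printing the value is not very useful in itself, so here is a slightly more realistic sketch in which a generic class uses default(T) to reset a field. Slot<T> is an invented type, written without any constraints.

```csharp
public class Slot<T>
{
    private T _value;
    private bool _hasValue;

    public T Value { get { return _value; } }
    public bool HasValue { get { return _hasValue; } }

    public void Set(T value)
    {
        _value = value;
        _hasValue = true;
    }

    public void Clear()
    {
        // Neither null nor 0 would compile here for an unconstrained T.
        _value = default(T);
        _hasValue = false;
    }
}
```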

Inside a generic type or method that defines a type parameter T, the expression default(T) will produce the default, zero-like value for T—whatever T may be—without requiring any constraints. So you could use the generic method in Example 4-13 to verify that the defaults for int, bool, and string are the values I stated. And since I’ve just shown you an example of one, this seems like a good time to talk about generic methods.

Generic Methods

As well as generic types, C# also supports generic methods. In this case, the generic type parameter list follows the method name, and precedes the method’s normal parameter list. Example 4-14 shows a method with a single type parameter. It uses that parameter as its return type, and also as the element type for an array to be passed in as the method’s argument. This method returns the final element in the array, and because it’s generic, it will work for any array element type.

Example 4-14. A generic method

// Assumes items is non-null and contains at least one element.
public static T GetLast<T>(T[] items)
{
    return items[items.Length - 1];
}

Note

You can define generic methods inside either generic types or nongeneric types. If a generic method is a member of a generic type, all of the type parameters from the containing type are in scope inside the method, as well as the type parameters specific to the method.
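
As a sketch of that scoping rule, the generic method below belongs to a generic class and uses both the class’s type parameter and its own. Box<T> and PairWith are invented names.

```csharp
using System.Collections.Generic;

public class Box<T>
{
    public Box(T item)
    {
        Item = item;
    }

    public T Item { get; private set; }

    // TOther belongs to the method; T comes from the containing type.
    // Both are in scope in the signature and the body.
    public KeyValuePair<T, TOther> PairWith<TOther>(TOther other)
    {
        return new KeyValuePair<T, TOther>(Item, other);
    }
}
```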

Just as with a generic type, you can use a generic method by specifying its name along with its type arguments, as Example 4-15 shows.

Example 4-15. Invoking a generic method

int[] values = { 1, 2, 3 };
int last = GetLast<int>(values);

Generic methods work in a similar way to generic types, but with type parameters that are only in scope within the method declaration and body. You can specify constraints in much the same way as with generic types. The constraints appear after the method’s parameter list and before its body, as Example 4-16 shows.

Example 4-16. A generic method with a constraint

public static T MakeFake<T>()
    where T : class
{
    return new Mock<T>().Object;
}

There’s one significant way in which generic methods differ from generic types, though: you don’t always need to specify a generic method’s type arguments explicitly.

Type Inference

The C# compiler is often able to infer the type arguments for a generic method. I can modify Example 4-15 by removing the type argument list from the method invocation, as Example 4-17 shows, and this does not change the meaning of the code in any way.

Example 4-17. Generic method type argument inference

int[] values = { 1, 2, 3 };
int last = GetLast(values);

When presented with this sort of ordinary-looking method call, if there’s no nongeneric method of that name available, the compiler starts looking for suitable generic methods. If the method in Example 4-14 is in scope, it will be a candidate, and the compiler will attempt to deduce the type arguments. This is a pretty simple case. The method expects an array of some type T, and we’ve passed an int[], so it’s not a massive stretch to work out that this code should be treated as a call to GetLast<int>.

It gets more complex with more intricate cases. The C# specification has about six pages dedicated to the type inference algorithm, but it’s all to support one goal: letting you leave out type arguments when they would be redundant. By the way, type inference is always performed at compile time, so it’s based on the static type of the method arguments.
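
The following sketch illustrates that last point. The same string object is passed twice, through variables with different static types, and the inferred type argument follows the variable’s type rather than the object’s. InferenceDemo and NameOf are invented names.

```csharp
public static class InferenceDemo
{
    public static string NameOf<T>(T item)
    {
        // typeof(T) reports the type argument, not the runtime type of item.
        return typeof(T).Name;
    }

    public static string[] Run()
    {
        string text = "hello";
        object boxed = text;       // same object, different static type

        return new[]
        {
            NameOf(text),          // T inferred as string
            NameOf(boxed),         // T inferred as object
            NameOf<object>(text)   // explicit type argument, no inference
        };
    }
}
```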

Inside Generics

If you are familiar with C++ templates, you will by now have noticed that C# generics are quite different than templates. Superficially, they have some similarities, and can be used in similar ways—both are suitable for implementing collection classes, for example. However, there are some template-based techniques that simply won’t work in C#, such as the code in Example 4-18.

Example 4-18. A template technique that doesn’t work in C# generics

public static T Add<T>(T x, T y)
{
    return x + y;  // Will not compile
}

You can do this sort of thing in a C++ template but not in C#, and you cannot fix it completely with a constraint. You could add a type constraint requiring T to derive from some type that defines a custom + operator, which would get this to compile, but it would be pretty limited—it would work only for types derived from that base type. In C++, you can write a template that will add together two items of any type that supports addition, whether that’s a built-in type or a custom one. Moreover, C++ templates don’t need constraints; the compiler is able to work out for itself whether a particular type will work as a template argument.

This issue is not specific to arithmetic. The fundamental problem is that because generic code relies on constraints to know what operations are available on its type parameters, it can use only features represented as members of interfaces or shared base classes. (If arithmetic in .NET were interface-based, it would be possible to define a constraint that requires it. But operators are all static methods, and interfaces can contain only instance members.)
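
To show what the interface-based approach would look like, here is a sketch built on an interface of my own. IAddable<T>, Money, and Adder are all invented for this illustration, and addition is expressed as an ordinary instance method so that a constraint can describe it.

```csharp
// Hypothetical interface, invented for this sketch.
public interface IAddable<T>
{
    T Add(T other);
}

public struct Money : IAddable<Money>
{
    private readonly decimal _amount;

    public Money(decimal amount)
    {
        _amount = amount;
    }

    public decimal Amount { get { return _amount; } }

    public Money Add(Money other)
    {
        return new Money(_amount + other._amount);
    }
}

public static class Adder
{
    public static T Add<T>(T x, T y)
        where T : IAddable<T>
    {
        return x.Add(y);   // an instance member, so the constraint covers it
    }
}
```

This compiles, but it shows the limitation too: it works only for types that implement my interface, so Adder.Add(1, 2) will not compile, because int knows nothing about IAddable<int>.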

The limitations of C# generics are an upshot of how they are designed to work, so it’s useful to understand the mechanism. (These limitations are not specific to Microsoft’s CLR, by the way. They are an inevitable result of how generics fit into the design of the CLI.)

Generic methods and types are compiled without knowing which types will be used as arguments. This is the fundamental difference between C# generics and C++ templates—in C++, the compiler gets to see every instantiation of a template. But with C#, you can instantiate generic types without access to any of the relevant source code, long after the code has been compiled. After all, Microsoft wrote the generic List<T> class years ago, but you could write a brand-new class today and plug that in as the type argument just fine. (You might point out that the C++ standard library’s std::vector has been around even longer. However, the C++ compiler has access to the source file that defines the class, which is not true of C# and List<T>. C# sees only the compiled library.)

The upshot of this is that the C# compiler needs to have enough information to be able to generate type-safe code at the point at which it compiles generic code. Take Example 4-18. The compiler cannot know what the + operator means there, because it would mean different things for different types. With the built-in numeric types, that code would need to compile to the specialized intermediate language (IL) instructions for performing addition. If that code were in a checked context (i.e., using the checked keyword shown in Chapter 2), we’d already have a problem, because the code for adding integers with overflow checking uses different IL opcodes for signed and unsigned integers. Furthermore, since this is a generic method, we may not be dealing with the built-in numeric types at all. Perhaps we are dealing with a type that defines a custom + operator, in which case the compiler would need to generate a method call. (Custom operators are just methods under the covers.) Or if the type in question turns out not to support addition, the compiler should generate an error.
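The following sketch spells out those cases as separate nongeneric methods, where the compiler does know the types. The Sums class and its method names are hypothetical, and the opcode names in the comments describe what the C# compiler typically emits for each case rather than anything you can see in the source.

```csharp
using System;

public static class Sums
{
    // Signed addition with overflow checking: typically add.ovf.
    public static int AddSigned(int x, int y)
    {
        return checked(x + y);
    }

    // Unsigned addition with overflow checking: typically add.ovf.un.
    public static uint AddUnsigned(uint x, uint y)
    {
        return checked(x + y);
    }

    // Floating-point addition: a plain add instruction.
    public static double AddFloating(double x, double y)
    {
        return x + y;
    }

    // TimeSpan defines a custom + operator, so this becomes a call to
    // its op_Addition method.
    public static TimeSpan AddCustom(TimeSpan x, TimeSpan y)
    {
        return x + y;
    }
}
```

The same x + y expression appears in all four methods, yet each one compiles to something different. A single generic method body would have to stand in for all of them at once.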

There are several possible outcomes, depending on the actual types involved. That would be fine if the types were known to the compiler, but it has to compile the code for generic types and methods without knowing which types will be used as arguments.

You might argue that perhaps Microsoft could have supported some sort of tentative semicompiled format for generic code, and in a sense, it did. When introducing generics, Microsoft modified the type system, file format, and IL instructions to allow generic code to use placeholders representing type parameters to be filled in when the type is fully constructed. So why not extend it to handle operators? Why not let the compiler generate errors at the point at which you attempt to use a generic type instead of insisting on generating errors when the generic code itself is compiled?

Well, it turns out that you can plug in new sets of type arguments at runtime: the reflection API that we’ll look at in Chapter 13 lets you construct generic types. So there isn’t necessarily a compiler available at the point at which an error would become apparent, because not all versions of .NET ship with a copy of the C# compiler.

And in any case, what should happen if a generic class was written in C# but consumed by a completely different language, perhaps one that didn’t support operator overloading? Which language’s rules should apply when it comes to working out what to do with that + operator? Should it be the language in which the generic code was written, or the language in which the type argument was written? (What if there are multiple type parameters, and for each argument, you use a type written in a different language?) Or perhaps the rules should come from the language that decided to plug the type arguments into the generic type or method, but what about cases where one piece of generic code passes its arguments through to some other generic entity? Even if you could decide which of these approaches would be best, it supposes that the rules used to determine what a line of code actually means are available at runtime, a presumption that once again founders on the fact that the relevant compilers will not necessarily be installed on the machine running the code.
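To show what plugging in type arguments at runtime looks like, here is a minimal sketch using the reflection API. The RuntimeConstruction class and its CreateListOf method are hypothetical names for this illustration; Type.MakeGenericType and Activator.CreateInstance are the class library's own methods.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class RuntimeConstruction
{
    // Builds a List<T> where T is known only at runtime. No compiler is
    // involved at the point where the type argument is supplied.
    public static IList CreateListOf(Type elementType)
    {
        Type openType = typeof(List<>);
        Type constructedType = openType.MakeGenericType(elementType);

        // List<T> also implements the nongeneric IList interface, which
        // gives the caller a way to use the result without knowing T.
        return (IList) Activator.CreateInstance(constructedType);
    }
}
```

Calling RuntimeConstruction.CreateListOf(typeof(int)) returns an object whose runtime type is List<int>, even though that type argument never appears in the source code of the method.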

.NET generics solve this problem by requiring the meaning of generic code to be fully defined when the generic code is compiled, by the language in which the generic code was written. If the generic code involves using methods or other members, they must be resolved statically (i.e., the identity of those members must be determined precisely at compile time). Critically, that means compile time for the generic code itself, not for the code consuming the generic code. These requirements explain why C# generics are not as flexible as the consumer-compile-time substitution model that C++ uses. The payoff is that you can compile generics into libraries in binary form, and they can be used by any .NET language that supports generics, with completely predictable behavior.
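One visible consequence of static resolution is the way overloads are chosen inside generic code. In this sketch, the Resolver class and its methods are hypothetical names invented for the illustration.

```csharp
using System;

public static class Resolver
{
    public static string Describe(object o) { return "object"; }

    public static string Describe(string s) { return "string"; }

    public static string DescribeGeneric<T>(T item)
    {
        // The overload is chosen when this generic method is compiled.
        // All the compiler knows is that T converts to object, so it
        // binds to Describe(object), whatever T turns out to be later.
        return Describe(item);
    }
}
```

Resolver.Describe("hi") returns "string", but Resolver.DescribeGeneric("hi") returns "object", because the decision was made before any type argument was supplied. An equivalent C++ template would pick the overload separately for each instantiation.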

Summary

Generics enable us to write types and methods with type parameters, which can be filled in with type arguments at compile time to produce different versions of the types or methods that work with particular types. The most important use case for generics back when they were first introduced was to make it possible to write type-safe collection classes. .NET did not have generics at the beginning, so the collection classes available in version 1.0 used the general-purpose object type. This meant you had to cast objects back to their real type every time you extracted one from a collection. It also meant that value types were not handled efficiently in collections; as we’ll see in Chapter 7, referring to values through an object requires the generation of boxes to contain the values. Generics solve these problems well. They make it possible to write collection classes such as List<T>, which can be used without casts. Moreover, because the CLR is able to construct generic types at runtime, it can generate code optimized for whatever type a collection contains. So collection classes can handle value types such as int much more efficiently than before generics were introduced. We’ll look at some of these collection types in the next chapter.
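The difference is easy to see side by side. This sketch compares ArrayList, one of the object-based collection classes from version 1.0, with List<int>; the CollectionComparison class and its method names are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CollectionComparison
{
    public static int FirstFromArrayList()
    {
        ArrayList numbers = new ArrayList();
        numbers.Add(42);            // The int is boxed to fit in an object.
        return (int) numbers[0];    // A cast is needed to get the int back.
    }

    public static int FirstFromList()
    {
        List<int> numbers = new List<int>();
        numbers.Add(42);            // Stored as an int, with no box.
        return numbers[0];          // No cast required.
    }
}
```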



[21] Moq relies on the dynamic proxy feature from the Castle Project to generate this type. If you would like to use something similar in your code, you can find this at http://castleproject.org/.
