Chapter 1. Introducing the ADO.NET Entity Framework

Developers spend far too much time worrying about their backend database, its tables and their relationships, the names and parameters of stored procedures and views, as well as the schema of the data that they return. Microsoft’s new Entity Framework changes the game for .NET developers so that we no longer have to be concerned with the details of the data store as we write our applications. We can focus on the task of writing our applications, rather than accessing the data.

The ADO.NET Entity Framework is a new data access platform from Microsoft for writing .NET applications. It was released in July 2008 as part of Visual Studio 2008 Service Pack 1 and .NET 3.5 Service Pack 1, two years after Microsoft announced it at its TechEd 2006 Conference.

Although the existing data access features remain in ADO.NET, this new framework is part of Microsoft’s core data access strategy going forward. Therefore, the Entity Framework will receive the bulk of the innovation and resources from Microsoft’s Data Programmability team. It’s an important technology for Microsoft, and one that you should not ignore.

Why do we need a new data access technology? After many years of forcing developers to switch from DAO to RDO to ADO and then to ADO.NET, with ADO.NET it seemed that Microsoft had finally settled on a tool in which we could make a big investment. As Visual Studio and the .NET Framework have evolved, ADO.NET has evolved by enhancement and addition, but has remained backward compatible all along. Our investment was safe.

And it remains safe. The Entity Framework is another enhancement to ADO.NET, giving developers an added mechanism for accessing data and working with the results in addition to DataReaders and DataSets.

One of the core benefits of the Entity Framework is that you don’t have to be concerned with the structure of your database. All of your data access and storage are done against a conceptual data model that reflects your own business objects.

Programming Against a Model, Not Against the Database

With DataReaders and many other data access technologies, you spend a lot of time getting data from the database, reading through the results, picking out bits of data, and pushing those bits into your business classes. With the Entity Framework, you are not querying against the schema of the database, but rather against a schema that reflects your own business model. As the data is retrieved, you are not forced to reason out the columns and rows and push them into objects; they are returned as objects. When it’s time to save changes back to the database, you have to save only those objects. The Entity Framework will do the necessary work to translate your objects back into the rows and columns of the relational store. The Entity Framework does this part of the job for you, similar to the way an Object Relational Mapping tool works.

The Entity Framework uses a model called an Entity Data Model (EDM), which evolved from Entity Relationship Modeling (ERM), a concept that has been used in database development for many years.

The Entity Data Model: A Client-Side Data Model

An EDM is a client-side data model and it is the core of the Entity Framework. It is not the same as the database model; that belongs to the database. The data model in the application describes the structure of your business objects. It’s as though you were given permission to restructure the database tables and views in your enterprise’s database so that the tables and relationships look more like your business domain rather than the normalized schema that is designed by database administrators.

Figure 1-1 shows the schema of a typical set of tables in a database. PersonalDetails provides additional information about a Person that the database administrator has chosen to put into a separate table for the sake of scalability. SalesPerson is a table that is used to provide additional information for those people who are salespeople.

Figure 1-1. Schema of normalized database tables

When working with this data from your application, your queries will be full of inner joins and outer joins to access the additional data about Person records. Or you will access a variety of predefined stored procedures and views, which might each require a different set of parameters and return data that is shaped in a variety of ways.

A T-SQL query to retrieve a set of SalesPerson records along with their personal details would look something like this:

SELECT     SalesPerson.*, PersonalDetails.*, Person.*
FROM       Person
           INNER JOIN PersonalDetails
           ON Person.PersonID = PersonalDetails.PersonID
              INNER JOIN SalesPerson ON Person.PersonID = SalesPerson.PersonID

Imagine that a particular application could have its own view of what you wish the database looked like. Figure 1-2 reshapes the schema.

Figure 1-2. Person data shaped to match your business objects

All of the fields from PersonalDetails are now part of Person. And SalesPerson is doing something that is not even possible in a database; it is deriving from Person, just as you would in an object model.

Now imagine that you can write a LINQ query that looks like this:

VB
From p In People.OfType(Of SalesPerson) Select p
C#
from p in People.OfType<SalesPerson>() select p

In return, you will have a set of SalesPerson objects with all of the properties defined by this model (see Figure 1-3).

Figure 1-3. The SalesPerson object
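To make the inheritance idea concrete without a database, here is a minimal in-memory sketch. The Person and SalesPerson classes, their property names (including SalesQuota), and the sample values are hypothetical stand-ins for the entity classes, and the query runs through LINQ to Objects rather than LINQ to Entities. Only the shape of the query is meant to match the one above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes; property names are assumptions.
public class Person
{
    public int PersonID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// SalesPerson derives from Person, as in the reshaped model.
public class SalesPerson : Person
{
    public decimal SalesQuota { get; set; }
}

public static class InheritanceSketch
{
    // In-memory substitute for the People entity set (sample data only).
    public static List<Person> BuildPeople()
    {
        return new List<Person>
        {
            new Person { PersonID = 1, FirstName = "Ana", LastName = "Ruiz" },
            new SalesPerson { PersonID = 2, FirstName = "Ben", LastName = "Cho", SalesQuota = 50000m },
            new SalesPerson { PersonID = 3, FirstName = "Dee", LastName = "Marsh", SalesQuota = 75000m }
        };
    }

    // Same query shape as in the text: keep only the SalesPerson instances.
    public static List<SalesPerson> GetSalesPeople(IEnumerable<Person> people)
    {
        var query = from p in people.OfType<SalesPerson>() select p;
        return query.ToList();
    }

    public static void Main()
    {
        foreach (SalesPerson sp in GetSalesPeople(BuildPeople()))
        {
            // Inherited (LastName) and derived (SalesQuota) properties are both available.
            Console.WriteLine("{0} {1}: {2}", sp.FirstName, sp.LastName, sp.SalesQuota);
        }
    }
}
```

The plain Person is filtered out, and each result exposes the inherited properties alongside its own, with no join in sight.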

Note

LINQ exists only in the C# and Visual Basic languages. With the Entity Framework there is another way to express queries, which not only allows you to use other languages, but also provides additional benefits that you can take advantage of as necessary. It’s called Entity SQL, and you will learn much more about it and LINQ to Entities in Chapters 3 and 4.

This is the crux of how the Entity Framework can remove the pain of having not only to interact with the database, but also to translate the tabular data into objects.

.NET is but one tool that uses an EDM. The next version of SQL Server will use an EDM for Reporting Services. You will begin to see other Microsoft applications that will adopt the EDM concept as well. In fact, you will find that model-driven development in general is getting more and more attention from Microsoft.

When working with the Entity Framework, you will use a particular implementation of an EDM. In the Entity Framework, an EDM is represented by an EDMX file at design time that is split into a set of three XML files at runtime.

The Entity in “Entity Framework”

The items described in the EDM are called entities. The objects that are returned are based on these entities and are called entity objects. They differ from typical domain objects in that they have properties but no behavior apart from methods to enable change tracking.

Figure 1-4 shows the class diagram for both of the classes that the model generates automatically. Each class has a factory method and methods that are used to notify the Entity Framework of whether a property has changed.

Figure 1-4. Class diagrams for the Person and SalesPerson entities

You can add business logic to the generated classes, pull the results into your own business objects, and even link your business objects to the EDM and remove the generated classes. But by definition, the entities describe only their schema.

In addition to being able to reshape the entities in the data model, you can define relationships between entities. Figure 1-5 adds a Customer entity (also deriving from Person) and an Order entity to the model. Notice the relationship lines between SalesPerson and Order, showing a one-to-many relationship between them. There is also a one-to-many relationship between Customer and Order.

Figure 1-5. SalesPerson and Customer entities, each with a relationship to Order entities

When you write queries against this version of the model, you don’t need to use JOINs. The model provides navigation between the entities.

The following LINQ to Entities query retrieves order information along with information about the customer. It navigates into the Customer property of the Order to get the FirstName and LastName of the Customer:

VB
From o In context.Orders
Select o.OrderID, o.OrderNumber, o.Customer.FirstName, o.Customer.LastName
C#
from o in context.Orders
select new { o.OrderID, o.OrderNumber, o.Customer.FirstName, o.Customer.LastName }

Once that data is in memory, you can navigate through each object and its properties (for example, myOrder.Customer.LastName) just as readily.

The Entity Framework also lets you retrieve graphs, which means you can return shaped data such as a Customer with all of its Order details already attached.
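The navigation and graph ideas can be tried out with plain objects. The following is only a sketch: the Customer and Order classes, the sample values, and the helper names are assumptions made for illustration, and the query runs in memory through LINQ to Objects instead of being translated into a store query by the Entity Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Customer and Order entities.
public class Customer
{
    public Customer() { Orders = new List<Order>(); }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<Order> Orders { get; private set; }   // navigation: customer to orders
}

public class Order
{
    public int OrderID { get; set; }
    public string OrderNumber { get; set; }
    public Customer Customer { get; set; }             // navigation: order to customer
}

public static class NavigationSketch
{
    // Builds a small graph: one customer with two orders attached (sample data only).
    public static List<Order> BuildOrders()
    {
        var customer = new Customer { FirstName = "Ana", LastName = "Ruiz" };
        var first = new Order { OrderID = 1, OrderNumber = "SO-1001", Customer = customer };
        var second = new Order { OrderID = 2, OrderNumber = "SO-1002", Customer = customer };
        customer.Orders.Add(first);
        customer.Orders.Add(second);
        return new List<Order> { first, second };
    }

    // Same projection as in the text: reach the customer's names through
    // the Customer property instead of writing a join.
    public static List<string> DescribeOrders(IEnumerable<Order> orders)
    {
        var query = from o in orders
                    select new { o.OrderID, o.OrderNumber, o.Customer.FirstName, o.Customer.LastName };
        return query.Select(r => r.OrderNumber + " for " + r.FirstName + " " + r.LastName).ToList();
    }

    public static void Main()
    {
        List<Order> orders = BuildOrders();
        foreach (string line in DescribeOrders(orders))
        {
            Console.WriteLine(line);
        }

        // Graph navigation: start at an order, walk to its customer, then to all of that customer's orders.
        Customer c = orders[0].Customer;
        Console.WriteLine("{0} has {1} orders", c.LastName, c.Orders.Count);
    }
}
```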

These are some of the huge benefits to querying against a data model, rather than directly against the database.

Choosing Your Backend

You may have noticed that I have not mentioned the actual data store that owns the data being queried. The model doesn’t have any knowledge of the data store—what type of database it is, much less what the schema is. And it doesn’t need to.

The database you choose as your backend will have no impact on your model or your code.

The Entity Framework communicates with the same ADO.NET data providers that ADO.NET already uses, but with a caveat. The provider must be updated to support the Entity Framework. The provider takes care of reshaping the Entity Framework’s queries and commands into native queries and commands. All you need to do is identify the provider and a database connection string so that the Entity Framework can get to the database.
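As a rough illustration of identifying the provider and the database connection string, the sketch below assembles an Entity Framework connection string with the EntityConnectionStringBuilder class from System.Data.EntityClient (it needs a reference to System.Data.Entity.dll). The server name, database name, and the Model.csdl, Model.ssdl, and Model.msl resource names are placeholders chosen for this example, not values from a real project.

```csharp
using System;
using System.Data.EntityClient;   // Entity Framework v1, in System.Data.Entity.dll
using System.Data.SqlClient;

public static class ConnectionSketch
{
    public static EntityConnectionStringBuilder BuildConnectionString()
    {
        // Store-specific part: an ordinary SqlClient connection string.
        // The server and database names are placeholders.
        var store = new SqlConnectionStringBuilder();
        store.DataSource = ".";
        store.InitialCatalog = "MyDatabase";
        store.IntegratedSecurity = true;

        // Entity Framework part: which provider to use, how to reach the store,
        // and where the model's schema and mapping files can be found.
        var entity = new EntityConnectionStringBuilder();
        entity.Provider = "System.Data.SqlClient";
        entity.ProviderConnectionString = store.ToString();
        entity.Metadata = "res://*/Model.csdl|res://*/Model.ssdl|res://*/Model.msl"; // hypothetical file names
        return entity;
    }

    public static void Main()
    {
        // Building the string does not open a connection to any database.
        Console.WriteLine(BuildConnectionString().ToString());
    }
}
```

The provider name and the store connection string are the only database-specific pieces in this sketch; the queries you write against the model do not mention either one.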

This means that if you need to write applications against a number of different databases, you won’t have to learn the ins and outs of each database. You can write queries with the Entity Framework’s syntax (either LINQ to Entities or Entity SQL) and never have to worry about the differences between the databases. If you need to take advantage of functions or operators that are particular to a database, Entity SQL allows you to do that as well.

Note

You may have noticed I use the term data store rather than always referring to the database. Although the Entity Framework currently focuses on working with databases, Microsoft’s vision is to work with any relational store—for example, an XML file with a known schema.

Available Providers

Microsoft’s SqlClient API that is included with Visual Studio 2008 SP 1 supports the Entity Framework. It will allow you to use SQL Server 2000, 2005, and 2008. You can use the full or Express version of SQL Server 2005 and 2008 and the full version of SQL Server 2000. SQL Server CE version 3.5 supports the Entity Framework as well in the System.Data.SqlServerCe.3.5 API.

At the time of this writing, a host of other providers are available—and more are on their way—that will allow you to use Oracle, IBM databases, SQL Anywhere, MySQL, SQLite, VistaDB, and many other databases. The providers are being written by the database vendors as well as by third-party vendors.

Note

On the Resources page of this book’s website, you can find a list of provider APIs that support the Entity Framework.

Access and ODBC

A provider that supports the Entity Framework needs to have specific knowledge about the type of database it is connecting to. It needs to be aware of the available functions and operators for the database, as well as the proper syntax for native queries. Open Database Connectivity (ODBC) providers provide generic access to a variety of databases, including Access, and cannot furnish the necessary database particulars to act as a provider for the Entity Framework. Therefore, ODBC is not a valid provider for the Entity Framework. Unless someone creates a provider specifically for Access, you won’t be able to use it with Entity Framework applications. Microsoft does not have plans to build an Access provider because the demand is too low.

Entity Framework Features

In addition to the EDM, the Entity Framework provides a set of .NET APIs that let you write .NET applications using the EDM. It also includes a set of design tools for designing the model. Following is a synopsis of the Entity Framework’s key features.

The Entity Data Model

Although the Entity Framework is designed to let you work directly with the classes from the EDM, it still needs to interact with the database. The conceptual data model that the EDM describes is stored in an XML file whose schema identifies the entities and their properties. Behind the conceptual schema described in the EDM is another pair of schema files that map your data model back to the database. One is an XML file that describes your database and the other is a file that provides the mapping between your conceptual model and the database.

During query execution and command execution (for updates), the Entity Framework figures out how to turn a query or command that is expressed in terms of the data model into one that is expressed in terms of your database.

When data is returned from the database, it does the job of shaping the database results into the entities and further materializing objects from those results.

Entity Data Model Design Tools

The screenshots in Figures 1-2 and 1-3 are taken from the EDM Designer. It is part of Visual Studio and provides you with a way to work visually with the model rather than tangle with the XML. You will work with the Designer right away in Chapter 2, and you’ll learn how to use it to do some more advanced modeling, such as inheritance, in Chapter 12. You will also learn about the Designer’s limitations, such as the fact that it does not support all of the features of the EDM. With some of the less frequently used EDM features, you’ll have to work directly with the XML after all. In Chapter 2, you will get a look at the XML and how it relates to what you see in the Designer so that when it comes time to modify it in Chapter 12, you’ll have some familiarity with the raw schema files.

The Designer also allows you to map stored procedures to entities, which you’ll learn about in Chapter 6. Unfortunately, support for stored procedures in the Designer has its limitations as well, so Chapter 13 will show you how to achieve what you can’t do with the Designer.

Another notable feature of the Designer is that it will let you update the model from the database to add additional database objects that you did not need earlier or that have been added to the database since you created the model.

The Entity Data Model Wizard

One of the EDM design tools is the Entity Data Model Wizard. It allows you to point to an existing database and create a model directly from the database so that you don’t have to start from scratch. Once you have this first pass at the model, you can begin to customize the model in the Designer.

This first release of the Entity Framework is much more focused on creating models from existing databases. Although it is possible to begin with an empty model, it’s much more challenging to create the model first and then wire it up to an existing database.

Note

Frequently, developers ask about the possibility of generating a database from the model. The current version of the Entity Framework Design Tools (in Visual Studio 2008 SP1) does not have this ability. However, the next version, which will be part of Visual Studio 2010, will include this feature.

Managing Objects with Object Services

The Entity Framework’s most prominent feature set, and the one you are likely to work with most often, is referred to as Object Services. Object Services sits on top of the Entity Framework stack, as shown in Figure 1-6, and provides all the functionality needed to work with objects that are based on your entities. Object Services provides a class called EntityObject and can manage any class that inherits from EntityObject. Its work includes materializing objects from the results of queries against the EDM, keeping track of changes to those objects, managing relationships between objects, and saving changes back to the database.

In between querying and updating, Object Services provides a host of capabilities to interact with entity objects, such as automatically working with a lower level of the Entity Framework to do all of the work necessary to make calls to the database and deal with the results. Object Services also provides serialization (both XML and binary). You will see this pattern used in Chapters 20 through 22.

Figure 1-6. The Entity Framework stack

Change Tracking

Once an entity object has been instantiated, either as a result of data returned from a query or by instantiating a new object in code, Object Services can keep track of that object. This is the default for objects returned from queries. When Object Services manages an object, it can keep track of changes made to the object’s properties or its relationships to other entity objects.

Object Services then uses the change-tracking information when it’s time to update the data. It constructs Insert, Update, and Delete commands for each object that has been added, modified, or deleted by comparing the original values to the current values of the entity. If you are using stored procedures in conjunction with entities, it will pass the current values (and any original values specifically identified) to those procedures.
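The idea of comparing original values to current values can be sketched in a few lines. This is a deliberately simplified illustration, not how Object Services is implemented: SnapshotTracker, Attach, and GetModifiedProperties are hypothetical names, and the tracker handles only the two properties of a made-up Person class.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity; it carries no change-tracking state of its own.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical tracker that keeps the original values outside the entity.
public class SnapshotTracker
{
    private readonly Dictionary<Person, Dictionary<string, object>> originals =
        new Dictionary<Person, Dictionary<string, object>>();

    // Copies the entity's current property values into a name/value map.
    private static Dictionary<string, object> Read(Person p)
    {
        return new Dictionary<string, object>
        {
            { "FirstName", p.FirstName },
            { "LastName", p.LastName }
        };
    }

    // Start tracking: remember the values the entity has right now.
    public void Attach(Person p)
    {
        originals[p] = Read(p);
    }

    // Compare the remembered values with the current ones and
    // return the names of the properties that differ.
    public List<string> GetModifiedProperties(Person p)
    {
        var modified = new List<string>();
        Dictionary<string, object> current = Read(p);
        foreach (KeyValuePair<string, object> pair in originals[p])
        {
            if (!object.Equals(pair.Value, current[pair.Key]))
            {
                modified.Add(pair.Key);
            }
        }
        return modified;
    }
}

public static class ChangeTrackingSketch
{
    public static void Main()
    {
        var tracker = new SnapshotTracker();
        var person = new Person { FirstName = "Ana", LastName = "Ruiz" };
        tracker.Attach(person);

        person.LastName = "Ruiz-Cho";

        foreach (string name in tracker.GetModifiedProperties(person))
        {
            Console.WriteLine("Modified: " + name);   // prints "Modified: LastName"
        }
    }
}
```

An update command built from this information would need to set only LastName. Note that the original values live in the tracker, not in the Person object itself.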

Relationship Management

Relationships are a critical piece of the EDM, and in Object Services, relationships are objects. If a SalesPerson has two Orders, there will be one relationship object between the SalesPerson and the first order and another object representing a relationship between the SalesPerson and the second order.

This paradigm enables the Entity Framework to have a generic way of handling a wide variety of modeling scenarios. But as you will find, especially in Chapter 15, which dives deeply into relationships, this also requires that you have a very good understanding of how these relationships work. Some of the rules of engagement when working with related data are not very intuitive, and you can write code that will raise plenty of exceptions if you break these rules. Chapter 15 will provide insight into relationships in the EDM so that you will be able to work with them in an expert manner.

Data Binding

You can use entity objects in many .NET data-binding scenarios. In Windows Forms, you can use entities as a data source for data-bound controls or as the data source for BindingSource controls, which orchestrate the binding between objects and UI controls on the form. Chapter 8 provides a well-informed walkthrough for using entities with BindingSource controls to edit and update data. Chapter 20 focuses on separating the data access and other business logic from the user interface to provide better architecture for your applications.

Chapter 8 also provides a walkthrough for data-binding entities in Windows Presentation Foundation (WPF) applications.

For ASP.NET, there is a new DataSource control called the EntityDataSource that works in a similar way to the SqlDataSource and LinqDataSource controls, allowing you to declaratively bind entity objects to your user interface. Chapter 11 is all about using the EntityDataSource.

For layered applications, Chapter 21 focuses on pulling all of the data access tasks out of the ASP.NET user interface.

EntityClient

EntityClient is the other major API in the Entity Framework. It provides the functionality necessary for working with the store queries and commands (in conjunction with the database provider): connecting to the database, executing the commands, retrieving the results from the store, and reshaping the results to match the EDM.

You can work with EntityClient directly or work with Object Services, which sits on top of EntityClient. EntityClient is only able to perform queries, and it does this on behalf of Object Services. The difference is that when you work directly with EntityClient, you will get tabular results (though the results can be shaped). If you are working with Object Services, it will transform the tabular data created by EntityClient into objects.

The tabular data returned by EntityClient is read-only. Only Object Services provides change tracking and the ability to save changes back to the data store.

The Entity Framework in Web Services

You can use the Entity Framework anywhere you can use ADO.NET, including web services and WCF services. Chapter 14 walks you through the process of providing services for entities, and Chapter 22 revisits WCF services using much of the knowledge you will gain in between the two chapters.

At the same time the Entity Framework was released, another new technology called ADO.NET Data Services (which you may know from its original code name, “Astoria”) was also released. ADO.NET Data Services provides an automated way to expose data through an EDM, a LINQ to SQL model, or other particular interfaces, to allow wide access to your data using HTTP commands such as GET and PUT. Although this is a great way to expose your data when you don’t need to have a lot of control over how it is used, I won’t cover this topic in this book. Here you will learn to write services that are designed more specifically for an enterprise.

What About ADO.NET DataSets and LINQ to SQL?

The Entity Framework is only part of the ADO.NET stack. DataSets and DataReaders are an intrinsic part of ADO.NET, and LINQ to SQL was part of the original release of Visual Studio 2008.

DataSets

DataSets and DataReaders are not going away. All of your existing investment will continue to function and you can continue to use this methodology of retrieving data and interacting with it. The Entity Framework provides a completely different way to retrieve and work with data. You would not integrate the two technologies—for example, using the Entity Framework to query some data, and then pushing it into a data set; there would be no point. You should use one or the other. As you learn about the Entity Framework, you will find that it provides a very different paradigm for accessing data. You may find that the Entity Framework fits for some projects, but not others where you may want to stick with DataSets.

The Entity Framework uses DataReaders as well as its own EntityDataReader, which inherits from the same DbDataReader class that SqlDataReader does. An EntityDataReader is what a query with EntityClient returns. In fact, you’ll find that the code querying the EDM with EntityClient looks very similar to the code that you use to query the database directly with ADO.NET. It uses connections, commands, and command parameters, and returns a DbDataReader that you can read as you would any other DataReader, such as SqlDataReader.

Some ADO.NET tools that are not available with the Entity Framework are query notification and ASP.NET’s SqlCacheDependency. Additionally, ADO.NET’s SqlBulkCopy requires a DataReader, a DataTable, or an array of DataRows to stream data into the database; therefore, you cannot do client-side bulk loading with the Entity Framework. The Entity Framework also has no equivalent to the batch updates that an ADO.NET DataAdapter performs when its UpdateBatchSize property is set. Therefore, when the Entity Framework saves changes to the database, it can send only one command at a time.

A few things are easier with DataSets than with the Entity Framework, such as unit testing and change tracking across processes. You’ll find a discussion of each of these in “Entity Framework Pain Points” later in this chapter.

LINQ to SQL

LINQ to SQL and the Entity Framework look similar on the surface. They both provide LINQ querying against a database using a data model.

Why did Microsoft create two similar technologies? LINQ to SQL evolved from the LINQ project, which came out of folks working with language development. The Entity Framework was a project of the Data Programmability team and was focused on the Entity SQL language. By the time each technology had come along far enough that it was being shown to other teams at Microsoft, it was clear that Microsoft had two great new technologies that could target different scenarios. The Entity Framework team adapted LINQ to work with entities, which confused developers even more because LINQ to Entities and LINQ to SQL look so much alike.

LINQ to SQL eventually was brought into Microsoft’s Data Programmability team, and in November 2008 the team announced that because the technologies target the same problems, going forward they would focus on developing the Entity Framework while maintaining and tweaking LINQ to SQL. This is not a happy situation for many developers who have made an investment in LINQ to SQL. Although Microsoft has made no statements regarding deprecating this great and fairly new tool, it has said that it will provide a migration path from LINQ to SQL to the Entity Framework and will recommend the Entity Framework over LINQ to SQL.

The last chapter of this book will highlight some differences between the Entity Framework and LINQ to SQL that will be more comprehensible once you have some knowledge about the Entity Framework.

Entity Framework Pain Points

This book is about the Entity Framework version 1, which Microsoft released in July 2008 as part of Visual Studio 2008 Service Pack 1. Microsoft has a big vision for the Entity Framework and has made an explicit choice to get as much as it can into the Visual Studio 2008 SP 1 release. Although the Entity Framework is an impressive technology with enormous flexibility, a lot of functionality is not exposed in discoverable and easy-to-use ways. Additionally, as with any technology, there are features that some developers find impossible to live without, and they will most likely wait until version 2 to begin to put the Entity Framework into production.

This book spends a lot of time looking into the depths of the APIs to show you how to get around some of these limitations, and attempts to point out potholes, hiccups, and omissions.

The Entity Framework Designer

The Designer goes a long way toward giving you a visual means of working with the EDM, but not every capability of the EDM is easy to achieve with the Designer; some require work in the raw XML. Although most would agree that the features you have to code manually are the ones that will be used less commonly, a few do stand out.

Stored procedures

The Designer supports a narrow use of stored procedures. Using the Designer, you can override the Entity Framework’s automatic generation of Insert, Update, and Delete commands by mapping an entity to a set of stored procedures, subject to two important rules. The first is that the stored procedures must line up with the entity. For inserts and updates, that means the values for the stored procedure parameters must come from the entity’s properties. The second rule is that you must override all three commands (Insert, Update, and Delete) or none of them, so you’ll need to map all three functions.

In addition, the Designer supports stored procedures that read data, as long as the results map directly to an entity. If a procedure returns results that do not match an existing entity, you will need to manually create an entity for it to map to. That’s not too hard in the Designer, but there’s another requirement that will necessitate doing some work in the XML.

Chapter 6 walks you through the scenarios that the Designer supports easily, and Chapter 13 digs into the scenarios that will take more effort and walks you through the necessary steps.

Unsupported EDM types

The EDM has a very rich set of modeling capabilities, which I demonstrate in Chapter 12. But the Designer does not support all of these advanced modeling techniques, requiring you to handcode some of them in the XML. In most cases, you can continue to work with the model in the Designer even though you won’t see these particular model types, though you can leverage them in your code. However, there are a few model types, such as the very useful complex type, that, when included in the XML, will make it impossible to open the model in the Designer. The Designer is well aware of these limitations, and at least provides an alternative view that displays a message explaining why the model can’t be opened. You’ll learn about this in Chapter 13.

Generating a database from the model

The EDM is based on a data-driven design with the assumption that there is an existing database for the model to map back to. This makes a lot of sense if you are building an application for an existing database. Domain-driven developers prefer to create their object model first and have a database generated from that. The current Designer does not support this capability. However, model-first development will be possible in the next version of the Entity Framework tools, which will ship in Visual Studio 2010. In the meantime, developers in the community and at Microsoft are playing with a code generator called T4 Templates (Text Template Transformation Toolkit) to read the model and generate SQL script files that create the database objects for you.

A host of little things

As more developers use the new tools, they are finding other things that could make their lives easier and have contributed to an MSDN Forum thread titled “V2 Wish List.” You can find a link to this thread on the Resources page of this book’s website.

Challenges with Change Tracking in Distributed Applications

To put it mildly, using the Entity Framework in distributed applications can be challenging when it comes to the change tracking performed by Object Services, because the change-tracking information is not stored in the entities and instead is maintained by a separate set of Object Services objects. When an entity is transferred to another process, it is disconnected from the object that contains its change-tracking information. Those objects that own the tracking data are not serializable, so they can’t easily be shipped across to the new process along with the entities. Therefore, when the entities arrive at the new process, they have no idea whether they are new or preexisting, or whether they have been edited or marked for deletion. There’s no way to simply use the ObjectContext’s default method for saving changes to the database without doing additional work.

I address and dissect this problem in a number of places throughout this book, and provide a variety of coding patterns to help you succeed at moving entities around in distributed applications such as services or layered ASP.NET applications. Starting with Chapter 9, which focuses on Object Services, many of the chapters throughout this book provide detailed information regarding change tracking and working with entities across tiers, whether you use the ASP.NET Entity Data Source, as in Chapter 11, or write a WCF service with Data Transfer Objects (DTOs), as in Chapter 22.

Domain-Driven Development

On the cover of this book, you may have noticed the phrase “Building Data-Centric Apps with the ADO.NET Entity Framework.” Entity Framework version 1 is data-centric in the features it implements. Domain-driven development begins with the model, not the database. Many developers who embrace the tenets of domain-driven design will find the Entity Framework to be too restrictive. However, some of the advocates of this point of view are working with the Entity Framework team to enable version 2 to expand its capabilities so that you can use it with this approach.

Unit Testing

Although it is possible to build unit tests with Entity Framework classes, the fact that entities must either inherit from the EntityObject class or implement some of the key interfaces of Object Services makes it impossible to decouple the entity classes from the mechanism that executes queries. Therefore, you cannot unit-test your Entity Framework code without causing the database to be accessed. Hitting the database while performing unit tests is not a favorable option for most developers. Hopefully you wouldn’t dream of testing against a live database, but even with a copy you would need to deal with rollbacks and an assortment of other complications. Because of this, many developers have just written the Entity Framework off as “untestable.” I’ve seen developers implement unit testing so that their tests automatically create and then remove a new database on the fly. Mocking is another path to overcome this limitation. This book will not delve into the specifics of unit testing with the Entity Framework.

Programming the Entity Framework

As you read through this book, you will gain experience in designing EDMs and using the Entity Framework to write applications, as well as dig deep into the APIs to learn how to manipulate entity objects and have granular control over much of their behavior. A lot of functionality is very accessible, and there’s a lot of hidden power. You will learn what’s under the covers so that you can realize the true benefits of the Entity Framework.
