Chapter 1. Language Integrated Query (LINQ)

1.0. Introduction

Language Integrated Query (LINQ) is a new way to access data from many different sources. LINQ provides a single querying model that can operate against different data domains individually or all together in a single query. LINQ brings the ability to query datato .NET languages, and some of the languages have provided extensions to make its use even more intuitive. One of these languages is C#; there are a number of extensions to the language in C# 3.0 that help to facilitate querying in a rich and intuitive manner.

Traditional object-oriented programming is based on an imperative style wherein the developer describes in detail not only what they want to happen, but also describes a majority of the detail regarding exactly how this should be performed through code. LINQ helps to take coding down a more declarative path that facilitates describing what the developer wants to do instead of describing how to accomplish the goal in detail. LINQ also enables a more functional style of programming. These changes can dramatically shorten the amount of code it takes to perform some tasks. That said, object-oriented programming is still very much alive and well in C# and .NET, but for the first time the language is offering the chance to choose the style of programming based on your needs. Note, however, that LINQ will not fit into every scenario and is not a replacement for good design or practice. You can write bad code using LINQ just as you can write bad object-oriented or procedural code. The trick, like it always has been, is to figure out when it is appropriate to use which technique.

The initial version of LINQ encompasses a number of data domains as listed here:

  • LINQ to Objects

  • LINQ to XML

  • LINQ to ADO.NET

  • LINQ to SQL / LINQ to DataSet / LINQ to Entities

There are a number of other “LINQ to” implementations currently under development, but these are Microsoft’s initial offerings. A few of the others in development are LINQ to SharePoint, LINQ to LDAP, and even a LINQ to Amazon implementation. The only one of the initial Microsoft set that won’t be ready immediately with the release of Visual Studio 2008 is LINQ to Entities, or the ADO.NET Entity Framework, as it is also known. LINQ to Entities will be released shortly after Visual Studio 2008.

As you begin your examination of LINQ, it is easy to begin to think of it as a new object relational mapping layer, or some neat new widgets on IEnumerable<T>, ora new XML API, or even just an excuse to not write SQL directly anymore. You can do any of these things, but we would encourage you to think of LINQ as how your program asks for, calculates, or transforms sets of data from both single and disparate sources. It takes a bit of time and playing with LINQ for its functionality to click, but once it does, you will be surprised at what you can do with it. This chapter begins to show some of what is possible with LINQ and will hopefully start you down the path toward thinking of which of your scenarios are applicable to this new capability in C#.

1.1. Query a Message Queue

Problem

You want to be able to query for messages with specific criteria from an existing message queue.

Solution

Write a query using LINQ to retrieve messages using the System.Messaging.MessageQueue type:

	  // open an existing message queue
	  string queuePath = @".\private$\LINQMQ";
	  MessageQueue messageQueue = new MessageQueue(queuePath);
	      BinaryMessageFormatter messageFormatter = new BinaryMessageFormatter();

	  var query = from Message msg in messageQueue
	     // The first assignment to msg.Formatter is so that we can touch the
	     // Message object. It assigns the BinaryMessageFormatter to each message
	     // instance so that it can be read to determine if it matches the criteria.
	     // Next, a check is performed that the formatter was correctly assigned
	     // by performing an equality check, which satisfies the Where clause's need
	     // for a boolean result while still executing the assignment of the formatter.
	              where ((msg.Formatter = messageFormatter) == messageFormatter) &&
	                  int.Parse(msg.Label) > 5 &&
	                  msg.Body.ToString().Contains('D')
	              orderby msg.Body.ToString() descending
	              select msg;
	  // Check our results for messages with a label > 5 and containing a 'D' in the name
	    foreach (var msg in query) 
	    {
	        Console.WriteLine("Label: " + msg.Label + " Body: " + msg.Body); 
	    }

The query retrieves the data from the MessageQueue by selecting the messages where the Label is a number greater than 5 and the message body contains a capital letter “D”. These messages are then returned sorted by the message body in descending order.

Discussion

There are a number of new keywords in this code using LINQ that were not previously used to access a message queue:

var

Instructs the compiler to infer the type of the variable from the right side of the statement. In essence, the type of the variable is determined by what is on the right side of the operator separating the var keyword and the expression. This allows for implicitly typed local variables.

from

The from keyword sets out the source collection to query against and a range variable to represent a single element from that collection. It is always the first clause in a query operation. This may seem counterintuitive if you are used to SQL and expect select to be first, but if you consider that first we need what to work on before we determine what to return, it makes sense. If we weren’t used to how SQL does this already, it would be SQL that seems counterintuitive.

where

The where keyword specifies the constraints by which the elements to return are filtered. Each condition must evaluate to a Boolean result, and when all expressions evaluate to true, the element of the collection is allowed to be selected.

orderby

This keyword indicates that the result set should be sorted according to the criteria specified. The default order is ascending, and elements use the default comparer.

select

Allows the projection of an entire element from the collection, the construction of a new type with parts of that element and other calculated values, or a sub-collection of items into the result.

The messageQueue collection is of type System.Messaging.MessageQueue, which implements the IEnumerable interface. This is important, as the LINQ methods provided need a set or collection to implement at least IEnumerable for it to work with that set or collection. It would be possible to implement a set of extension methods that did not need IEnumerable, but most people will not have the need to. It is even better when the set or collection implements IEnumerable<T>, as LINQ then knows the type of element in the set or collection that it is working with, but in this case, MessageQueue has been in the framework for a while and isn’t likely to change, so the query provides the element type Message, as shown in the “from” line:

	var query = from Message msg in messageQueue

For more about this, see Recipe 1.1.

In the Solution, the messages in the queue have been sent with the use of the BinaryFormatter. To be able to query against them correctly, the Formatter property must be set on each Message before it is examined as part of the where clause:

	// The first assignment to msg.Formatter is so that we can touch the
	// Message object. It assigns the BinaryMessageFormatter to each message
	// instance so that it can be read to determine if it matches the criteria.
	// This is done, and then it checks that the formatter was correctly assigned
	// by performing an equality check, which satisfies the Where clause's need
	// for a boolean result, while still executing the assignment of the formatter.
	where ((msg.Formatter = messageFormatter) == messageFormatter) &&

There are two uses of the var keyword in the solution code:

	var query = from Message msg in messageQueue 
	            ...

	foreach (var msg in query)
	...

The first usage infers that an IEnumerable<Message> will be returned and assigned to the query variable. The second usage infers that the type of msg is Message because the query variable is of type IEnumerablew<Message> and the msg variable is an element from that IEnumerable.

It is also worth noting that when performing operations in a query, actual C# code can be used to determine the conditions, and there is more than just the predetermined set of operators. In the where clause of this query, both int.Parse and string. Contains are used to help filter messages:

	int.Parse(msg.Label) > 5 &&
	msg.Body.ToString().Contains('D')

See Also

Recipe 1.9, and the “MessageQueue class,” “Implicitly typed local variable,” “from keyword,” “where keyword,” “orderby keyword,” and “select keyword” topics in the MSDN documentation.

1.2. Using Set Semantics with Data

Problem

You would like to work with your collections using set operations for union, intersections, exceptions, and distinct items.

Solution

Use the Set operators provided as part of the Standard Query Operators to perform those operations.

Distinct:
	IEnumerable<string> whoLoggedIn = 
	    dailySecurityLog.Where(logEntry => logEntry.Contains("logged in")).Distinct( );
Union:
	// Union
	Console.WriteLine("Employees for all projects");
	var allProjectEmployees = project1.Union(project2.Union(project3));
Intersection:
	// Intersect
	Console.WriteLine("Employees on every project");
	var everyProjectEmployees = project1.Intersect(project2.Intersect(project3));
Exception:
	Console.WriteLine("Employees on only one project");
	var onlyProjectEmployees = allProjectEmployees.Except(unionIntersect);

Discussion

The Standard Query Operators are the set of methods that represent the LINQ pattern. This set includes operators to perform many different types of operations, such as filtering, projection, sorting, grouping, and many others, including set operations.

The set operations for the Standard Query Operators are:

  • Distinct

  • Union

  • Intersect

  • Except

The Distinct operator extracts all nonduplicate items from the collection or result set being worked with. Say, for example, that we had a set of strings representing login and logout behavior for a terminal services box for today:

	// Distinct
	string[] dailySecurityLog = {
	      "Bob logged in", 
	      "Bob logged out", 
	      "Bob logged in", 
	      "Bill logged in",
	      "Melissa logged in",
	      "Bob logged out",
	      "Bill logged out", 
	      "Bill logged in", 
	      "Tim logged in", 
	      "Scott logged in", 
	      "Scott logged out", 
	      "Dave logged in", 
	      "Tim logged out", 
	      "Bob logged in", 
	      "Dave logged out"};

From that collection, we would like to determine the list of people who logged in to the box today. Since people can log in and log out many times during the course of a day or remain logged in for the whole day, we need to eliminate the duplicate login entries. Distinct is an extension method on the System.Linq.Enumerable class (which implements the Standard Query Operators) that can be called on the string array (which supports IEnumerable) in order to get the distinct set of items from the set. For more information on extension methods, see Recipe 1.4. The set is produced by using another of the Standard Query Operators: Where. Where takes a lambda expression that determines the filter criteria for the set and examines each string in the IEnumerable<string> to determine if the string has “logged in.” Lambda expressions are inline statements (similar to anonymous methods) that can be used in place of a delegate. See Chapter 9 for more on lambda expressions. If the strings do, then they are selected. Distinct narrows down the set of strings further to eliminate duplicate “logged in” records, leaving only one per user:

	    IEnumerable<string> whoLoggedIn =
	        dailySecurityLog.Where(logEntry => logEntry.Contains("logged in")).Distinct(
	);
	    Console.WriteLine("Everyone who logged in today:");
	    foreach (string who in whoLoggedIn)
	    {
	        Console.WriteLine(who); 
	    }

To make things a bit more interesting, for the rest of the operators, we will work with sets of employees on various projects in a company. An Employee is a pretty simple class with a Name and overrides for ToString, Equals, and GetHashCode, as shown here:

	public class Employee
	{
	    public string Name { get; set; }   
	    public override string ToString()
	    {
	        return this.Name;
	    }
	    public override bool Equals(object obj) 
	    {
	        return this.GetHashCode().Equals(obj.GetHashCode()); 
	    }    
	    public override int GetHashCode() 
	    {
	        return this.Name.GetHashCode(); 
	    }
	}

You might wonder why Equals and GetHashCode are overloaded for such a simple class. The reason is that when LINQ performs comparisons of elements in the sets or collections, it uses the default comparison, which in turn uses Equals and GetHashCode to determine if one instance of a reference type is the same as another. If you do not provide the semantics in the reference type class to provide the same hash code or equals value when the data for two instances of the object is the same, then the instances will, by default, be different, as two reference types have different hash codes by default. We override that so that if the Name is the same for each Employee, the hash code and the equals will both correctly identify the instances as the same. There are also overloads for the set operators that take a custom comparer, which would also allow you to make this determination even for classes for which you can’t make the changes to Equals and GetHashCode.

Having done this, we can now assign Employees to projects like so:

	Employee[] project1 = {
	           new Employee(){ Name = "Bob" },
	           new Employee(){ Name = "Bill" },
	           new Employee(){ Name = "Melissa" },
	           new Employee(){ Name = "Shawn" } };
	Employee[] project2 = {
	           new Employee(){ Name = "Shawn" },
	           new Employee(){ Name = "Tim" },
	           new Employee(){ Name = "Scott" } };
	Employee[] project3 = {
	           new Employee(){ Name = "Bob" },
	           new Employee(){ Name = "Dave" },
	           new Employee(){ Name = "Tim" },
	           new Employee(){ Name = "Shawn" } };

To find all employees on all projects, use Unionl to get all nonduplicate Employees in all three projects and write them out:

	// Union
	Console.WriteLine("Employees for all projects:"); 
	var allProjectEmployees = project1.Union(project2.Union(project3)); 
	foreach (Employee employee in allProjectEmployees) 
	{
	    Console.WriteLine(employee);
	}

We can then use Intersect to get the Employees on every project:

	// Intersect
	Console.WriteLine("Employees on every project:");
	var everyProjectEmployees = project1.Intersect(project2.Intersect(project3));
	foreach (Employee employee in everyProjectEmployees) 
	{
	    Console.WriteLine(employee); 
	}

Finally, we can use a combination of Union and Except to find Employees that are only on one project:

	// Except
	var intersect1_3 = project1.Intersect(project3); 
	var intersect1_2 = project1.Intersect(project2);
	var intersect2_3 = project2.Intersect(project3); 
	var unionIntersect = intersect1_2.Union(intersect1_3).Union(intersect2_3);

	Console.WriteLine("Employees on only one project:");
	var onlyProjectEmployees = allProjectEmployees.Except(unionIntersect);
	foreach (Employee employee in onlyProjectEmployees)
	{
	    Console.WriteLine(employee); 
	}

See Also

The "Standard Query Operators,” “Distinct method,” “Union method,” "Intersect method,” and “Except method” topics in the MSDN documentation.

1.3. Reuse Parameterized Queries with LINQ to SQL

Problem

You need to execute the same parameterized query multiple times with different parameter values, but you want to avoid the overhead of parsing the query expression tree to build the parameterized SQL each time the query executes.

Solution

Use the CompiledQuery.Compile method to build an expression tree that will not have to be parsed each time the query is executed with new parameters:

	var GetEmployees =
	        CompiledQuery.Compile((Northwind db, string ac, string ttl) =>
	                   from employee in db.Employees 
	                   where employee.HomePhone.Contains(ac) &&
	                         employee.Title == ttl 
	                   select employee);

	Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);

The first time the query executes is when it actually compiles (where GetEmployees is called the first time in the foreach loop). Every other iteration in this loop and in the next loop use the compiled version, avoiding the expression tree parsing:

	foreach (var employee in GetEmployees(dataContext, "(206)", "Sales Representative"))
	{
	    Console.WriteLine("{0} {1}", 
	        employee.FirstName, employee.LastName); 
	}       

	foreach (var employee in GetEmployees(dataContext, "(71)", "Sales Manager"))
	{
	    Console.WriteLine("{0} {1}", 
	        employee.FirstName, employee.LastName);
	}

Discussion

We used var for the query declaration, as it was cleaner, but what var actually is in this case is:

	Func<Northwind, string, string, IQueryable<Employees>>

which is the delegate signature for the lambda expression we created that contains the query. That’s right, all this crazy new query stuff, and we just instantiated a delegate. To be fair, the Func delegate was brought about in the System namespace as part of LINQ, so do not dismay, we are still doing cool newstuff!

This illustrates that we are not returning an IEnumerable or IQueryable based result set from Compile, but rather an expression tree. This is the expression tree that represents the potential for a query rather than the query itself. Once we have that tree, LINQ to SQL then has to perform the conversion from the tree to actual SQL that can run against the database. Interestingly enough, if we had put a call to string.Format in as part of detecting the area code in the home phone number, we would get a NotSupportedException that informs us that string.Format can’t be translated to SQL:

	    where employee.HomePhone.Contains(string.Format("({0})",ac)) &&

	System.NotSupportedException:
	Method 'System.String Format(System.String,System.Object)'
	  has no supported translation to SQL.

This is understandable, as SQL has no concept of .NET Framework methods for performing actions, but it is something to keep in mind as you design your queries that this is a limitation when using LINQ to SQL.

After the first execution, the query is compiled, and for every iteration after that, we do not pay the transformation cost for turning the expression tree into the parameterized SQL.

Compiling your queries is something that should be done for parameterized queries that get a lot of traffic, but if a query is infrequently used, it may not be worth the effort. As always, profile your code to see the areas where this could be useful.

See Also

The "CompiledQuery.Compile method” and "Expression Trees” topics in the MSDN documentation.

1.4. Sort Results in a Culture-Sensitive Manner

Problem

You want to ensure that when you sort in a query, the sort order is for an applicationspecific culture that may not be the same as the current thread’s current culture.

Solution

Use the overload of the OrderBy query operator, which accepts a custom comparer in order to specify the culture in which to perform comparisons:

	// Create CultureInfo for Danish in Denmark.
	CultureInfo danish = new CultureInfo("da-DK");

	CultureStringComparer comparer = new CultureStringComparer(danish,CompareOptions.
	None);
	var query = names.OrderBy(n => n, comparer);

Discussion

Handling localization issues such as sorting for a specific culture is a relatively trivial task in .NET if the current culture of the current thread is the culture you want to use. The framework classes that assist in handling culture issues in C# are accessed by including the System.Globalization namespace. This namespace would be included in order to make the code in the solution run. One example of not using the thread current culture would be in an application that needs to display a sorted list of words in Danish on a version of Windows XP that is set for U.S. English. The current thread in the application may have a CultureInfo for “en-US” and, by default, the sort order for OrderBy will use the current culture sort settings. To specify that this list should sort according to Danish rules, a bit of work is necessary in the form of a custom comparer:

	CultureStringComparer comparer = new CultureStringComparer(danish,CompareOptions.
	None);

The comparer variable is an instance of a custom comparer class CultureStringComparer defined as implementing the IComparer<T> interface specialized for strings. This class is used to provide the culture settings for the sort order:

	public class CultureStringComparer : IComparer<string>
	{
	    private CultureStringComparer() 
	    {
	    }

	    public CultureStringComparer(CultureInfo cultureInfo, CompareOptions options) 
	    {
	        if (cultureInfo == null) 
	            throw new ArgumentNullException("cultureInfo");

	        CurrentCultureInfo = cultureInfo; 
	        Options = options; 
	    }

	    public int Compare(string x, string y) 
	    {
	        return CurrentCultureInfo.CompareInfo.Compare(x, y, Options); 
	    }

	    public CultureInfo CurrentCultureInfo { get; set; }

	    public CompareOptions Options { get; set; }
	}

To demonstrate how this could be used, first we compile a list of words to order by. Since the Danish language treats the character “Æ” as an individual letter, sorting it after “Z” in the alphabet, and the English language treats the character “Æ” as a special symbol, sorting it before the letter “A” in the alphabet, this will demonstrate the sort difference:

	string[] names = { "Jello", "Apple", "Bar", "Æble", "Forsooth", "Orange", "Zanzibar"
	};

Now, we can set up the CultureInfos for both Danish and U.S. English and call OrderBy with the comparer specific to each culture. This query is not using the query expression syntax, but rather uses the functional style of IEnumerable<string>.OrderBy():

	// Create CultureInfo for Danish in Denmark.
	CultureInfo danish = new CultureInfo("da-DK");
	// Create CultureInfo for English in the U.S.
	CultureInfo american = new CultureInfo("en-US");

	CultureStringComparer comparer = new CultureStringComparer(danish, CompareOptions.
	None);
	var query = names.OrderBy(n => n, comparer);
	Console.WriteLine("Ordered by specific culture : " + comparer.CurrentCultur	eInfo.
	Name);
	foreach (string name in query)
	{
	    Console.WriteLine(name);
	}
	comparer.CurrentCultureInfo = american;
	query = names.OrderBy(n => n, comparer);
	Console.WriteLine("Ordered by specific culture : " + comparer.CurrentCultureInfo.
	Name);
	foreach (string name in query)
	{
	    Console.WriteLine(name);
	}

The output results below show that the word Æble is last in the Danish list and first in the U.S. English list:

	Ordered by specific culture : da-DK
	Apple
	Bar
	Forsooth
	Jello
	Orange
	Zanzibar
	Æble
	Ordered by specific culture : en-US
	Æble
	Apple
	Bar
	Forsooth
	Jello
	Orange
	Zanzibar

See Also

The “OrderBy,” “CultureInfo,” and “IComparer<T>” topics in the MSDN documentation.

1.5. Adding Functional Extensions for Use with LINQ

Problem

There are operations you perform on collections frequently that currently reside in utility classes. You would like to be able to have these operations be used on collections in a more seamless manner than having to pass the reference to the collection to the utility class.

Solution

Use extension methods to help achieve a more functional style of programming for your collection operations. For example, to add a weighted moving average calculation operation to numeric collections, implement a set of WeightedMovingAverage extension methods in a static class and then call them as part of those collections:

	decimal[] prices = new decimal[10] { 13.5M, 17.8M, 92.3M, 0.1M, 15.7M,
	                                     19.99M, 9.08M, 6.33M, 2.1M, 14.88M };
	Console.WriteLine(prices.WeightedMovingAverage());

	double[] dprices = new double[10] { 13.5, 17.8, 92.3, 0.1, 15.7,
	                                    19.99, 9.08, 6.33, 2.1, 14.88 };
	Console.WriteLine(dprices.WeightedMovingAverage());

	float[] fprices = new float[10] { 13.5F, 17.8F, 92.3F, 0.1F, 15.7F,
	                                  19.99F, 9.08F, 6.33F, 2.1F, 14.88F };
	Console.WriteLine(fprices.WeightedMovingAverage());

	int[] iprices = new int[10] { 13, 17, 92, 0, 15,
	                              19, 9, 6, 2, 14 };
	Console.WriteLine(iprices.WeightedMovingAverage());

	long[] lprices = new long[10] { 13, 17, 92, 0, 15,
	                                19, 9, 6, 2, 14 };
	Console.WriteLine(lprices.WeightedMovingAverage());

To provide WeightedMovingAverage for the full range of numeric types, methods for both the nullable and non-nullable numeric types are provided in the LinqExtensions class:

	public static class LinqExtensions
	{
	    public static decimal? WeightedMovingAverage(this IEnumerable<decimal?> source) 
	    {
	        if (source == null)
	            throw new ArgumentNullException("source");

	        decimal aggregate = 0.0M;
	        decimal weight;
	        int item = 1;
	        // count how many items are not null and use that
	        // as the weighting factor
	        int count = source.Count(val => val.HasValue);
	        foreach (var nullable in source)
	        {
	            if (nullable.HasValue)
	            {
	                weight = item / count;
	                aggregate += nullable.GetValueOrDefault() * weight;
	                count++;
	            }
	        } 
	        if (count > 0) 
	        {
	            return new decimal?(aggregate / count);
	        }
	        return null;
	    }
	    // The same method pattern as above is followed for each of the other 
	    // types and their nullable counterparts (double / double?, int / int?, etc.)


	    #region Extend Average...

	}

Discussion

Extension methods allow you to create operations that appear to be part of a collection. They are static methods that can be called as if they were instance methods, allowing you to extend existing types. Extension methods must also be declared in static classes that are not nested. Once a static class is defined with extension methods, the using directive for the namespace of the class makes those extensions available in the source file.

Tip

It is worth noting that if an instance method exists with the same signature as the extension method, the extension method will never be called. Conflicting extension method declarations will resolve to the method in the closest enclosing namespace.

You cannot use extension methods to create:

  • Properties (get and set methods)

  • Operators (+, -, = , etc…)

  • Events

Declaring an extension method is done by specifying the this keyword in front of the first parameter of a method declaration, and the type of that parameter is the type being extended. For example, in the Nullable<decimal> version of the WeightedMovingAverage method, collections that support IEnumerable<decimal?> (or IEnumerable<Nullable<decimal>>) are supported:

	public static decimal? WeightedMovingAverage(this IEnumerable<decimal?> source)
	{
	    if (source == null)
	        throw new ArgumentNullException("source");

	    decimal aggregate = 0.0M;
	    decimal weight;
	    int item = 1;
	    // count how many items are not null and use that
	    // as the weighting factor
	    int count = source.Count(val => val.HasValue);
	    foreach (var nullable in source)
	    {
	        if (nullable.HasValue)
	        {
	            weight = item / count;
	            aggregate += nullable.GetValueOrDefault() * weight;
	            count++;
	        }
	    }
	    if (count > 0)
	    {
	        return new decimal?(aggregate / count);
	    }
	    return null;
	}

The extension methods that support much of the LINQ functionality are on the System.Linq.Extensions class, including an Average method. The Average method has most of the numeric types but did not provide an overload for short (Int16). That’s easily rectified by adding them ourselves for short and Nullable<short>:

	#region Extend Average
	public static double? Average(this IEnumerable<short?> source)
	{
	    if (source == null)
	        throw new ArgumentNullException("source");

	    double aggregate = 0.0;
	    int count = 0;
	    foreach (var nullable in source)
	    {
	        if (nullable.HasValue)
	        {
	            aggregate += nullable.GetValueOrDefault();
	            count++;
	        }
	    }
	    if (count > 0)
	    {
	        return new double?(aggregate / count);
	    }
	    return null;
	}
	public static double Average(this IEnumerable<short> source)
	{
	    if.(source == null)
	        throw new ArgumentNullException("source");

	    double aggregate = 0.0;
	    // use the count of the items from the source
	    int count = source.Count();
	    foreach (var value in source)
	    {
	        aggregate += value;
	    }
	    if (count > 0)
	    {
	        return aggregate / count;
	    }
	    else
	        return 0.0; 
	}
	public static double? Average<TSource>(this IEnumerable<TSource> source,
	                                      Func<TSource, short?> selector)
	{
	    return source.Select<TSource, short?>(selector).Average();
	}
	public static double Average<TSource>(this IEnumerable<TSource> source,
	                                     Func<TSource, short> selector)
	{
	    return source.Select<TSource, short>(selector).Average();
	}
	#endregion // Extend Average

We can then call Average on short-based collections just like WeightedMovingAverage:

	short[] sprices = new short[10] { 13, 17, 92, 0, 15, 19, 9, 6, 2, 14 };
	Console.WriteLine(sprices.WeightedMovingAverage());
	// System.Linq.Extensions doesn't implement Average for short but we do for them!
	Console.WriteLine(sprices.Average());

See Also

The "Extension methods” topic in the MSDN documentation.

1.6. Query and Join Across Data Repositories

Problem

You have two sets of data from different data domains, and you want to be able to combine the data and work with it.

Solution

Use LINQ to bridge across the disparate data domains. LINQ is intended to be used in the same manner across different data domains and supports combining those sets of data with join syntax.

To demonstrate this, we will join an XML file full of Categories with the data from a database (Northwind) with Products and combine the two to create a new set of data for product information that holds the product name, the category description, and the category name:

	Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);
	ProductsTableAdapter adapter = new ProductsTableAdapter();
	Products products = new Products();
	adapter.Fill(products._Products);

	XElement xmlCategories = XElement.Load("Categories.xml");

	var expr = from product in products._Products
	           where product.Units_In_Stock > 100
	           join xc in xmlCategories.Elements("Category")
	           on product.Category_ID equals int.Parse(xc.Attribute("CategoryID").Value)
	           select new
	           {
	               ProductName = product.Product_Name,
	               Category = xc.Attribute("CategoryName").Value,
	               CategoryDescription = xc.Attribute("Description").Value
	           };

	foreach (var productInfo in expr)
	{
	    Console.WriteLine("ProductName: " + productInfo.ProductName +
	        " Category: " + productInfo.Category + 
	        " Category Description: " + productInfo.CategoryDescription);
	}

The new set of data is printed to the console, but this could easily have been rerouted to another method, transformed in another query, or written out to a third data format:

	ProductName: Grandma's Boysenberry Spread Category: Condiments Category Description:
	Sweet and savory sauces, relishes, spreads, and seasonings
	ProductName: Gustaf's Knäckebröd Category: Grains/Cereals Category Description:
	Breads, crackers, pasta, and cereal
	ProductName: Geitost Category: Dairy Products Category Description: Cheeses
	ProductName: Sasquatch Ale Category: Beverages Category Description: Soft drinks,
	coffees, teas, beer, and ale
	ProductName: Inlagd Sill Category: Seafood Category Description: Seaweed and fish
	ProductName: Boston Crab Meat Category: Seafood Category Description: Seaweed and
	fish
	ProductName: Pâté chinois Category: Meat/Poultry Category Description: Prepared meats
	ProductName: Sirop d'érable Category: Condiments Category Description: Sweet and
	savory sauces, relishes, spreads, and seasonings
	ProductName: Röd Kaviar Category: Seafood Category Description: Seaweed and fish
	ProductName: Rhönbräu Klosterbier Category: Beverages Category Description: Soft
	drinks, coffees, teas, beer, and ale

Discussion

The solution combines data from two different data domains: XML and a SQL Database. To do this before LINQ, you would have to not only create a third data repository by hand to hold the result, but you would also have to write the specific code for each domain to query that domain for its part of the data (XPath for XML; SQL for database) and then manually transform the result sets from each domain into the new data repository. LINQ gives the ability to write the query to combine the two sets of data, automatically constructs a type via projecting a new Anonymous Type, and places the pertinent data in the new type, all in the same syntax. Not only does this simplify the code, but it allows you to concentrate more on getting the data you want and less on exactly how to read both data domains.

This example uses both LINQ to DataSet and LINQ to XML to access the multiple data domains:

	Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);
	ProductsTableAdapter adapter = new ProductsTableAdapter();
	Products products = new Products();
	adapter.Fill(products._Products);

	XElement xmlCategories = XElement.Load("Categories.xml");

Northwind is a DataContext class. A DataContext is analogous to an ADO.NET Connection and Command object rolled into one. You use it to establish your connection, execute queries, or access tables directly via Entity Classes. A DataContext can be generated directly from the database through Visual Studio by adding a new “LINQ to SQL Classes” item or from the command line using the SQLMetal.exe. This provides access to the local Northwind database for the query. A Products DataSet is loaded from the Products table in the Northwind database for use in the query. For more on DataContext, see Recipe 1.6.

The Northwind DataContext can be generated using SQLMetal.exe using the following command line syntax:

	SqlMetal /server:. /database:Northwind /code:Northwind.cs

XElement is one of the main classes in LINQ to XML. It enables the loading of existing XML, creation of new XML, or retrieving of the XML text for the element via ToString. Example 1-1 shows the Categories.xml file that will be loaded. For more on XElement and LINQ to XML, see Chapter 15.

Example 1-1. Categories.xml
<?xml version="1.0" encoding="utf-8"?>
<Categories>
  <Category CategoryID="1" CategoryName="Beverages" Description="Soft drinks, coffees,
teas, beer, and ale" />
  <Category CategoryID="2" CategoryName="Condiments" Description="Sweet and savory sauces,
relishes, spreads, and seasonings" />
  <Category CategoryID="3" CategoryName="Confections" Description="Desserts, candies,
sweetbreads" />
  <Category CategoryID="4" CategoryName="Dairy Products" Description="Cheeses" />
  <Category CategoryID="5" CategoryName="Grains/Cereals" Description="Breads, crackers,
pasta, and cereal" />
  <Category CategoryID="6" CategoryName="Meat/Poultry" Description="Prepared meats" />
  <Category CategoryID="7" CategoryName="Produce" Description="Dried fruit and bean curd"
/>
  <Category CategoryID="8" CategoryName="Seafood" Description="Seaweed and fish" />
</Categories>

The two sets of data are joined using LINQ and, in particular, the join keyword. The data is joined by matching the category id in the products table with the category id in the xml file to combine the data. In SQL terms, the join keyword represents an inner join:

	var expr = from product in products._Products
	           where product.Units_In_Stock > 100
	           join xc in xmlCategories.Elements("Category")
	           on product.Category_ID equals int.Parse(xc.Attribute("CategoryID").Value)

Once the join result is complete, a new type is projected using the select keyword:

	select new
	{
	    ProductName = product.Product_Name,
	    Category = xc.Attribute("CategoryName").Value,
	    CategoryDescription = xc.Attribute("Description").Value
	};

This allows us to combine different data elements from the two sets of data to make a third set that can look completely different than either of the original two.

Doing joins on two sets of database data would be a bad idea, as the database can do this much faster for those sets, but when you need to join disparate data sets, LINQ can lend a helping hand.

See Also

The "join keyword,” “System.Data.Linq.DataContext,” and “XElement” topics in the MSDN documentation.

1.7. Querying Configuration Files with LINQ

Problem

Sets of data can be stored in many different locations, such as configuration files. You want to be able to query your configuration files for sets of information.

Solution

Use LINQ to query against the configuration sections. In the example below, this is done by retrieving all chapter titles with even numbers and the word “and” in the title from the custom configuration section containing chapter information:

	CSharpRecipesConfigurationSection recipeConfig =
	    ConfigurationManager.GetSection("CSharpRecipesConfiguration") as
	CSharpRecipesConfigurationSection;

	var expr = from ChapterConfigurationElement chapter in
	               recipeConfig.Chapters.OfType<ChapterConfigurationElement>()
	           where (chapter.Title.Contains("and")) && ((int.Parse(chapter.Number) % 2)
	== 0)
	           select new
	           {
	               ChapterNumber = "Chapter " + chapter.Number,
	               chapter.Title
	           };

	foreach (var chapterInfo in expr)
	{
	    Console.WriteLine(chapterInfo.ChapterNumber + ": " + chapterInfo.Title);
	}

The configuration section being queried looks like this:

	<CSharpRecipesConfiguration CurrentEdition="3">
	  <Chapters>
	    <add Number="1" Title="Language Integrated Query (LINQ)"/>
	    <add Number="2" Title="Strings and Characters"/>
	    <add Number="3" Title="Classes and Structures"/>
	    <add Number="4" Title="Generics"/>
	    <add Number="5" Title="Collections"/>
	    <add Number="6" Title="Iterators, Partial Types and Partial Methods"/>
	    <add Number="7" Title="Exception Handling"/>
	    <add Number="8" Title="Diagnostics"/>
	    <add Number="9" Title="Delegates, Events, and Functional Programming"/>
	    <add Number="10" Title="Regular Expressions"/>
	    <add Number="11" Title="Data Structures &amp; Algorithms"/>
	    <add Number="12" Title="Filesystem I/O"/>
	    <add Number="13" Title="Reflection"/>
	    <add Number="14" Title="Web"/>
	    <add Number="15" Title="XML"/>
	    <add Number="16" Title="Networking"/>
	    <add Number="17" Title="Security"/>
	    <add Number="18" Title="Threading and Synchronization"/>
	    <add Number="19" Title="Toolbox"/>
	    <add Number="20" Title="Numbers &amp; Enumerations"/>
	  </Chapters>
	  <Editions>
	    <add Number="1" PublicationYear="2004"/>
	    <add Number="2" PublicationYear="2006"/>
	    <add Number="3" PublicationYear="2007"/>
	  </Editions>
	</CSharpRecipesConfiguration>

The output from the query is:

	Chapter 2: Strings and Characters
	Chapter 6: Iterators, Partial Types and Partial Methods
	Chapter 18: Threading and Synchronization

Discussion

Configuration files in .NET play a significant role in achieving manageability and ease of deployment for .NET-based applications. It can be challenging to get all of the various settings right in the hierarchy of configuration files that can affect an application, so understanding how to write utilities to programmatically check configuration file settings is of great use during development, testing, deployment, and ongoing management of an application.

Tip

To access the configuration types, you will need to reference the System.Configuration assembly.

Even though the ConfigurationElementCollection class (the base of sets of data inconfiguration files) only supports IEnumerable and not IEnumerable<T>, we can stilluse it to get the elements we need by using the OfType<ChapterConfigurationElement> method on the collection, which selects elements of that type from the collection:

	var expr = from ChapterConfigurationElement chapter in
	               recipeConfig.Chapters.OfType<ChapterConfigurationElement>()

ChapterConfigurationElement is a custom configuration section class that holds the chapter number and title:

	/// <summary>
	/// Holds the information about a chapter in the configuration file
	/// </summary>
	public class ChapterConfigurationElement : ConfigurationElement
	{
	    /// <summary>
	    /// Default constructor
	    /// </summary>
	    public ChapterConfigurationElement()
	    {
	    }

	    /// <summary>
	    /// The number of the Chapter
	    /// </summary>
	    [ConfigurationProperty("Number", IsRequired=true)]
	    public string Number
	    {
	        get { return (string)this["Number"]; }
	        set { this["Number"] = value; }
	    }

	    /// <summary>
	    /// The title of the Chapter
	    /// </summary>
	    [ConfigurationProperty("Title", IsRequired=true)]
	    public string Title
	    {
	        get { return (string)this["Title"]; }
	        set { this["Title"] = value; }
	    }
	}

This technique can be used on the standard configuration files such as machine.config as well. This example determines which sections in machine.config require access permissions. For this collection, OfType<ConfigurationSection> is used, as this is a standard section:

	System.Configuration.Configuration machineConfig =
	    ConfigurationManager.OpenMachineConfiguration();

	var query = from ConfigurationSection section in machineConfig.Sections.
	OfType<ConfigurationSection>()
	            where section.SectionInformation.RequirePermission
	            select section;

	foreach (ConfigurationSection section in query)
	{
	    Console.WriteLine(section.SectionInformation.Name);
	}

The sections detected will look something like this:

	system.data
	windows
	system.webServer
	mscorlib
	system.data.oledb
	system.data.oracleclient
	system.data.sqlclient
	configProtectedData
	satelliteassemblies
	system.data.dataset
	startup
	system.data.odbc
	system.diagnostics
	runtime
	system.codedom
	system.runtime.remoting
	assemblyBinding
	system.windows.forms

See Also

The “Enumerable.OfType, method,” "ConfigurationSectionCollection,class” and “ConfigurationElementCollection class” topics in the MSDN documentation.

1.8. Creating XML Straight from a Database

Problem

You want to be able to take a set of data from a database and represent it as XML.

Solution

Use LINQ to SQL and LINQ to XML to retrieve and transform the data all in one query. In this case, we will select the top five customers in the Northwind database whose contact is the owner and those owners who placed orders totaling more than $10,000, then create XML containing the company name, contact name, phone number, and total amount of the orders. Finally, the results are written out to the BigSpenders.xml file:

	Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);
	// Log the generated SQL to the console
	dataContext.Log = Console.Out;

	var bigSpenders = new XElement("BigSpenders",
	            from top5 in
	            (
	               from customer in
	               (
	                   from c in dataContext.Customers
	                   // get the customers where the contact is the owner
	                   // and they placed orders
	                   where c.ContactTitle.Contains("Owner")
	                      && c.Orders.Count > 0
	                   join orderData in
	                     (
	                         from c in dataContext.Customers
	                         // get the customers where the contact is the owner
	                         // and they placed orders
	                         where c.ContactTitle.Contains("Owner")
	                            && c.Orders.Count > 0
	                         from o in c.Orders
	                         // get the order details
	                         join od in dataContext.OrderDetails
	                             on o.OrderID equals od.OrderID
	                         select new
	                         {
	                             c.CompanyName,
	                             c.CustomerID,
	                             o.OrderID,
	                             // have to calc order value from orderdetails
	                             //(UnitPrice*Quantity as Total)- (Total*Discount)
	                             // as NetOrderTotal
	                             NetOrderTotal = (
	                                 (((double)od.UnitPrice) * od.Quantity)
	                                 (((double)od.UnitPrice) * od.Quantity) * od.Discount))
	                         }
	                     )
	                   on c.CustomerID equals orderData.CustomerID
	                   into customerOrders
	                   select new
	                   {
	                       c.CompanyName,
	                       c.ContactName,
	                       c.Phone,
	                       // Get the total amount spent by the customer
	                       TotalSpend = customerOrders.Sum(order => order.NetOrderTotal)
	                   }
	               )
	               // place focus on the customers that spent > 10000
	               where customer.TotalSpend > 10000
	               orderby customer.TotalSpend descending
	               // only take the top five spenders
	               select customer).Take(5)
	            )
	            // format the data as XML
	            select new XElement("Customer",
	                       new XAttribute("companyName", top5.CompanyName),
	                       new XAttribute("contactName", top5.ContactName),
	                       new XAttribute("phoneNumber", top5.Phone),
	                       new XAttribute("amountSpent", top5.TotalSpend)));

	using (XmlWriter writer = XmlWriter.Create("BigSpenders.xml"))
	{
	    bigSpenders.WriteTo(writer);
	}

Tip

When building larger queries, you may find it is sometimes easier to use the functional approach (.Join()) to building up the query instead of the query expression manner (join x on y equals z) if you have done more C# than SQL.

Discussion

LINQ to SQL is the part of LINQ to ADO.NET that facilitates rapid database development. It is targeted at the scenarios where you want to program almost directly against the database schema. Most of these scenarios have one-to-one correlations between strongly typed classes and database tables. If you are in more of an enterprise development scenario with lots of stored procedures and databases that have moved away from “one table equals one entity” scenarios, you would want to look into LINQ to Entities.

To use LINQ to SQL, there are two design tools to help you get started, the visual designer for LINQ to SQL in Visual Studio 2008 and the command line utility SqlMetal.exe in the SDK. You can access the visual designer by adding a new or opening an existing “LINQ to SQL Classes” item (*.dbml file) to the project, which opens the designer. Both of these help you to build out the DataContext and Entity Classes for your database that can then be used with LINQ (or other programming constructs if you wish). A DataContext is analogous to an ADO.NET Connection and Command object rolled into one. You use it to establish your connection, execute queries, or access tables directly via Entity Classes. The Northwind Data Context is a strongly typed instance of a DataContext generated by SqlMetal and is partially shown here:

	public partial class Northwind : System.Data.Linq.DataContext
	{

	        private static System.Data.Linq.Mapping.MappingSource mappingSource = new
	AttributeMappingSource();

	  #region Extensibility Method Definitions
	  /// removed code for extensibility points for clarity
	  #endregion

	        static Northwind()
	        {
	        }

	        public Northwind(string connection) :
	                        base(connection, mappingSource)
	        {
	                OnCreated();
	        }

	        public Northwind(System.Data.IDbConnection connection) :
	                        base(connection, mappingSource)
	        {
	                OnCreated();
	        }

	        public Northwind(string connection, System.Data.Linq.Mapping.MappingSource
mappingSource) :
	                        base(connection, mappingSource)
	        {
	                OnCreated();
	        }

	        public Northwind(System.Data.IDbConnection connection, System.Data.Linq.
Mapping.MappingSource mappingSource) :
	                        base(connection, mappingSource)
	        {
	                OnCreated();
	        }

	        public System.Data.Linq.Table<Customers> Customers
	        {
	                get
	                {
	                        return this.GetTable<Customers>();
	                }
	        }
	   // More Table<EntityClass> definitions, one for each table in the database
	}

The Entity Class definitions for the Northwind database are all present in the generated code as well, with each table having an Entity Class defined for it. The Entity Classes are indicated by the Table attribute with no parameters. This means that the name of the Entity Class matches the table name:

	[Table()]
	public partial class Customers : INotifyPropertyChanging, INotifyPropertyChanged
	{

	    #region Extensibility Method Definitions
	    partial void OnLoaded();
	    partial void OnValidate();
	    partial void OnCreated();
	    partial void OnCustomerIDChanging(string value);
	    partial void OnCustomerIDChanged();
	    // more extensibility methods to indicate change states for each property...
	    #endregion

	        public event PropertyChangingEventHandler PropertyChanging;

	        public event PropertyChangedEventHandler PropertyChanged;

The standard property change notifications are implemented via INotifyPropertyChanging and INotifyPropertyChanged and have PropertyChanging and PropertyChanged events for conveying the change to a property. There are also a set of partial methods that will report when a specific property is modified on this Entity Class if the partial method is implemented in another partial class definition for the Entity Class. If no other partial class definition is found, the compiler will remove those notifications. Partial methods enable the declaration of a method signature in one file of a partial class declaration and the implementation of the method in another. If the signature is found but the implementation is not, the signature is removed by the compiler.

The properties in the Entity Class match up to the columns in the database via the Column attribute, where the Name value is the database column name and the Storage value is the internal storage for the class of the data. Events for the property changes are wired into the setter for the property:

	[Column(Name="Company Name", Storage="_CompanyName", DbType="NVarChar(40) NOT
NULL", CanBeNull=false)]
	public string CompanyName
	{
	        get
	        {
	                return this._CompanyName;
	        }
	        set
	        {
	                if ((this._CompanyName != value))
	                {
	                        this.OnCompanyNameChanging(value);
	                        this.SendPropertyChanging();
	                        this._CompanyName = value;
	                        this.SendPropertyChanged("CompanyName");
	                        this.OnCompanyNameChanged();
	                }
	        }
	}

For a one-to-many child relationship, an EntitySet<T> of the child Entity Class is declared with an Association attribute. The Association attribute specifies the relationship information between the parent and child Entity Classes, as shown here for the Orders property on Customer:

	[Association(Name="Orders_FK00", Storage="_Orders", ThisKey="CustomerID",
OtherKey="CustomerID", DeleteRule="NO ACTION")]
	public EntitySet<Orders> Orders
	{
	        get
	        {
	                return this._Orders;
	        }
	        set
	        {
	                this._Orders.Assign(value);
	        }
	}

LINQ to SQL covers much more than what has been shown here; we encourage you to investigate it more, but let’s now see the other data domain we are dealing with: LINQ to XML.

LINQ to XML is not only how you perform queries against XML; it is a more developer-friendly way to work with XML. One of the main classes in LINQ to XML is XElement.XElement allows you to create XML in a manner that more closely resembles the structure of the XML itself. This may not seem like a big deal, but when you can see the XML taking shape in your code, it makes it easier to know where you are. (Ever forget which XmlWriter.WriteEndElement you were on? We have!) You can get more details and examples about using XElement in Chapter 15, so we won’t go much further into it here, but as you can see, it is very easy to build up XML in a query.

The first part of the query deals with setting up the main XML element "BigSpenders“, getting the initial set of customers where the contact is the owner:

	var bigSpenders = new XElement("BigSpenders",
	            from top5 in
	            (
	                (from customer in
	                     (
	                         from c in dataContext.Customers
	                         // get the customers where the contact is the owner
	                         // and they placed orders
	                         where c.ContactTitle.Contains("Owner")
	                            && c.Orders.Count > 0

The middle of the query deals with joining the order and order detail information with the customer information to get the NetOrderTotal for the order. It also creates order data containing that value and the customer and order ids and the customer name. We need the NetOrderTotal in the last part of the query, so stay tuned!

	    join orderData in
	        (
	            from c in dataContext.Customers
	            // get the customers where the contact is the owner
	            // and they placed orders
	            where c.ContactTitle.Contains("Owner")
	               && c.Orders.Count > 0
	            from o in c.Orders
	            // get the order details
	            join od in dataContext.OrderDetails
	                on o.OrderID equals od.OrderID
	            select new
	            {
	                c.CompanyName,
	                c.CustomerID,
	                o.OrderID,
	                // have to calc order value from orderdetails
	                //(UnitPrice*Quantity as Total)
	                  (Total*Discount)
	                // as NetOrderTotal
	                NetOrderTotal = (
	(((double)od.UnitPrice) * od.Quantity) -
	((((double)od.UnitPrice) * od.Quantity) * od.Discount))
	            }
	        )
	    on c.CustomerID equals orderData.CustomerID
	    into customerOrders

The last part of the query determines the TotalSpend for that customer across all orders using the Sum function on NetOrderTotal for the generated customerOrders collection. The last criteria evaluated is that only the top five customers with a TotalSpend value > 10000 are selected by using the Take function. Take is the equivalent to TOP in SQL. The records are then used to construct one inner Customer element with attributes that nest inside the BigSpenders root element started in the first part of the query:

	             select new
	             {
	                  c.CompanyName,
	                  c.ContactName,
	                  c.Phone,
	                  // Get the total amount spent by the customer
	                  TotalSpend = customerOrders.Sum(order => order. NetOrderTotal)
	             }
	          )
	      // only worry about customers that spent > 10000
	      where customer.TotalSpend > 10000
	      orderby customer.TotalSpend descending
	     // only take the top 5 spenders
	     select customer).Take(5)
	)
	// format the data as XML
	select new XElement("Customer",
	           new XAttribute("companyName", top5.CompanyName),
	           new XAttribute("contactName", top5.ContactName),
	           new XAttribute("phoneNumber", top5.Phone),
	           new XAttribute("amountSpent", top5.TotalSpend)));

Tip

It is much easier to build large-nested queries as individual queries first and then put them together once you are sure the inner query is working.

At this point, for all of the code here, nothing has happened yet. That’s right, until the query is accessed, nothing happens through the magic of deferred execution. LINQ has constructed a query expression, but nothing has talked to the database; there is no XML in memory, nada. Once the WriteTo method is called on the bigSpenders query expression, then the query is evaluated by LINQ to SQL, and the XML is constructed. The WriteTo method writes out the constructed XML to the XmlWriter provided, and we are done:

	using (XmlWriter writer = XmlWriter.Create("BigSpenders.xml"))
	{
	    bigSpenders.WriteTo(writer);
	}

If you are interested in what that SQL will look like, connect the DataContext.Log property to a TextWriter (like the console):

	// Log the generated SQL to the console
	dataContext.Log = Console.Out;

This query generates SQL that looks like this:

	Generated SQL for query - output via DataContext.Log
	SELECT [t10].[CompanyName], [t10].[ContactName], [t10].[Phone], [t10].[TotalSpend]
	FROM (
	    SELECT TOP (5) [t0].[Company Name] AS [CompanyName], [t0].[Contact Name] AS
	[ContactName], [t0].[Phone], [t9].[value] AS [TotalSpend]
	    FROM [Customers] AS [t0]
	    OUTER APPLY (
	        SELECT COUNT(*) AS [value]
	        FROM [Orders] AS [t1]
	        WHERE [t1].[Customer ID] = [t0].[Customer ID]
	        ) AS [t2]
	    OUTER APPLY (
	        SELECT SUM([t8].[value]) AS [value]
	        FROM (
	            SELECT [t3].[Customer ID], [t6].[Order ID],
	                 ([t7].[Unit Price] *
	                 (CONVERT(Decimal(29,4),[t7].[Quantity]))) - ([t7].[Unit Price] *
	                     (CONVERT(Decimal(29,4),[t7].[Quantity])) *
	                         (CONVERT(Decimal(29,4),[t7].[Discount]))) AS [value],
	                 [t7].[Order ID] AS [Order ID2], 
	                 [t3].[Contact Title] AS [ContactTitle],
	                 [t5].[value] AS [value2],
	                 [t6].[Customer ID] AS [CustomerID]
	        FROM [Customers] AS [t3]
	        OUTER APPLY (
	            SELECT COUNT(*) AS [value]
	            FROM [Orders] AS [t4]
	            WHERE [t4].[Customer ID] = [t3].[Customer ID]
	            ) AS [t5]
	        CROSS JOIN [Orders] AS [t6]
	        CROSS JOIN [Order Details] AS [t7]
	        ) AS [t8]
	    WHERE ([t0].[Customer ID] = [t8].[Customer ID]) AND ([t8].[Order ID] = [
t8].[Order ID2]) AND ([t8].[ContactTitle] LIKE @p0) AND ([t8].[value2] > @p1) AN
D ([t8].[CustomerID] = [t8].[Customer ID])
	        ) AS [t9]
	    WHERE ([t9].[value] > @p2) AND ([t0].[Contact Title] LIKE @p3) AND ([t2].[va
	lue] > @p4)
	    ORDER BY [t9].[value] DESC
	    ) AS [t10]
	ORDER BY [t10].[TotalSpend] DESC
	-- @p0: Input String (Size = 0; Prec = 0; Scale = 0) [%Owner%]
	-- @p1: Input Int32 (Size = 0; Prec = 0; Scale = 0) [0]
	-- @p2: Input Decimal (Size = 0; Prec = 29; Scale = 4) [10000]
	-- @p3: Input String (Size = 0; Prec = 0; Scale = 0) [%Owner%]
	-- @p4: Input Int32 (Size = 0; Prec = 0; Scale = 0) [0]
	-- Context: SqlProvider(SqlCE) Model: AttributedMetaModel Build: 3.5.20706.1

The final XML is shown below:

	<BigSpenders>
	  <Customer companyName="Folk och fa HB" contactName="Maria Larsson"
	            phoneNumber="0695-34 67 21" amountSpent="39805.162472039461" />
	  <Customer companyName="White Clover Markets" contactName="Karl Jablonski"
	            phoneNumber="(206) 555-4112" amountSpent="35957.604972146451" />
	  <Customer companyName="Bon app'" contactName="Laurence Lebihan"
	            phoneNumber="91.24.45.40" amountSpent="22311.577472746558" />
	  <Customer companyName="LINO-Delicateses" contactName="Felipe Izquierdo"
	            phoneNumber="(8) 34-56-12" amountSpent="20458.544984650609" />
	  <Customer companyName="Simons bistro" contactName="Jytte Petersen"
	            phoneNumber="31 12 34 56" amountSpent="18978.777493602414" />
	</BigSpenders>

See Also

The “The Three Parts of a LINQ Query,” “DataContext.Log, property,” “DataContext class,” “XElement class,” and "LINQ to SQL” topics in the MSDN documentation.

1.9. Being Selective About Your Query Results

Problem

You want to be able to get a dynamic subset of a query result.

Solution

Use the TakeWhile extension method to retrieve all results until the criteria is matched:

	Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);

	var query =
	    dataContext.Suppliers.GroupJoin(dataContext.Products,
	        s => s.SupplierID, p => p.SupplierID,
	        (s, products) => new
	        {
	            s.CompanyName,
	            s.ContactName,
	            s.Phone,
	            Products = products
	        }).OrderByDescending(supplierData => supplierData.Products.Count())
	            .TakeWhile(supplierData => supplierData.Products.Count() > 3);


	Console.WriteLine("Suppliers that provide more than three products: {0}", query.
	Count());
	foreach (var supplierData in query)
	{
	    Console.WriteLine("    Company Name : {0}",supplierData.CompanyName);
	    Console.WriteLine("    Contact Name : {0}", supplierData.ContactName);
	    Console.WriteLine("    Contact Phone : {0}", supplierData.Phone);
	    Console.WriteLine("    Products Supplied : {0}", supplierData.Products.Count());
	    foreach (var productData in supplierData.Products)
	    {
	        Console.WriteLine("        Product: " + productData.ProductName);
	    }
	}

You can also use the SkipWhile extension method to retrieve all results once the criteria are matched:

	Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);

	var query =
	    dataContext.Suppliers.GroupJoin(dataContext.Products,
	        s => s.SupplierID, p => p.SupplierID,
	        (s, products) => new
	        {
	            s.CompanyName,
	            s.ContactName,
	            s.Phone,
	            Products = products
	        }).OrderByDescending(supplierData => supplierData.Products.Count())
	            .SkipWhile(supplierData =>
	      {
	        return supplierData.Products.Count() > 3;
	      });

	Console.WriteLine("Suppliers that provide three or less products: {0}",
	                 query.Count());
	foreach (var supplierData in query)
	{
	    Console.WriteLine("    Company Name : {0}",supplierData.CompanyName);
	    Console.WriteLine("    Contact Name : {0}", supplierData.ContactName);
	    Console.WriteLine("    Contact Phone : {0}", supplierData.Phone);
	    Console.WriteLine("    Products Supplied : {0}", supplierData.Products.Count());
	    foreach (var productData in supplierData.Products)
	    {
	        Console.WriteLine(" Product: " + productData.ProductName);
	    }
	}

Discussion

In this example using LINQ to SQL, the number of products each supplier provides is determined, and the result set is sorted in descending order by product count:

	var query =
	    dataContext.Suppliers.GroupJoin(dataContext.Products,
	        s => s.SupplierID, p => p.SupplierID,
	        (s, products) => new
	        {
	            s.CompanyName,
	            s.ContactName,
	            s.Phone,
	            Products = products
	        }).OrderByDescending(supplierData => supplierData.Products.Count())

From that result, the supplier data for suppliers is only accepted into the final result set if they provide more than 3 products and the results are displayed. TakeWhile is used with a lambda expression to determine if the product count is greater than 3, and if so, the supplier is accepted into the result set:

	       .TakeWhile(supplierData =>
	{
	  return supplierData.Products.Count() > 3;
	});

If SkipWhile was used instead, all of the suppliers that provide 3 or fewer products would be returned:

	      .SkipWhile(supplierData =>
	{
	  return supplierData.Products.Count() > 3;
	});

Being able to write code-based conditions allows for more flexibility than the regular Take and Skip methods, which are absolute based on record count, but keep in mind that once the condition is hit for either TakeWhile or SkipWhile, you get all records after that, which is why sorting the result set before using these is important.

The query also uses GroupJoin, which is comparable to a SQL LEFT or RIGHT OUTER JOIN, but the result is not flattened. GroupJoin produces a hierarchical result set instead of a tabular one, which is used to get the collection of Products by Supplier in this example:

	dataContext.Suppliers.GroupJoin(dataContext.Products,
	   s => s.SupplierID, p => p.SupplierID,

See Also

The “Enumerable.TakeWhile method,” “Enumerable.SkipWhile method,” and “Enumerable.GroupJoin method” topics in the MSDN documentation.

1.10. Using LINQ with Collections That Don’t Support IEnumerable<T>

Problem

There are a whole bunch of collections that don’t support the generic versions of IEnumerable or ICollection but that do support the original nongeneric versions of the IEnumerable or ICollection interfaces, and you would like to be able to query those collections using LINQ.

Solution

The type cannot be inferred from the original IEnumeration or ICollection interfaces, so it must be provided using either the OfType<T> or Cast<T> extension methods or by specifying the type in the from clause, which inserts a Cast<T> for you. The first example uses Cast<XmlNode> to let LINQ know that the elements in the XmlNodeList returned from XmlDocument.SelectNodes are of type XmlNode. For an example of how to use the OfType<T> extension method, see the Discussion section:

	// Make some XML with some types that you can use with LINQ
	// that don't support IEnumerable<T> directly
	XElement xmlFragment = new XElement("NonGenericLinqableTypes",
	                        new XElement("IEnumerable",
	                            new XElement("System.Collections",
	                                new XElement("ArrayList"),
	                                new XElement("BitArray"),
	                                new XElement("Hashtable"),
	                                new XElement("Queue"),
	                                new XElement("SortedList"),
	                                new XElement("Stack")),*
	                            new XElement("System.Net",
	                                new XElement("CredentialCache")),
	                            new XElement("System.Xml",
	                                new XElement("XmlNodeList")),
	                            new XElement("System.Xml.XPath",
	                                new XElement("XPathNodeIterator"))),
	                        new XElement("ICollection",
	                            new XElement("System.Diagnostics",
	                                new XElement("EventLogEntryCollection")),
	                            new XElement("System.Net",
	                                new XElement("CookieCollection")),
	                            new XElement("System.Security.AccessControl",
	                                new XElement("GenericAcl")),
	                            new XElement("System.Security",
	                                new XElement("PermissionSet"))));

	XmlDocument doc = new XmlDocument();
	doc.LoadXml(xmlFragment.ToString());

	// Select the names of the nodes under IEnumerable that have children and are
	// named System.Collections and contain a capital S and return that list in
	descending order
	var query = from node in doc.SelectNodes("/NonGenericLinqableTypes/IEnumerable/*").
	Cast<XmlNode>()
	            where node.HasChildNodes &&
	                node.Name == "System.Collections"
	            from XmlNode xmlNode in node.ChildNodes
	            where xmlNode.Name.Contains('S')
	            orderby xmlNode.Name descending
	            select xmlNode.Name;

	foreach (string name in query)
	{
	    Console.WriteLine(name);
	}

The second example works against the Application event log and retrieves the errors that occurred in the last 6 hours. The type of the element in the collection (EventLogEntry) is provided next to the from keyword, which allows LINQ to infer the rest of the information it needs about the collection element type:

	EventLog log = new EventLog("Application");
	var query = from EventLogEntry entry in log.Entries
	            where entry.EntryType == EventLogEntryType.Error &&
	                  entry.TimeGenerated > DateTime.Now.Subtract(new TimeSpan(6, 0, 0))
	            select entry.Message;

	Console.WriteLine("There were " + query.Count<string>() +
	    " Application Event Log error messages in the last 6 hours!");
	foreach (string message in query)
	{
	    Console.WriteLine(message);
	}

Discussion

Cast<T> will transform the IEnumerable into IEnumerable<T> so that LINQ can access each of the items in the collection in a strongly typed manner. Before using Cast<T>, it would behoove you to check that all elements of the collection really are of type T, or you will get an InvalidCastException if the type of the element is not convertible to the type T specified, because all elements will be cast using the type. Placing the type of the element next to the from keyword acts just like a Cast<T>:

	ArrayList stuff = new ArrayList();
	stuff.Add(DateTime.Now);
	stuff.Add(DateTime.Now);
	stuff.Add(1);
	stuff.Add(DateTime.Now);

	var expr = from item in stuff.Cast<DateTime>()
	           select item;
	// attempting to cast the third element throws InvalidCastException
	foreach (DateTime item in expr)
	{
	    Console.WriteLine(item);
	}

Tip

Note that again because of the deferred execution semantics that the exception that occurs with Cast<T> or from only happens once that element has been iterated to.

Another way to approach this issue would be to use OfType<T>, as it will only return the elements of a specific type and not try to cast elements from one type to another:

	var expr = from item in stuff.OfType<DateTime>()
	           select item;
	// only three elements, all DateTime returned. No exceptions
	foreach (DateTime item in expr)
	{
	    Console.WriteLine(item);
	}

See Also

The “OfType<TResult> method” and “Cast<TResult> method” topics in the MSDN documentation.

Get C# 3.0 Cookbook, 3rd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.