Chapter 1. Language Integrated Query (LINQ)
1.0. Introduction
Language Integrated Query (LINQ) is a new way to access data from many different sources. LINQ provides a single querying model that can operate against different data domains individually or all together in a single query. LINQ brings the ability to query datato .NET languages, and some of the languages have provided extensions to make its use even more intuitive. One of these languages is C#; there are a number of extensions to the language in C# 3.0 that help to facilitate querying in a rich and intuitive manner.
Traditional object-oriented programming is based on an imperative style wherein the developer describes in detail not only what they want to happen, but also describes a majority of the detail regarding exactly how this should be performed through code. LINQ helps to take coding down a more declarative path that facilitates describing what the developer wants to do instead of describing how to accomplish the goal in detail. LINQ also enables a more functional style of programming. These changes can dramatically shorten the amount of code it takes to perform some tasks. That said, object-oriented programming is still very much alive and well in C# and .NET, but for the first time the language is offering the chance to choose the style of programming based on your needs. Note, however, that LINQ will not fit into every scenario and is not a replacement for good design or practice. You can write bad code using LINQ just as you can write bad object-oriented or procedural code. The trick, like it always has been, is to figure out when it is appropriate to use which technique.
The initial version of LINQ encompasses a number of data domains as listed here:
There are a number of other “LINQ to” implementations currently under development, but these are Microsoft’s initial offerings. A few of the others in development are LINQ to SharePoint, LINQ to LDAP, and even a LINQ to Amazon implementation. The only one of the initial Microsoft set that won’t be ready immediately with the release of Visual Studio 2008 is LINQ to Entities, or the ADO.NET Entity Framework, as it is also known. LINQ to Entities will be released shortly after Visual Studio 2008.
As you begin your examination of LINQ, it is easy to begin to think of it as a new object relational mapping layer, or some neat new widgets on IEnumerable<T>
, ora new XML API, or even just an excuse to not write SQL directly anymore. You can do any of these things, but we would encourage you to think of LINQ as how your program asks for, calculates, or transforms sets of data from both single and disparate sources. It takes a bit of time and playing with LINQ for its functionality to click, but once it does, you will be surprised at what you can do with it. This chapter begins to show some of what is possible with LINQ and will hopefully start you down the path toward thinking of which of your scenarios are applicable to this new capability in C#.
1.1. Query a Message Queue
Problem
You want to be able to query for messages with specific criteria from an existing message queue.
Solution
Write a query using LINQ to retrieve messages using the System.Messaging.MessageQueue
type:
// open an existing message queue string queuePath = @".\private$\LINQMQ"; MessageQueue messageQueue = new MessageQueue(queuePath); BinaryMessageFormatter messageFormatter = new BinaryMessageFormatter(); var query = from Message msg in messageQueue // The first assignment to msg.Formatter is so that we can touch the // Message object. It assigns the BinaryMessageFormatter to each message // instance so that it can be read to determine if it matches the criteria. // Next, a check is performed that the formatter was correctly assigned // by performing an equality check, which satisfies the Where clause's need // for a boolean result while still executing the assignment of the formatter. where ((msg.Formatter = messageFormatter) == messageFormatter) && int.Parse(msg.Label) > 5 && msg.Body.ToString().Contains('D') orderby msg.Body.ToString() descending select msg; // Check our results for messages with a label > 5 and containing a 'D' in the name foreach (var msg in query) { Console.WriteLine("Label: " + msg.Label + " Body: " + msg.Body); }
The query retrieves the data from the MessageQueue
by selecting the messages where the Label is a number greater than 5 and the message body contains a capital letter “D”. These messages are then returned sorted by the message body in descending order.
Discussion
There are a number of new keywords in this code using LINQ that were not previously used to access a message queue:
var
Instructs the compiler to infer the type of the variable from the right side of the statement. In essence, the type of the variable is determined by what is on the right side of the operator separating the
var
keyword and the expression. This allows for implicitly typed local variables.from
The
from
keyword sets out the source collection to query against and a range variable to represent a single element from that collection. It is always the first clause in a query operation. This may seem counterintuitive if you are used to SQL and expect select to be first, but if you consider that first we need what to work on before we determine what to return, it makes sense. If we weren’t used to how SQL does this already, it would be SQL that seems counterintuitive.where
The
where
keyword specifies the constraints by which the elements to return are filtered. Each condition must evaluate to a Boolean result, and when all expressions evaluate to true, the element of the collection is allowed to be selected.orderby
This keyword indicates that the result set should be sorted according to the criteria specified. The default order is ascending, and elements use the default comparer.
select
Allows the projection of an entire element from the collection, the construction of a new type with parts of that element and other calculated values, or a sub-collection of items into the result.
The messageQueue
collection is of type System.Messaging.MessageQueue
, which implements the IEnumerable
interface. This is important, as the LINQ methods provided need a set or collection to implement at least IEnumerable
for it to work with that set or collection. It would be possible to implement a set of extension methods that did not need IEnumerable, but most people will not have the need to. It is even better when the set or collection implements IEnumerable<T>
, as LINQ then knows the type of element in the set or collection that it is working with, but in this case, MessageQueue
has been in the framework for a while and isn’t likely to change, so the query provides the element type Message
, as shown in the “from” line:
var query = from Message
msg in messageQueue
For more about this, see Recipe 1.1.
In the Solution, the messages in the queue have been sent with the use of the BinaryFormatter
. To be able to query against them correctly, the Formatter
property must be set on each Message
before it is examined as part of the where
clause:
// The first assignment to msg.Formatter is so that we can touch the // Message object. It assigns the BinaryMessageFormatter to each message // instance so that it can be read to determine if it matches the criteria. // This is done, and then it checks that the formatter was correctly assigned // by performing an equality check, which satisfies the Where clause's need // for a boolean result, while still executing the assignment of the formatter. where ((msg.Formatter = messageFormatter) == messageFormatter) &&
There are two uses of the var
keyword in the solution code:
var query = from Message msg in messageQueue ... foreach (var msg in query) ...
The first usage infers that an IEnumerable<Message>
will be returned and assigned to the query
variable. The second usage infers that the type of msg
is Message
because the query
variable is of type IEnumerablew<Message>
and the msg
variable is an element from that IEnumerable
.
It is also worth noting that when performing operations in a query, actual C# code can be used to determine the conditions, and there is more than just the predetermined set of operators. In the where
clause of this query, both int.Parse
and string
. Contains
are used to help filter messages:
int.Parse(msg.Label) > 5 && msg.Body.ToString().Contains('D')
See Also
Recipe 1.9, and the “MessageQueue class,” “Implicitly typed local variable,” “from keyword,” “where keyword,” “orderby keyword,” and “select keyword” topics in the MSDN documentation.
1.2. Using Set Semantics with Data
Problem
You would like to work with your collections using set operations for union, intersections, exceptions, and distinct items.
Solution
Use the Set operators provided as part of the Standard Query Operators to perform those operations.
- Distinct:
IEnumerable<string> whoLoggedIn = dailySecurityLog.Where(logEntry => logEntry.Contains("logged in")).Distinct( );
- Union:
// Union Console.WriteLine("Employees for all projects"); var allProjectEmployees = project1.Union(project2.Union(project3));
- Intersection:
// Intersect Console.WriteLine("Employees on every project"); var everyProjectEmployees = project1.Intersect(project2.Intersect(project3));
- Exception:
Console.WriteLine("Employees on only one project"); var onlyProjectEmployees = allProjectEmployees.Except(unionIntersect);
Discussion
The Standard Query Operators are the set of methods that represent the LINQ pattern. This set includes operators to perform many different types of operations, such as filtering, projection, sorting, grouping, and many others, including set operations.
The set operations for the Standard Query Operators are:
Distinct
Union
Intersect
Except
The Distinct operator extracts all nonduplicate items from the collection or result set being worked with. Say, for example, that we had a set of strings representing login and logout behavior for a terminal services box for today:
// Distinct string[] dailySecurityLog = { "Bob logged in", "Bob logged out", "Bob logged in", "Bill logged in", "Melissa logged in", "Bob logged out", "Bill logged out", "Bill logged in", "Tim logged in", "Scott logged in", "Scott logged out", "Dave logged in", "Tim logged out", "Bob logged in", "Dave logged out"};
From that collection, we would like to determine the list of people who logged in to the box today. Since people can log in and log out many times during the course of a day or remain logged in for the whole day, we need to eliminate the duplicate login entries. Distinct
is an extension method on the System.Linq.Enumerable
class (which implements the Standard Query Operators) that can be called on the string array (which supports IEnumerable)
in order to get the distinct set of items from the set. For more information on extension methods, see Recipe 1.4. The set is produced by using another of the Standard Query Operators: Where. Where
takes a lambda expression that determines the filter criteria for the set and examines each string in the IEnumerable<string>
to determine if the string has “logged in.” Lambda expressions are inline statements (similar to anonymous methods) that can be used in place of a delegate. See Chapter 9 for more on lambda expressions. If the strings do, then they are selected. Distinct
narrows down the set of strings further to eliminate duplicate “logged in” records, leaving only one per user:
IEnumerable<string> whoLoggedIn = dailySecurityLog.Where(logEntry => logEntry.Contains("logged in")).Distinct( ); Console.WriteLine("Everyone who logged in today:"); foreach (string who in whoLoggedIn) { Console.WriteLine(who); }
To make things a bit more interesting, for the rest of the operators, we will work with sets of employees on various projects in a company. An Employee
is a pretty simple class with a Name
and overrides for ToString, Equals
, and GetHashCode
, as shown here:
public class Employee { public string Name { get; set; } public override string ToString() { return this.Name; } public override bool Equals(object obj) { return this.GetHashCode().Equals(obj.GetHashCode()); } public override int GetHashCode() { return this.Name.GetHashCode(); } }
You might wonder why Equals
and GetHashCode
are overloaded for such a simple class. The reason is that when LINQ performs comparisons of elements in the sets or collections, it uses the default comparison, which in turn uses Equals
and GetHashCode
to determine if one instance of a reference type is the same as another. If you do not provide the semantics in the reference type class to provide the same hash code or equals value when the data for two instances of the object is the same, then the instances will, by default, be different, as two reference types have different hash codes by default. We override that so that if the Name
is the same for each Employee
, the hash code and the equals will both correctly identify the instances as the same. There are also overloads for the set operators that take a custom comparer, which would also allow you to make this determination even for classes for which you can’t make the changes to Equals
and GetHashCode
.
Having done this, we can now assign Employees
to projects like so:
Employee[] project1 = { new Employee(){ Name = "Bob" }, new Employee(){ Name = "Bill" }, new Employee(){ Name = "Melissa" }, new Employee(){ Name = "Shawn" } }; Employee[] project2 = { new Employee(){ Name = "Shawn" }, new Employee(){ Name = "Tim" }, new Employee(){ Name = "Scott" } }; Employee[] project3 = { new Employee(){ Name = "Bob" }, new Employee(){ Name = "Dave" }, new Employee(){ Name = "Tim" }, new Employee(){ Name = "Shawn" } };
To find all employees on all projects, use Unionl
to get all nonduplicate Employees
in all three projects and write them out:
// Union Console.WriteLine("Employees for all projects:"); var allProjectEmployees = project1.Union(project2.Union(project3)); foreach (Employee employee in allProjectEmployees) { Console.WriteLine(employee); }
We can then use Intersect
to get the Employees
on every project:
// Intersect Console.WriteLine("Employees on every project:"); var everyProjectEmployees = project1.Intersect(project2.Intersect(project3)); foreach (Employee employee in everyProjectEmployees) { Console.WriteLine(employee); }
Finally, we can use a combination of Union
and Except
to find Employees
that are only on one project:
// Except var intersect1_3 = project1.Intersect(project3); var intersect1_2 = project1.Intersect(project2); var intersect2_3 = project2.Intersect(project3); var unionIntersect = intersect1_2.Union(intersect1_3).Union(intersect2_3); Console.WriteLine("Employees on only one project:"); var onlyProjectEmployees = allProjectEmployees.Except(unionIntersect); foreach (Employee employee in onlyProjectEmployees) { Console.WriteLine(employee); }
1.3. Reuse Parameterized Queries with LINQ to SQL
Problem
You need to execute the same parameterized query multiple times with different parameter values, but you want to avoid the overhead of parsing the query expression tree to build the parameterized SQL each time the query executes.
Solution
Use the CompiledQuery.Compile
method to build an expression tree that will not have to be parsed each time the query is executed with new parameters:
var GetEmployees =
CompiledQuery.Compile
((Northwind db, string ac, string ttl) =>
from employee in db.Employees
where employee.HomePhone.Contains(ac) &&
employee.Title == ttl
select employee);
Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);
The first time the query executes is when it actually compiles (where GetEmployees
is called the first time in the foreach
loop). Every other iteration in this loop and in the next loop use the compiled version, avoiding the expression tree parsing:
foreach (var employee in GetEmployees(dataContext, "(206)", "Sales Representative")) { Console.WriteLine("{0} {1}", employee.FirstName, employee.LastName); } foreach (var employee in GetEmployees(dataContext, "(71)", "Sales Manager")) { Console.WriteLine("{0} {1}", employee.FirstName, employee.LastName); }
Discussion
We used var
for the query declaration, as it was cleaner, but what var
actually is in this case is:
Func<Northwind, string, string, IQueryable<Employees>>
which is the delegate signature for the lambda expression we created that contains the query. That’s right, all this crazy new query stuff, and we just instantiated a delegate. To be fair, the Func
delegate was brought about in the System
namespace as part of LINQ, so do not dismay, we are still doing cool newstuff!
This illustrates that we are not returning an IEnumerable
or IQueryable
based result set from Compile
, but rather an expression tree. This is the expression tree that represents the potential for a query rather than the query itself. Once we have that tree, LINQ to SQL then has to perform the conversion from the tree to actual SQL that can run against the database. Interestingly enough, if we had put a call to string.Format
in as part of detecting the area code in the home phone number, we would get a NotSupportedException
that informs us that string.Format
can’t be translated to SQL:
where employee.HomePhone.Contains(string.Format("({0})",ac)
) &&
System.NotSupportedException:
Method 'System.String Format(System.String,System.Object)'
has no supported translation to SQL.
This is understandable, as SQL has no concept of .NET Framework methods for performing actions, but it is something to keep in mind as you design your queries that this is a limitation when using LINQ to SQL.
After the first execution, the query is compiled, and for every iteration after that, we do not pay the transformation cost for turning the expression tree into the parameterized SQL.
Compiling your queries is something that should be done for parameterized queries that get a lot of traffic, but if a query is infrequently used, it may not be worth the effort. As always, profile your code to see the areas where this could be useful.
1.4. Sort Results in a Culture-Sensitive Manner
Problem
You want to ensure that when you sort in a query, the sort order is for an applicationspecific culture that may not be the same as the current thread’s current culture.
Solution
Use the overload of the OrderBy
query operator, which accepts a custom comparer in order to specify the culture in which to perform comparisons:
// Create CultureInfo for Danish in Denmark. CultureInfo danish = new CultureInfo("da-DK"); CultureStringComparer comparer = new CultureStringComparer(danish,CompareOptions. None); var query = names.OrderBy
(n => n,comparer
);
Discussion
Handling localization issues such as sorting for a specific culture is a relatively trivial task in .NET if the current culture of the current thread is the culture you want to use. The framework classes that assist in handling culture issues in C# are accessed by including the System.Globalization
namespace. This namespace would be included in order to make the code in the solution run. One example of not using the thread current culture would be in an application that needs to display a sorted list of words in Danish on a version of Windows XP that is set for U.S. English. The current thread in the application may have a CultureInfo
for “en-US” and, by default, the sort order for OrderBy
will use the current culture sort settings. To specify that this list should sort according to Danish rules, a bit of work is necessary in the form of a custom comparer:
CultureStringComparer comparer = new CultureStringComparer(danish,CompareOptions. None);
The comparer variable is an instance of a custom comparer class CultureStringComparer
defined as implementing the IComparer<T>
interface specialized for strings. This class is used to provide the culture settings for the sort order:
public class CultureStringComparer : IComparer<string> { private CultureStringComparer() { } public CultureStringComparer(CultureInfo cultureInfo, CompareOptions options) { if (cultureInfo == null) throw new ArgumentNullException("cultureInfo"); CurrentCultureInfo = cultureInfo; Options = options; } public int Compare(string x, string y) { return CurrentCultureInfo.CompareInfo.Compare(x, y, Options); } public CultureInfo CurrentCultureInfo { get; set; } public CompareOptions Options { get; set; } }
To demonstrate how this could be used, first we compile a list of words to order by. Since the Danish language treats the character “Æ” as an individual letter, sorting it after “Z” in the alphabet, and the English language treats the character “Æ” as a special symbol, sorting it before the letter “A” in the alphabet, this will demonstrate the sort difference:
string[] names = { "Jello", "Apple", "Bar", "Æble", "Forsooth", "Orange", "Zanzibar" };
Now, we can set up the CultureInfos
for both Danish and U.S. English and call OrderBy
with the comparer specific to each culture. This query is not using the query expression syntax, but rather uses the functional style of IEnumerable<string>.OrderBy()
:
// Create CultureInfo for Danish in Denmark. CultureInfo danish = new CultureInfo("da-DK"); // Create CultureInfo for English in the U.S. CultureInfo american = new CultureInfo("en-US"); CultureStringComparer comparer = new CultureStringComparer(danish
, CompareOptions. None); var query =names.OrderBy
(n => n, comparer); Console.WriteLine("Ordered by specific culture : " + comparer.CurrentCultur eInfo. Name); foreach (string name in query) { Console.WriteLine(name); } comparer.CurrentCultureInfo =american
; query =names.OrderBy
(n => n, comparer); Console.WriteLine("Ordered by specific culture : " + comparer.CurrentCultureInfo. Name); foreach (string name in query) { Console.WriteLine(name); }
The output results below show that the word Æble is last in the Danish list and first in the U.S. English list:
Ordered by specific culture : da-DK Apple Bar Forsooth Jello Orange Zanzibar Æble Ordered by specific culture : en-US Æble Apple Bar Forsooth Jello Orange Zanzibar
See Also
The “OrderBy,” “CultureInfo,” and “IComparer<T>” topics in the MSDN documentation.
1.5. Adding Functional Extensions for Use with LINQ
Problem
There are operations you perform on collections frequently that currently reside in utility classes. You would like to be able to have these operations be used on collections in a more seamless manner than having to pass the reference to the collection to the utility class.
Solution
Use extension methods to help achieve a more functional style of programming for your collection operations. For example, to add a weighted moving average calculation operation to numeric collections, implement a set of WeightedMovingAverage
extension methods in a static class and then call them as part of those collections:
decimal[] prices = new decimal[10] { 13.5M, 17.8M, 92.3M, 0.1M, 15.7M, 19.99M, 9.08M, 6.33M, 2.1M, 14.88M }; Console.WriteLine(prices.WeightedMovingAverage
()); double[] dprices = new double[10] { 13.5, 17.8, 92.3, 0.1, 15.7, 19.99, 9.08, 6.33, 2.1, 14.88 }; Console.WriteLine(dprices.WeightedMovingAverage
()); float[] fprices = new float[10] { 13.5F, 17.8F, 92.3F, 0.1F, 15.7F, 19.99F, 9.08F, 6.33F, 2.1F, 14.88F }; Console.WriteLine(fprices.WeightedMovingAverage
()); int[] iprices = new int[10] { 13, 17, 92, 0, 15, 19, 9, 6, 2, 14 }; Console.WriteLine(iprices.WeightedMovingAverage
()); long[] lprices = new long[10] { 13, 17, 92, 0, 15, 19, 9, 6, 2, 14 }; Console.WriteLine(lprices.WeightedMovingAverage
());
To provide WeightedMovingAverage
for the full range of numeric types, methods for both the nullable
and non-nullable
numeric types are provided in the LinqExtensions
class:
public static class LinqExtensions { public static decimal? WeightedMovingAverage(this IEnumerable<decimal?> source) { if (source == null) throw new ArgumentNullException("source"); decimal aggregate = 0.0M; decimal weight; int item = 1; // count how many items are not null and use that // as the weighting factor int count = source.Count(val => val.HasValue); foreach (var nullable in source) { if (nullable.HasValue) { weight = item / count; aggregate += nullable.GetValueOrDefault() * weight; count++; } } if (count > 0) { return new decimal?(aggregate / count); } return null; } // The same method pattern as above is followed for each of the other // types and their nullable counterparts (double / double?, int / int?, etc.) #region Extend Average... }
Discussion
Extension methods allow you to create operations that appear to be part of a collection. They are static methods that can be called as if they were instance methods, allowing you to extend existing types. Extension methods must also be declared in static classes that are not nested. Once a static class is defined with extension methods, the using directive for the namespace of the class makes those extensions available in the source file.
Tip
It is worth noting that if an instance method exists with the same signature as the extension method, the extension method will never be called. Conflicting extension method declarations will resolve to the method in the closest enclosing namespace.
You cannot use extension methods to create:
Properties (get and set methods)
Operators (+, -, = , etc…)
Events
Declaring an extension method is done by specifying the this
keyword in front of the first parameter of a method declaration, and the type of that parameter is the type being extended. For example, in the Nullable<decimal>
version of the WeightedMovingAverage
method, collections that support IEnumerable<decimal?>
(or IEnumerable<Nullable<decimal>>
) are supported:
public static decimal? WeightedMovingAverage(this IEnumerable<decimal?> source) { if (source == null) throw new ArgumentNullException("source"); decimal aggregate = 0.0M; decimal weight; int item = 1; // count how many items are not null and use that // as the weighting factor int count = source.Count(val => val.HasValue); foreach (var nullable in source) { if (nullable.HasValue) { weight = item / count; aggregate += nullable.GetValueOrDefault() * weight; count++; } } if (count > 0) { return new decimal?(aggregate / count); } return null; }
The extension methods that support much of the LINQ functionality are on the System.Linq.Extensions
class, including an Average
method. The Average
method has most of the numeric types but did not provide an overload for short (Int16)
. That’s easily rectified by adding them ourselves for short
and Nullable<short>:
#region Extend Average public static double? Average(this IEnumerable<short?> source) { if (source == null) throw new ArgumentNullException("source"); double aggregate = 0.0; int count = 0; foreach (var nullable in source) { if (nullable.HasValue) { aggregate += nullable.GetValueOrDefault(); count++; } } if (count > 0) { return new double?(aggregate / count); } return null; } public static double Average(this IEnumerable<short> source) { if.(source == null) throw new ArgumentNullException("source"); double aggregate = 0.0; // use the count of the items from the source int count = source.Count(); foreach (var value in source) { aggregate += value; } if (count > 0) { return aggregate / count; } else return 0.0; } public static double? Average<TSource>(this IEnumerable<TSource> source, Func<TSource, short?> selector) { return source.Select<TSource, short?>(selector).Average(); } public static double Average<TSource>(this IEnumerable<TSource> source, Func<TSource, short> selector) { return source.Select<TSource, short>(selector).Average(); } #endregion // Extend Average
We can then call Average
on short-based collections just like WeightedMovingAverage:
short[] sprices = new short[10] { 13, 17, 92, 0, 15, 19, 9, 6, 2, 14 }; Console.WriteLine(sprices.WeightedMovingAverage()); // System.Linq.Extensions doesn't implement Average for short but we do for them! Console.WriteLine(sprices.Average());
1.6. Query and Join Across Data Repositories
Problem
You have two sets of data from different data domains, and you want to be able to combine the data and work with it.
Solution
Use LINQ to bridge across the disparate data domains. LINQ is intended to be used in the same manner across different data domains and supports combining those sets of data with join syntax.
To demonstrate this, we will join an XML file full of Categories with the data from a database (Northwind) with Products and combine the two to create a new set of data for product information that holds the product name, the category description, and the category name:
Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString); ProductsTableAdapter adapter = new ProductsTableAdapter(); Products products = new Products(); adapter.Fill(products._Products); XElement xmlCategories = XElement.Load("Categories.xml"); var expr = from product in products._Products where product.Units_In_Stock > 100 join xc in xmlCategories.Elements("Category") on product.Category_ID equals int.Parse(xc.Attribute("CategoryID").Value) select new { ProductName = product.Product_Name, Category = xc.Attribute("CategoryName").Value, CategoryDescription = xc.Attribute("Description").Value }; foreach (var productInfo in expr) { Console.WriteLine("ProductName: " + productInfo.ProductName + " Category: " + productInfo.Category + " Category Description: " + productInfo.CategoryDescription); }
The new set of data is printed to the console, but this could easily have been rerouted to another method, transformed in another query, or written out to a third data format:
ProductName: Grandma's Boysenberry Spread Category: Condiments Category Description: Sweet and savory sauces, relishes, spreads, and seasonings ProductName: Gustaf's Knäckebröd Category: Grains/Cereals Category Description: Breads, crackers, pasta, and cereal ProductName: Geitost Category: Dairy Products Category Description: Cheeses ProductName: Sasquatch Ale Category: Beverages Category Description: Soft drinks, coffees, teas, beer, and ale ProductName: Inlagd Sill Category: Seafood Category Description: Seaweed and fish ProductName: Boston Crab Meat Category: Seafood Category Description: Seaweed and fish ProductName: Pâté chinois Category: Meat/Poultry Category Description: Prepared meats ProductName: Sirop d'érable Category: Condiments Category Description: Sweet and savory sauces, relishes, spreads, and seasonings ProductName: Röd Kaviar Category: Seafood Category Description: Seaweed and fish ProductName: Rhönbräu Klosterbier Category: Beverages Category Description: Soft drinks, coffees, teas, beer, and ale
Discussion
The solution combines data from two different data domains: XML and a SQL Database. To do this before LINQ, you would have to not only create a third data repository by hand to hold the result, but you would also have to write the specific code for each domain to query that domain for its part of the data (XPath
for XML; SQL for database) and then manually transform the result sets from each domain into the new data repository. LINQ gives the ability to write the query to combine the two sets of data, automatically constructs a type via projecting a new Anonymous Type, and places the pertinent data in the new type, all in the same syntax. Not only does this simplify the code, but it allows you to concentrate more on getting the data you want and less on exactly how to read both data domains.
This example uses both LINQ to DataSet and LINQ to XML to access the multiple data domains:
Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString); ProductsTableAdapter adapter = new ProductsTableAdapter(); Products products = new Products(); adapter.Fill(products._Products); XElement xmlCategories = XElement.Load("Categories.xml");
Northwind
is a DataContext
class. A DataContext
is analogous to an ADO.NET Connection
and Command
object rolled into one. You use it to establish your connection, execute queries, or access tables directly via Entity Classes. A DataContext
can be generated directly from the database through Visual Studio by adding a new “LINQ to SQL Classes” item or from the command line using the SQLMetal.exe
. This provides access to the local Northwind
database for the query. A Products DataSet
is loaded from the Products table in the Northwind
database for use in the query. For more on DataContext
, see Recipe 1.6.
The Northwind DataContext
can be generated using SQLMetal.exe
using the following command line syntax:
SqlMetal /server:. /database:Northwind /code:Northwind.cs
XElement
is one of the main classes in LINQ to XML. It enables the loading of existing XML, creation of new XML, or retrieving of the XML text for the element via ToString
. Example 1-1 shows the Categories.xml
file that will be loaded. For more on XElement
and LINQ to XML, see Chapter 15.
<?xml version="1.0" encoding="utf-8"?> <Categories> <Category CategoryID="1" CategoryName="Beverages" Description="Soft drinks, coffees, teas, beer, and ale" /> <Category CategoryID="2" CategoryName="Condiments" Description="Sweet and savory sauces, relishes, spreads, and seasonings" /> <Category CategoryID="3" CategoryName="Confections" Description="Desserts, candies, sweetbreads" /> <Category CategoryID="4" CategoryName="Dairy Products" Description="Cheeses" /> <Category CategoryID="5" CategoryName="Grains/Cereals" Description="Breads, crackers, pasta, and cereal" /> <Category CategoryID="6" CategoryName="Meat/Poultry" Description="Prepared meats" /> <Category CategoryID="7" CategoryName="Produce" Description="Dried fruit and bean curd" /> <Category CategoryID="8" CategoryName="Seafood" Description="Seaweed and fish" /> </Categories>
The two sets of data are joined using LINQ and, in particular, the join
keyword. The data is joined by matching the category id in the products table with the category id in the xml file to combine the data. In SQL terms, the join
keyword represents an inner join:
var expr = from product in products._Products where product.Units_In_Stock > 100 join xc in xmlCategories.Elements("Category") on product.Category_ID equals int.Parse(xc.Attribute("CategoryID").Value)
Once the join result is complete, a new type is projected using the select
keyword:
select new { ProductName = product.Product_Name, Category = xc.Attribute("CategoryName").Value, CategoryDescription = xc.Attribute("Description").Value };
This allows us to combine different data elements from the two sets of data to make a third set that can look completely different than either of the original two.
Doing joins on two sets of database data would be a bad idea, as the database can do this much faster for those sets, but when you need to join disparate data sets, LINQ can lend a helping hand.
1.7. Querying Configuration Files with LINQ
Problem
Sets of data can be stored in many different locations, such as configuration files. You want to be able to query your configuration files for sets of information.
Solution
Use LINQ to query against the configuration sections. In the example below, this is done by retrieving all chapter titles with even numbers and the word “and” in the title from the custom configuration section containing chapter information:
CSharpRecipesConfigurationSection recipeConfig = ConfigurationManager.GetSection("CSharpRecipesConfiguration") as CSharpRecipesConfigurationSection; var expr = from ChapterConfigurationElement chapter in recipeConfig.Chapters.OfType<ChapterConfigurationElement>() where (chapter.Title.Contains("and")) && ((int.Parse(chapter.Number) % 2) == 0) select new { ChapterNumber = "Chapter " + chapter.Number, chapter.Title }; foreach (var chapterInfo in expr) { Console.WriteLine(chapterInfo.ChapterNumber + ": " + chapterInfo.Title); }
The configuration section being queried looks like this:
<CSharpRecipesConfiguration CurrentEdition="3"> <Chapters> <add Number="1" Title="Language Integrated Query (LINQ)"/> <add Number="2" Title="Strings and Characters"/> <add Number="3" Title="Classes and Structures"/> <add Number="4" Title="Generics"/> <add Number="5" Title="Collections"/> <add Number="6" Title="Iterators, Partial Types and Partial Methods"/> <add Number="7" Title="Exception Handling"/> <add Number="8" Title="Diagnostics"/> <add Number="9" Title="Delegates, Events, and Functional Programming"/> <add Number="10" Title="Regular Expressions"/> <add Number="11" Title="Data Structures & Algorithms"/> <add Number="12" Title="Filesystem I/O"/> <add Number="13" Title="Reflection"/> <add Number="14" Title="Web"/> <add Number="15" Title="XML"/> <add Number="16" Title="Networking"/> <add Number="17" Title="Security"/> <add Number="18" Title="Threading and Synchronization"/> <add Number="19" Title="Toolbox"/> <add Number="20" Title="Numbers & Enumerations"/> </Chapters> <Editions> <add Number="1" PublicationYear="2004"/> <add Number="2" PublicationYear="2006"/> <add Number="3" PublicationYear="2007"/> </Editions> </CSharpRecipesConfiguration>
The output from the query is:
Chapter 2: Strings and Characters Chapter 6: Iterators, Partial Types and Partial Methods Chapter 18: Threading and Synchronization
Discussion
Configuration files in .NET play a significant role in achieving manageability and ease of deployment for .NET-based applications. It can be challenging to get all of the various settings right in the hierarchy of configuration files that can affect an application, so understanding how to write utilities to programmatically check configuration file settings is of great use during development, testing, deployment, and ongoing management of an application.
Even though the ConfigurationElementCollection
class (the base of sets of data inconfiguration files) only supports IEnumerable
and not IEnumerable<T>
, we can stilluse it to get the elements we need by using the OfType<ChapterConfigurationElement>
method on the collection, which selects elements of that type from the collection:
var expr = from ChapterConfigurationElement chapter in recipeConfig.Chapters.OfType<ChapterConfigurationElement>()
ChapterConfigurationElement
is a custom configuration section class that holds the chapter number and title:
/// <summary> /// Holds the information about a chapter in the configuration file /// </summary> public class ChapterConfigurationElement : ConfigurationElement { /// <summary> /// Default constructor /// </summary> public ChapterConfigurationElement() { } /// <summary> /// The number of the Chapter /// </summary> [ConfigurationProperty("Number", IsRequired=true)] public string Number { get { return (string)this["Number"]; } set { this["Number"] = value; } } /// <summary> /// The title of the Chapter /// </summary> [ConfigurationProperty("Title", IsRequired=true)] public string Title { get { return (string)this["Title"]; } set { this["Title"] = value; } } }
This technique can be used on the standard configuration files such as machine.config
as well. This example determines which sections in machine.config
require access permissions. For this collection, OfType<ConfigurationSection>
is used, as this is a standard section:
System.Configuration.Configuration machineConfig = ConfigurationManager.OpenMachineConfiguration(); var query = from ConfigurationSection section in machineConfig.Sections. OfType<ConfigurationSection>() where section.SectionInformation.RequirePermission select section; foreach (ConfigurationSection section in query) { Console.WriteLine(section.SectionInformation.Name); }
The sections detected will look something like this:
system.data windows system.webServer mscorlib system.data.oledb system.data.oracleclient system.data.sqlclient configProtectedData satelliteassemblies system.data.dataset startup system.data.odbc system.diagnostics runtime system.codedom system.runtime.remoting assemblyBinding system.windows.forms
1.8. Creating XML Straight from a Database
Solution
Use LINQ to SQL and LINQ to XML to retrieve and transform the data all in one query. In this case, we will select the top five customers in the Northwind
database whose contact is the owner and those owners who placed orders totaling more than $10,000, then create XML containing the company name, contact name, phone number, and total amount of the orders. Finally, the results are written out to the BigSpenders.xml file:
Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString); // Log the generated SQL to the console dataContext.Log = Console.Out; var bigSpenders = new XElement("BigSpenders", from top5 in ( from customer in ( from c in dataContext.Customers // get the customers where the contact is the owner // and they placed orders where c.ContactTitle.Contains("Owner") && c.Orders.Count > 0 join orderData in ( from c in dataContext.Customers // get the customers where the contact is the owner // and they placed orders where c.ContactTitle.Contains("Owner") && c.Orders.Count > 0 from o in c.Orders // get the order details join od in dataContext.OrderDetails on o.OrderID equals od.OrderID select new { c.CompanyName, c.CustomerID, o.OrderID, // have to calc order value from orderdetails //(UnitPrice*Quantity as Total)- (Total*Discount) // as NetOrderTotal NetOrderTotal = ( (((double)od.UnitPrice) * od.Quantity) (((double)od.UnitPrice) * od.Quantity) * od.Discount)) } ) on c.CustomerID equals orderData.CustomerID into customerOrders select new { c.CompanyName, c.ContactName, c.Phone, // Get the total amount spent by the customer TotalSpend = customerOrders.Sum(order => order.NetOrderTotal) } ) // place focus on the customers that spent > 10000 where customer.TotalSpend > 10000 orderby customer.TotalSpend descending // only take the top five spenders select customer).Take(5) ) // format the data as XML select new XElement("Customer", new XAttribute("companyName", top5.CompanyName), new XAttribute("contactName", top5.ContactName), new XAttribute("phoneNumber", top5.Phone), new XAttribute("amountSpent", top5.TotalSpend))); using (XmlWriter writer = XmlWriter.Create("BigSpenders.xml")) { bigSpenders.WriteTo(writer); }
Discussion
LINQ to SQL is the part of LINQ to ADO.NET that facilitates rapid database development. It is targeted at the scenarios where you want to program almost directly against the database schema. Most of these scenarios have one-to-one correlations between strongly typed classes and database tables. If you are in more of an enterprise development scenario with lots of stored procedures and databases that have moved away from “one table equals one entity” scenarios, you would want to look into LINQ to Entities.
To use LINQ to SQL, there are two design tools to help you get started, the visual designer for LINQ to SQL in Visual Studio 2008 and the command line utility SqlMetal.exe
in the SDK. You can access the visual designer by adding a new or opening an existing “LINQ to SQL Classes” item (*.dbml file) to the project, which opens the designer. Both of these help you to build out the DataContext
and Entity Classes for your database that can then be used with LINQ (or other programming constructs if you wish). A DataContext
is analogous to an ADO.NET Connection
and Command
object rolled into one. You use it to establish your connection, execute queries, or access tables directly via Entity Classes. The Northwind
Data Context is a strongly typed instance of a DataContext
generated by SqlMetal
and is partially shown here:
public partial class Northwind : System.Data.Linq.DataContext { private static System.Data.Linq.Mapping.MappingSource mappingSource = new AttributeMappingSource(); #region Extensibility Method Definitions /// removed code for extensibility points for clarity #endregion static Northwind() { } public Northwind(string connection) : base(connection, mappingSource) { OnCreated(); } public Northwind(System.Data.IDbConnection connection) : base(connection, mappingSource) { OnCreated(); } public Northwind(string connection, System.Data.Linq.Mapping.MappingSource mappingSource) : base(connection, mappingSource) { OnCreated(); } public Northwind(System.Data.IDbConnection connection, System.Data.Linq. Mapping.MappingSource mappingSource) : base(connection, mappingSource) { OnCreated(); } public System.Data.Linq.Table<Customers> Customers { get { return this.GetTable<Customers>(); } } // More Table<EntityClass> definitions, one for each table in the database }
The Entity Class definitions for the Northwind
database are all present in the generated code as well, with each table having an Entity Class defined for it. The Entity Classes are indicated by the Table
attribute with no parameters. This means that the name of the Entity Class matches the table name:
[Table()] public partial class Customers : INotifyPropertyChanging, INotifyPropertyChanged { #region Extensibility Method Definitions partial void OnLoaded(); partial void OnValidate(); partial void OnCreated(); partial void OnCustomerIDChanging(string value); partial void OnCustomerIDChanged(); // more extensibility methods to indicate change states for each property... #endregion public event PropertyChangingEventHandler PropertyChanging; public event PropertyChangedEventHandler PropertyChanged;
The standard property change notifications are implemented via INotifyPropertyChanging
and INotifyPropertyChanged
and have PropertyChanging
and PropertyChanged
events for conveying the change to a property. There are also a set of partial methods that will report when a specific property is modified on this Entity Class if the partial method is implemented in another partial class definition for the Entity Class. If no other partial class definition is found, the compiler will remove those notifications. Partial methods enable the declaration of a method signature in one file of a partial class declaration and the implementation of the method in another. If the signature is found but the implementation is not, the signature is removed by the compiler.
The properties in the Entity Class match up to the columns in the database via the Column
attribute, where the Name
value is the database column name and the Storage
value is the internal storage for the class of the data. Events for the property changes are wired into the setter for the property:
[Column(Name="Company Name", Storage="_CompanyName", DbType="NVarChar(40) NOT NULL", CanBeNull=false)] public string CompanyName { get { return this._CompanyName; } set { if ((this._CompanyName != value)) { this.OnCompanyNameChanging(value); this.SendPropertyChanging(); this._CompanyName = value; this.SendPropertyChanged("CompanyName"); this.OnCompanyNameChanged(); } } }
For a one-to-many child relationship, an EntitySet<T>
of the child Entity Class is declared with an Association
attribute. The Association
attribute specifies the relationship information between the parent and child Entity Classes, as shown here for the Orders
property on Customer:
[Association(Name="Orders_FK00", Storage="_Orders", ThisKey="CustomerID", OtherKey="CustomerID", DeleteRule="NO ACTION")] public EntitySet<Orders> Orders { get { return this._Orders; } set { this._Orders.Assign(value); } }
LINQ to SQL covers much more than what has been shown here; we encourage you to investigate it more, but let’s now see the other data domain we are dealing with: LINQ to XML.
LINQ to XML is not only how you perform queries against XML; it is a more developer-friendly way to work with XML. One of the main classes in LINQ to XML is XElement.XElement
allows you to create XML in a manner that more closely resembles the structure of the XML itself. This may not seem like a big deal, but when you can see the XML taking shape in your code, it makes it easier to know where you are. (Ever forget which XmlWriter.WriteEndElement
you were on? We have!) You can get more details and examples about using XElement
in Chapter 15, so we won’t go much further into it here, but as you can see, it is very easy to build up XML in a query.
The first part of the query deals with setting up the main XML element "BigSpenders
“, getting the initial set of customers where the contact is the owner:
var bigSpenders = new XElement("BigSpenders", from top5 in ( (from customer in ( from c in dataContext.Customers // get the customers where the contact is the owner // and they placed orders where c.ContactTitle.Contains("Owner") && c.Orders.Count > 0
The middle of the query deals with joining the order and order detail information with the customer information to get the NetOrderTotal
for the order. It also creates order data containing that value and the customer and order ids and the customer name. We need the NetOrderTotal
in the last part of the query, so stay tuned!
join orderData in ( from c in dataContext.Customers // get the customers where the contact is the owner // and they placed orders where c.ContactTitle.Contains("Owner") && c.Orders.Count > 0 from o in c.Orders // get the order details join od in dataContext.OrderDetails on o.OrderID equals od.OrderID select new { c.CompanyName, c.CustomerID, o.OrderID, // have to calc order value from orderdetails //(UnitPrice*Quantity as Total) (Total*Discount) // as NetOrderTotalNetOrderTotal = (
(((double)od.UnitPrice) * od.Quantity) -
((((double)od.UnitPrice) * od.Quantity) * od.Discount))
} ) on c.CustomerID equals orderData.CustomerID into customerOrders
The last part of the query determines the TotalSpend
for that customer across all orders using the Sum
function on NetOrderTotal
for the generated customerOrders
collection. The last criteria evaluated is that only the top five customers with a TotalSpend
value > 10000 are selected by using the Take
function. Take
is the equivalent to TOP
in SQL. The records are then used to construct one inner Customer
element with attributes that nest inside the BigSpenders
root element started in the first part of the query:
select new { c.CompanyName, c.ContactName, c.Phone, // Get the total amount spent by the customer TotalSpend = customerOrders.Sum(order => order. NetOrderTotal) } ) // only worry about customers that spent > 10000 where customer.TotalSpend > 10000 orderby customer.TotalSpend descending // only take the top 5 spenders select customer).Take(5) ) // format the data as XML select new XElement("Customer", new XAttribute("companyName", top5.CompanyName), new XAttribute("contactName", top5.ContactName), new XAttribute("phoneNumber", top5.Phone), new XAttribute("amountSpent", top5.TotalSpend)));
Tip
It is much easier to build large-nested queries as individual queries first and then put them together once you are sure the inner query is working.
At this point, for all of the code here, nothing has happened yet. That’s right, until the query is accessed, nothing happens through the magic of deferred execution. LINQ has constructed a query expression, but nothing has talked to the database; there is no XML in memory, nada. Once the WriteTo method is called on the bigSpenders
query expression, then the query is evaluated by LINQ to SQL, and the XML is constructed. The WriteTo
method writes out the constructed XML to the XmlWriter
provided, and we are done:
using (XmlWriter writer = XmlWriter.Create("BigSpenders.xml")) { bigSpenders.WriteTo(writer); }
If you are interested in what that SQL will look like, connect the DataContext.Log
property to a TextWriter
(like the console):
// Log the generated SQL to the console dataContext.Log = Console.Out;
This query generates SQL that looks like this:
Generated SQL for query - output via DataContext.Log SELECT [t10].[CompanyName], [t10].[ContactName], [t10].[Phone], [t10].[TotalSpend] FROM ( SELECT TOP (5) [t0].[Company Name] AS [CompanyName], [t0].[Contact Name] AS [ContactName], [t0].[Phone], [t9].[value] AS [TotalSpend] FROM [Customers] AS [t0] OUTER APPLY ( SELECT COUNT(*) AS [value] FROM [Orders] AS [t1] WHERE [t1].[Customer ID] = [t0].[Customer ID] ) AS [t2] OUTER APPLY ( SELECT SUM([t8].[value]) AS [value] FROM ( SELECT [t3].[Customer ID], [t6].[Order ID], ([t7].[Unit Price] * (CONVERT(Decimal(29,4),[t7].[Quantity]))) - ([t7].[Unit Price] * (CONVERT(Decimal(29,4),[t7].[Quantity])) * (CONVERT(Decimal(29,4),[t7].[Discount]))) AS [value], [t7].[Order ID] AS [Order ID2], [t3].[Contact Title] AS [ContactTitle], [t5].[value] AS [value2], [t6].[Customer ID] AS [CustomerID] FROM [Customers] AS [t3] OUTER APPLY ( SELECT COUNT(*) AS [value] FROM [Orders] AS [t4] WHERE [t4].[Customer ID] = [t3].[Customer ID] ) AS [t5] CROSS JOIN [Orders] AS [t6] CROSS JOIN [Order Details] AS [t7] ) AS [t8] WHERE ([t0].[Customer ID] = [t8].[Customer ID]) AND ([t8].[Order ID] = [ t8].[Order ID2]) AND ([t8].[ContactTitle] LIKE @p0) AND ([t8].[value2] > @p1) AN D ([t8].[CustomerID] = [t8].[Customer ID]) ) AS [t9] WHERE ([t9].[value] > @p2) AND ([t0].[Contact Title] LIKE @p3) AND ([t2].[va lue] > @p4) ORDER BY [t9].[value] DESC ) AS [t10] ORDER BY [t10].[TotalSpend] DESC -- @p0: Input String (Size = 0; Prec = 0; Scale = 0) [%Owner%] -- @p1: Input Int32 (Size = 0; Prec = 0; Scale = 0) [0] -- @p2: Input Decimal (Size = 0; Prec = 29; Scale = 4) [10000] -- @p3: Input String (Size = 0; Prec = 0; Scale = 0) [%Owner%] -- @p4: Input Int32 (Size = 0; Prec = 0; Scale = 0) [0] -- Context: SqlProvider(SqlCE) Model: AttributedMetaModel Build: 3.5.20706.1
<BigSpenders> <Customer companyName="Folk och fa HB" contactName="Maria Larsson" phoneNumber="0695-34 67 21" amountSpent="39805.162472039461" /> <Customer companyName="White Clover Markets" contactName="Karl Jablonski" phoneNumber="(206) 555-4112" amountSpent="35957.604972146451" /> <Customer companyName="Bon app'" contactName="Laurence Lebihan" phoneNumber="91.24.45.40" amountSpent="22311.577472746558" /> <Customer companyName="LINO-Delicateses" contactName="Felipe Izquierdo" phoneNumber="(8) 34-56-12" amountSpent="20458.544984650609" /> <Customer companyName="Simons bistro" contactName="Jytte Petersen" phoneNumber="31 12 34 56" amountSpent="18978.777493602414" /> </BigSpenders>
1.9. Being Selective About Your Query Results
Solution
Use the TakeWhile
extension method to retrieve all results until the criteria is matched:
Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);
var query =
dataContext.Suppliers.GroupJoin(dataContext.Products,
s => s.SupplierID, p => p.SupplierID,
(s, products) => new
{
s.CompanyName,
s.ContactName,
s.Phone,
Products = products
}).OrderByDescending(supplierData => supplierData.Products.Count())
.TakeWhile(supplierData => supplierData.Products.Count() > 3);
Console.WriteLine("Suppliers that provide more than three products: {0}", query.
Count());
foreach (var supplierData in query)
{
Console.WriteLine(" Company Name : {0}",supplierData.CompanyName);
Console.WriteLine(" Contact Name : {0}", supplierData.ContactName);
Console.WriteLine(" Contact Phone : {0}", supplierData.Phone);
Console.WriteLine(" Products Supplied : {0}", supplierData.Products.Count());
foreach (var productData in supplierData.Products)
{
Console.WriteLine(" Product: " + productData.ProductName);
}
}
You can also use the SkipWhile
extension method to retrieve all results once the criteria are matched:
Northwind dataContext = new Northwind(Settings.Default.NorthwindConnectionString);
var query =
dataContext.Suppliers.GroupJoin(dataContext.Products,
s => s.SupplierID, p => p.SupplierID,
(s, products) => new
{
s.CompanyName,
s.ContactName,
s.Phone,
Products = products
}).OrderByDescending(supplierData => supplierData.Products.Count())
.SkipWhile(supplierData =>
{
return supplierData.Products.Count() > 3;
});
Console.WriteLine("Suppliers that provide three or less products: {0}",
query.Count());
foreach (var supplierData in query)
{
Console.WriteLine(" Company Name : {0}",supplierData.CompanyName);
Console.WriteLine(" Contact Name : {0}", supplierData.ContactName);
Console.WriteLine(" Contact Phone : {0}", supplierData.Phone);
Console.WriteLine(" Products Supplied : {0}", supplierData.Products.Count());
foreach (var productData in supplierData.Products)
{
Console.WriteLine(" Product: " + productData.ProductName);
}
}
Discussion
In this example using LINQ to SQL, the number of products each supplier provides is determined, and the result set is sorted in descending order by product count:
var query = dataContext.Suppliers.GroupJoin(dataContext.Products, s => s.SupplierID, p => p.SupplierID, (s, products) => new { s.CompanyName, s.ContactName, s.Phone, Products = products }).OrderByDescending(supplierData => supplierData.Products.Count())
From that result, the supplier data for suppliers is only accepted into the final result set if they provide more than 3 products and the results are displayed. TakeWhile
is used with a lambda expression to determine if the product count is greater than 3, and if so, the supplier is accepted into the result set:
.TakeWhile(supplierData =>
{
return supplierData.Products.Count() > 3;
});
If SkipWhile
was used instead, all of the suppliers that provide 3 or fewer products would be returned:
.SkipWhile(supplierData =>
{
return supplierData.Products.Count() > 3;
});
Being able to write code-based conditions allows for more flexibility than the regular Take
and Skip
methods, which are absolute based on record count, but keep in mind that once the condition is hit for either TakeWhile
or SkipWhile
, you get all records after that, which is why sorting the result set before using these is important.
The query also uses GroupJoin
, which is comparable to a SQL LEFT or RIGHT OUTER JOIN, but the result is not flattened. GroupJoin
produces a hierarchical result set instead of a tabular one, which is used to get the collection of Products by Supplier in this example:
dataContext.Suppliers.GroupJoin
(dataContext.Products,
s => s.SupplierID, p => p.SupplierID,
1.10. Using LINQ with Collections That Don’t Support IEnumerable<T>
Problem
There are a whole bunch of collections that don’t support the generic versions of IEnumerable
or ICollection
but that do support the original nongeneric versions of the IEnumerable
or ICollection
interfaces, and you would like to be able to query those collections using LINQ.
Solution
The type cannot be inferred from the original IEnumeration
or ICollection
interfaces, so it must be provided using either the OfType<T>
or Cast<T>
extension methods or by specifying the type in the from
clause, which inserts a Cast<T> for you. The first example uses Cast<XmlNode>
to let LINQ know that the elements in the XmlNodeList
returned from XmlDocument.SelectNodes
are of type XmlNode
. For an example of how to use the OfType<T>
extension method, see the Discussion section:
// Make some XML with some types that you can use with LINQ // that don't support IEnumerable<T> directly XElement xmlFragment = new XElement("NonGenericLinqableTypes", new XElement("IEnumerable", new XElement("System.Collections", new XElement("ArrayList"), new XElement("BitArray"), new XElement("Hashtable"), new XElement("Queue"), new XElement("SortedList"), new XElement("Stack")),* new XElement("System.Net", new XElement("CredentialCache")), new XElement("System.Xml", new XElement("XmlNodeList")), new XElement("System.Xml.XPath", new XElement("XPathNodeIterator"))), new XElement("ICollection", new XElement("System.Diagnostics", new XElement("EventLogEntryCollection")), new XElement("System.Net", new XElement("CookieCollection")), new XElement("System.Security.AccessControl", new XElement("GenericAcl")), new XElement("System.Security", new XElement("PermissionSet")))); XmlDocument doc = new XmlDocument(); doc.LoadXml(xmlFragment.ToString()); // Select the names of the nodes under IEnumerable that have children and are // named System.Collections and contain a capital S and return that list in descending order var query = from node in doc.SelectNodes("/NonGenericLinqableTypes/IEnumerable/*"). Cast<XmlNode>() where node.HasChildNodes && node.Name == "System.Collections" from XmlNode xmlNode in node.ChildNodes where xmlNode.Name.Contains('S') orderby xmlNode.Name descending select xmlNode.Name; foreach (string name in query) { Console.WriteLine(name); }
The second example works against the Application event log and retrieves the errors that occurred in the last 6 hours. The type of the element in the collection (EventLogEntry
) is provided next to the from
keyword, which allows LINQ to infer the rest of the information it needs about the collection element type:
EventLog log = new EventLog("Application"); var query = from EventLogEntry entry in log.Entries where entry.EntryType == EventLogEntryType.Error && entry.TimeGenerated > DateTime.Now.Subtract(new TimeSpan(6, 0, 0)) select entry.Message; Console.WriteLine("There were " + query.Count<string>() + " Application Event Log error messages in the last 6 hours!"); foreach (string message in query) { Console.WriteLine(message); }
Discussion
Cast<T>
will transform the IEnumerable
into IEnumerable<T>
so that LINQ can access each of the items in the collection in a strongly typed manner. Before using Cast<T>
, it would behoove you to check that all elements of the collection really are of type T, or you will get an InvalidCastException
if the type of the element is not convertible to the type T specified, because all elements will be cast using the type. Placing the type of the element next to the from
keyword acts just like a Cast<T>
:
ArrayList stuff = new ArrayList();
stuff.Add(DateTime.Now);
stuff.Add(DateTime.Now);
stuff.Add(1);
stuff.Add(DateTime.Now);
var expr = from item in stuff.Cast
<DateTime>()
select item;
// attempting to cast the third element throws InvalidCastException
foreach (DateTime item in expr)
{
Console.WriteLine(item);
}
Tip
Note that again because of the deferred execution semantics that the exception that occurs with Cast<T>
or from
only happens once that element has been iterated to.
Another way to approach this issue would be to use OfType<T>
, as it will only return the elements of a specific type and not try to cast elements from one type to another:
var expr = from item in stuff.OfType
<DateTime>()
select item;
// only three elements, all DateTime returned. No exceptions
foreach (DateTime item in expr)
{
Console.WriteLine(item);
}
Get C# 3.0 Cookbook, 3rd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.