You have two DataSet
objects with the same schema but containing different data and need to determine the difference between the data in the two.
Compare the two DataSet
objects with the GetDataSetDifference()
method in this solution and return the differences between the data as a DiffGram
.
The solution creates two DataSet
objects each containing a different subset of records from the HumanResources.Department
table in AdventureWorks
. The GetDataSetDifference()
method takes two DataSet
objects with identical schemas as arguments and returns a DiffGram
of the differences between the data in the two.
The C# code in Program.cs in the project DetermineDataDifferenceDataSets
is shown in Example 4-1.
Example 4-1. File: Program.cs for DetermineDataDifferenceDataSets solution
using System; using System.Data; using System.Data.SqlClient; using System.IO; namespace DetermineDataDifferenceDataSets { class Program { static void Main(string[] args) { string sqlConnectString = "Data Source=(local);" + "Integrated security=SSPI;Initial Catalog=AdventureWorks;"; // Fill DataSet A string sqlSelectA = "SELECT * FROM HumanResources.Department " + "WHERE DepartmentID BETWEEN 1 AND 5"; DataSet dsA = new DataSet(); SqlDataAdapter da = new SqlDataAdapter(sqlSelectA, sqlConnectString); da.TableMappings.Add("Table", "TableA"); da.FillSchema(dsA, SchemaType.Source); da.Fill(dsA); // Set the primary key dsA.Tables["TableA"].PrimaryKey = new DataColumn[] { dsA.Tables["TableA"].Columns["DepartmentID"] }; // Fill DataSet B string sqlSelectB = "SELECT * FROM HumanResources.Department " + "WHERE DepartmentID BETWEEN 4 AND 8"; DataSet dsB = new DataSet(); da = new SqlDataAdapter(sqlSelectB, sqlConnectString); da.TableMappings.Add("Table", "TableB"); da.FillSchema(dsB, SchemaType.Source); da.Fill(dsB); // Set the primary key dsB.Tables["TableB"].PrimaryKey = new DataColumn[] { dsB.Tables["TableB"].Columns["DepartmentID"] }; Console.WriteLine(GetDataSet Difference(dsA, dsB)); Console.WriteLine("\nPress any key to continue."); Console.ReadKey(); } private static string GetDataSetDifference(DataSet ds1, DataSet ds2) { // Accept any edits within the DataSet objects. ds1.AcceptChanges(); ds2.AcceptChanges(); // Create a DataSet to store the differences. DataSet ds = new DataSet(); DataTable dt1Copy = null; // Iterate over the collection of tables in the first DataSet. for (int i = 0; i < ds1.Tables.Count; i++) { DataTable dt1 = ds1.Tables[i]; DataTable dt2 = ds2.Tables[i]; // Create a copy of the table in the first DataSet. dt1Copy = dt1.Copy(); // Iterate over the collection of rows in the // copy of the table from the first DataSet. foreach (DataRow row1 in dt1Copy.Rows) { DataRow row2 = dt2.Rows.Find(row1["DepartmentID"]); if (row2 == null) { // Delete rows not in table 2 from table 1. row1.Delete(); } else { // Modify table 1 rows that are different from // table 2 rows. for (int j = 0; j < dt1Copy.Columns.Count; j++) { if (row2[j] == DBNull.Value) { // Column in table 2 is null, // but not null in table 1 if (row1[j] != DBNull.Value) row1[j] = DBNull.Value; } else if (row1[j] == DBNull.Value) { // Column in table 1 is null, // but not null in table 2 row1[j] = row2[j]; } else if (row1[j].ToString() != row2[j].ToString()) { // Neither column in table 1 nor // table 2 is null, and the // values in the columns are // different. row1[j] = row2[j]; } } } } foreach (DataRow row2 in dt2.Rows) { DataRow row1 = dt1Copy.Rows.Find(row2["DepartmentID"]); if (row1 == null) { // Insert rows into table 1 that are in table 2 // but not in table 1. dt1Copy.LoadDataRow(row2.ItemArray, false); } } // Add the table to the difference DataSet. ds.Tables.Add(dt1Copy); } // Write a XML DiffGram with containing the differences between tables. StringWriter sw = new StringWriter(); ds.WriteXml(sw, XmlWriteMode.DiffGram); return sw.ToString(); } } }
The output is shown in Figure 4-1.
A DiffGram
is an XML format used to specify original and current values for the data elements in a DataSet
. It does not include any schema information. The DiffGram
is used by .NET Framework applications as the serialization format for the contents of a DataSet
including changes made to the Dataset
.
A DiffGram
is XML-based, which makes it platform-and application-independent. It is not, however, widely used or understood outside of Microsoft .NET applications.
The DiffGram
format is divided into three sections: current, original, and errors. The original and current data in the DiffGram
can also be used to report the differences between data in two DataSet
objects. For more information about the DiffGram
XML format, see Recipe 9.4.
The solution contains a method GetDataSetDifference()
that takes two DataSet
objects with the same schema as arguments and returns a DiffGram
containing the differences in data when the second DataSet
is compared to the first. Table 4-1 describes how the differences between the DataSet
objects appear in the DiffGram
.
Table 4-1. DiffGram representation of DataSet differences
Condition | DiffGram representation |
---|---|
Row is the same in both | Row data appears only in the current data section of the |
Row is in both | Row data appears in the current data section of the |
Row is in | Row data appears in the current data section of the |
Row is | Row data appears only in the original |
The sample begins by loading two different subsets of data from the HumanResources. Department
table into separate DataSet
objects. In this example, the DataSet
objects both contain just a single table. To determine the difference between the DataSet
objects, the tables within the DataSet
objects are compared as described next and changes are applied to the data in a copy of the first DataSet
until it matches the second DataSet
. Once all differences in all tables are processed, the DiffGram
of the copy of the first DataSet
contains the difference in the second DataSet
when compared to the first DataSet
.
More specifically, a copy of each table is made as it is processed. The data in the copy of the first table is modified to make it consistent with the data in the second table. The modified copy of the first table is then added to the DataSet
containing the differences between the two DataSet
objects.
The process of modifying the data in the copy of the first table to match the data in second table involves several steps:
Rows that are in the copy of the first table but not in the second table (based on the primary key value) are deleted from the copy of the first table.
If the row is found in the second table, the columns are compared and any differences in the columns in the second table are changed in the column in the first table.
Rows that are in the second table but not in the copy of the first table are inserted into the copy of the first table without accepting changes.
Get ADO.NET 3.5 Cookbook, 2nd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.