O'Reilly logo

Access Data Analysis Cookbook by Wayne S. Freeze, Ken Bluttman

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Combining Data with Union Queries

Problem

I need to combine sets of data so I can run analyses on them. The sets of data are identical but sit in different tables. There's no way to combine the data in the query grid. I could use append queries to copy data from the various tables to a master table, but this is inefficient. The data in the smaller tables changes from time to time, and having to rerun the appends is a nuisance. Isn't there a way to simply combine the data at any time as needed, so the latest data in the smaller tables is always present?

Solution

A union query is the perfect vehicle for combining identically structured data. To create a union query, place Union SQL clauses between the Select statements that query the tables.

Figure 1-35 shows three tables with an identical structure. Let's take a look at how to combine the data from these three tables.

Union queries must be written in the SQL pane of the query designer. It is not possible to represent them in the query grid. Here's a SQL statement written in the SQL pane of the query designer:

	SELECT * From SeasonScores_Putnam
	Union
	SELECT * From SeasonScores_Rockland
	Union
	SELECT * From SeasonScores_Westchester;
Three tables with identically structured data

Figure 1-35. Three tables with identically structured data

Running the query returns a single set of data, shown in Figure 1-36.

The result of running a union query

Figure 1-36. The result of running a union query

All the records from the three tables are now together in one place. This query can be saved and then used as the source for other queries and further analysis. For example, this saved query can be used in a query that calculates an average or some other summarization. If and when any data changes back in the source tables, the new data will flow through to the output of this union query, as each time it is rerun, it uses the latest data from the source tables.

Discussion

A hard-and-fast rule is that all the selects feeding into a union query must have the same number of fields. In the previous example, this was a given because the three source tables were identical in structure. However, imagine assembling a list of names from various data tables, such as a Contacts table, a Prospects table, and a HolidayList table.

Figure 1-37 shows the design of these three tables. Each table has a field for a first name and a last name, although the fields are not named exactly the same. Also note that the number of fields is not consistent among all the tables. To avoid this being an issue, you must specify actual field names in the Select statements and ensure that you specify the same number of fields from each table.

Three tables with similar information

Figure 1-37. Three tables with similar information

A union SQL statement that will combine the first and last names from these tables can be written like this:

	SELECT FirstName, LastName From Contacts;
	Union
	Select [First Name], [Last Name] From HolidayList
	Union
	Select [Prospect First Name], [Prospect Last Name]
	From Prospects
	Order By LastName, FirstName

Tip

Brackets ([]) must be placed around field and table names that contain spaces.

The result of the query is shown in Figure 1-38. The field names presented in the query (FirstName and LastName) are taken from the first Select statement.

A union query based on three tables

Figure 1-38. A union query based on three tables

While each source table on its own may be free of duplicates, it is possible that some duplicates will occur in the combined output. For example, the same person might be in the Contacts table and the Prospects table. SQL provides a way to handle duplicates that appear when union queries are run.

By default, a union query will drop duplicates. If you want to include them in the result, you'll need to use the Union All construct, as shown here:

	SELECT FirstName, LastName From Contacts;
	Union All
	Select [First Name], [Last Name] From HolidayList
	Union All
	Select [Prospect First Name], [Prospect Last Name]
	From Prospects
	Order By LastName, FirstName

Including the All keyword forces duplicates to be preserved in the query result.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required