Aggregating Data

Aggregate functions allow you to discover the properties of a group of rows. You use them for purposes such as discovering how many rows there are in a table, how many rows in a table share a property (such as having the same name or date of birth), finding averages (such as the average temperature in November), or finding the maximum or minimum values of rows that meet some condition (such as finding the coldest day in August).

This section explains the GROUP BY and HAVING clauses, the two most commonly used SQL statements for aggregation. But first, it explains the DISTINCT clause, which is used to report unique results for the output of a query. When neither the DISTINCT nor the GROUP BY clause is specified, the returned raw data can still be processed using the aggregate functions that we describe in this section.

The DISTINCT Clause

To begin our discussion on aggregate functions, we’ll focus on the DISTINCT clause. This isn’t really an aggregate function, but more of a post-processing filter that allows you to remove duplicates. We’ve added it into this section because, like aggregate functions, it’s concerned with picking examples from the output of a query, rather than processing individual rows.

An example is the best way to understand DISTINCT. Consider this query:

mysql> SELECT DISTINCT artist_name FROM
    -> artist INNER JOIN album USING (artist_id); +---------------------------+ | artist_name | +---------------------------+ | New Order | | Nick Cave & The ...

Get Learning MySQL now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.