Chapter 4
No Duplicates, No Nulls
I haven’t even mentioned yet the way the silly notions
Discussed so far interreact and lead us into oceans
Of complication and despond and general distress.
Are two nulls equal (duplicates)? I fear, both NO and YES.
—Anon.: Where Bugs Go
In the previous chapter, I said the following (approximately):
Relations never contain duplicate tuples, because the body of a relation is a set (a set of tuples) and sets in mathematics don’t contain duplicate elements.
Relations never contain nulls, because the body of a relation is a set of tuples, and tuples in turn never contain nulls.
I also suggested that since there was so much to be said about these topics, it was better to devote a separate chapter to them. This is that chapter. Note: By definition, the topics in question are SQL topics, not relational ones; in this chapter, therefore, I’ll use the terminology of SQL rather than that of the relational model (for the most part, at any rate).
WHAT’S WRONG WITH DUPLICATES?
There are numerous practical arguments in support of the position that duplicate rows (“duplicates” for short) should be prohibited. Here I want to emphasize just one—but I think it’s a powerful one.1 However, it does rely on certain notions I haven’t discussed yet in this book, so I need to make a couple of preliminary assumptions:
I assume you know that relational DBMSs include a component called the optimizer,2 whose job is to try to figure out the best way to implement user queries ...