O'Reilly logo

SQL and Relational Theory, 2nd Edition by C.J. Date

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

PROPERTIES OF RELATIONS

Now let’s get back to our examination of basic relational concepts. In this section, I want to focus on some specific properties of relations themselves. First of all, every relation has a heading and a body: The heading is a set of attributes (where by the term attribute I mean, very specifically, an attribute-name/type-name pair, and no two attributes in the same heading have the same attribute name), and the body is a set of tuples that conform to that heading. In the case of the suppliers relation in Figure 1-3, for example, there are four attributes in the heading and five tuples in the body. Note, therefore, that a relation doesn’t really contain tuples—it contains a body, and that body in turn contains the tuples—but we do usually talk as if relations contained tuples directly, for simplicity.

By the way, although it’s strictly correct to say the heading consists of attribute-name/type-name pairs, it’s usual to omit the type names in pictures like Figure 1-3 and hence to pretend the heading is just a set of attribute names. For example, the STATUS attribute does have a type—INTEGER, let’s say—but I didn’t show it in Figure 1-3. But you should never forget it’s there!

Next, the number of attributes in the heading is the degree (sometimes the arity), and the number of tuples in the body is the cardinality. For example, relation S in Figure 1-3 has degree 4 and cardinality 5; likewise, relation P in that figure has degree 5 and cardinality 6, and relation SP in that figure has degree 3 and cardinality 12. Note: The term degree is used in connection with tuples also.[11] For example, the tuples in relation S are (like relation S itself) all of degree 4.

Next, relations never contain duplicate tuples. This property follows because a body is defined to be a set of tuples, and sets in mathematics don’t contain duplicate elements. Now, SQL fails here, as I’m sure you know: SQL tables are allowed to contain duplicate rows and thus aren’t relations, in general. Please understand, therefore, that throughout this book I always use the term “relation” to mean a relation—without duplicate tuples, by definition—and not an SQL table. Please understand too that relational operations always produce a result without duplicate tuples, again by definition. For example, projecting the suppliers relation of Figure 1-3 on CITY produces the result shown here on the left and not the one on the right:

image with no caption

(The result on the left can be obtained via the SQL query SELECT DISTINCT CITY FROM S. Omitting that DISTINCT leads to the nonrelational result on the right. Note in particular that the table on the right has no double underlining; that’s because it has no key, and hence no primary key a fortiori.)

Next, the tuples of a relation are unordered, top to bottom. This property follows because, again, a body is defined to be a set, and sets in mathematics have no ordering to their elements (thus, for example, {a,b,c} and {c,a,b} are the same set in mathematics, and a similar remark naturally applies to the relational model). Of course, when we draw a relation as a table on paper, we do have to show the rows in some top to bottom order, but that ordering doesn’t correspond to anything relational. In the case of the suppliers relation as depicted in Figure 1-3, for example, I could have shown the rows in any order—say supplier S3, then S1, then S5, then S4, then S2—and the picture would still represent the same relation. Note: The fact that relations have no ordering to their tuples doesn’t mean queries can’t include an ORDER BY specification, but it does mean such queries produce a result that’s not a relation. ORDER BY is useful for displaying results, but it isn’t a relational operator as such.

In similar fashion, the attributes of a relation are also unordered, left to right, because a heading too is a mathematical set. Again, when we draw a relation as a table on paper, we have to show the columns in some left to right order, but that ordering doesn’t correspond to anything relational. In the case of the suppliers relation as depicted in Figure 1-3, for example, I could have shown the columns in any left to right order—say STATUS, SNAME, CITY, SNO—and the picture would still represent the same relation in the relational model. Incidentally, SQL fails here too: SQL tables do have a left to right ordering to their columns (another reason why SQL tables aren’t relations, in general). For example, these two pictures represent the same relation but different SQL tables:

image with no caption

(The corresponding SQL queries are SELECT SNO, CITY FROM S and SELECT CITY, SNO FROM S, respectively. Now, you might be thinking that the differences between these two queries, and between these two tables, are hardly very significant; in fact, however, they have some serious consequences, some of which I’ll be touching on in later chapters. See, for example, the discussion of SQL’s explicit JOIN operator in Chapter 6.)

Finally, relations are always normalized (equivalently, they’re in first normal form, 1NF).[12] Informally, what this means is that, in terms of the tabular picture of a relation, at every row and column intersection we always see just a single value. More formally, it means that every tuple in every relation contains just a single value, of the appropriate type, in every attribute position. Note: I’ll have quite a lot more to say on this particular issue in the next chapter.

Before I finish with this section, I’d like to emphasize something I’ve touched on several times already: namely, the fact that there’s a logical difference between a relation as such, on the one hand, and a picture of a relation as shown in, for example, Figure 1-1 and Figure 1-3, on the other. To say it one more time, the constructs in Figure 1-1 and Figure 1-3 aren’t relations at all but, rather, pictures of relations—which I generally refer to as tables, despite the fact that table is a loaded word in SQL contexts. Of course, relations and tables do have certain points of resemblance, and in informal contexts it’s usual, and usually acceptable, to say they’re the same thing. But when we’re trying to be precise—and right now I am trying to be a little bit precise—then we do have to recognize that the two concepts are not identical.

As an aside, I observe that, more generally, there’s a logical difference between a thing of any kind and a picture of that thing. There’s a famous painting by Magritte that beautifully illustrates the point I’m trying to make here. The painting is of an ordinary tobacco pipe, but underneath Magritte has written Ceçi n’est pas une pipe ... the point being, of course, that obviously the painting isn’t a pipe—instead, it’s a picture of a pipe.

All of that being said, I should now say too that it’s actually a major advantage of the relational model that its basic abstract object, the relation, does have such a simple representation on paper; it’s that simple representation on paper that makes relational systems easy to use and easy to understand, and makes it easy to reason about the way such systems behave. However, it’s unfortunately also the case that that simple representation does suggest some things that aren’t true (e.g., that there’s a top to bottom tuple ordering).

And one further point: I’ve said there’s a logical difference between a relation and a picture of a relation. The concept of logical difference derives from a dictum of Wittgenstein’s:

All logical differences are big differences.

This notion is an extraordinarily useful one; as a “mind tool,” it’s a great aid to clear and precise thinking, and it can be very helpful in pinpointing and analyzing some of the confusions that are, unfortunately, all too common in the database world. I’ll be appealing to it many times in the pages ahead. Meanwhile, let me point out that we’ve encountered quite a few important logical differences already. Here are some of them:

  • SQL vs. the relational model

  • Model vs. implementation

  • Data model (first sense) vs. data model (second sense)

And we’ll be meeting many more in the pages ahead.

Some Crucial Points

At this juncture I’d like to mention some crucial points that I’ll be elaborating on in later chapters (especially Chapter 3). The points in question are these:

  • Every subset of a tuple is a tuple: For example, consider the tuple for supplier S1 in Figure 1-3. That tuple has four components, corresponding to the four attributes SNO, SNAME, STATUS, and CITY. And if we remove (say) the SNAME component, what’s left is indeed still a tuple: viz., a tuple with three components (a tuple of degree three).

  • Every subset of a heading is a heading: For example, consider the heading of the suppliers relation in Figure 1-3. That heading has four attributes: SNO, SNAME, STATUS, and CITY. And if we remove (say) the SNAME and STATUS attributes, what’s left is still a heading, a heading of degree two.

  • Every subset of a body is a body: For example, consider the body of the suppliers relation in Figure 1-3. That body has five tuples, corresponding to the five suppliers S1, S2, S3, S4, and S5. And if we remove (say) the S1 and S3 tuples, what’s left is still a body, a body of cardinality three.

Note: Perhaps I should state for the record here that throughout this book—in accordance with normal practice—I take expressions of the form “B is a subset of A” to include the possibility that A and B might be equal. Thus, for example, every tuple is a subset of itself (and so is every heading, and so is every body). When I want to exclude such a possibility, I’ll talk explicitly in terms of proper subsets. For example, our usual tuple for supplier S1 is certainly a subset of itself, but it isn’t a proper subset of itself. What’s more, the foregoing remarks apply equally to supersets, mutatis mutandis; for example, the tuple for supplier S1 is a superset of itself, but not a proper superset of itself.[13]

I’d also like to say something about the crucial notion of equality—especially as that notion applies to tuples and relations specifically. In general, two values are equal if and only if they’re the very same value. For example, the integer 3 is equal to the integer 3, and not to anything else—in particular, not to any other integer. In exactly the same way, two tuples are equal if and only if they’re the very same tuple. With reference to Figure 1-1, for example, the tuple for supplier S1 is equal to the tuple for supplier S1, and not to anything else—in particular, not to any other tuple. In other words, two tuples are equal if and only if (a) they involve exactly the same attributes and (b) corresponding attribute values are equal in turn.

Moreover (this might seem obvious, but it needs to be said), two tuples are duplicates of each other if and only if they’re equal.

Turning now to relations: In exactly the same way, two relations are equal if and only if they’re the very same relation. With reference to Figure 1-1, for example, the suppliers relation is equal to the suppliers relation and not to anything else—in particular, not to any other relation. In other words, two relations are equal if and only if, in turn, their headings are equal and their bodies are equal.

As I’ve already said, I’ll be returning to these matters in Chapter 3. Here let me just add that the notion of tuple equality in particular is absolutely fundamental—just about everything in the relational model is crucially dependent on it, as we’ll see.



[11] It’s also used in connection with keys (see Chapter 5).

[12] “First” normal form because, as I’m sure you know, it’s possible to define a series of “higher” normal forms—second normal form, third normal form, and so on—that are relevant to the discipline of database design.

[13] What I’ve described in this paragraph is the standard mathematical convention; however, you might have encountered a different convention in less formal contexts. To be specific, some people use “B is a subset of A” to mean what I mean when I say B is a proper subset of A, and use “B is a subset of or equal to A” to mean what I mean when I say B is a subset of A. Similarly for supersets, of course, mutatis mutandis.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required