RELATIONS vs. RELVARS

Now, it’s entirely possible that you already knew everything I’ve been telling you in this chapter so far; in fact, I rather hope you did, though I also hope that didn’t mean you found the material boring. Anyway, now I come to something you might not know already. The fact is, historically there’s been a lot of confusion over yet another logical difference: namely, that between relations as such, on the one hand, and relation variables on the other.

Forget about databases for a moment; consider instead the following simple programming language example. Suppose I say in some programming language:

    DECLARE N INTEGER ... ;

Then N here is not an integer. Rather, it’s a variable, whose values are integers as such—different integers at different times. We all understand that. Well, in exactly the same way, if I say in SQL—

     CREATE TABLE T ... ;

—then T is not a table: It’s a variable, a table variable or (as I would prefer to say, ignoring various SQL quirks such as duplicate rows and left to right column ordering) a relation variable, whose values are relations as such (different relations at different times).

Take another look at Figure 1-3, the suppliers-and-parts database. That figure shows three relation values—namely, the relation values that happen to exist in the database at some particular time. But if we were to look at the database at some different time, we would probably see three different relation values appearing in their place. In other words, S, P, and SP in that database are really variables: relation variables, to be precise. For example, suppose the relation variable S currently has the value—the relation value, that is—shown in Figure 1-3, and suppose we delete the set of tuples (actually there’s only one) for suppliers in Athens:

     DELETE S WHERE CITY = 'Athens' ;

Here’s the result:

SNO

SNAME

STATUS

CITY

S1

Smith

20

London

S2

Jones

10

Paris

S3

Blake

30

Paris

S4

Clark

20

London

Conceptually, what’s happened here is that the old value of S has been replaced in its entirety by a new value. Of course, the old value (with five tuples) and the new one (with four) are very similar, in a sense, but they certainly are different values. In fact, the DELETE just shown is logically equivalent to, and indeed shorthand for, the following relational assignment:

     S := S MINUS ( S WHERE CITY = 'Athens' ) ;

As with all assignments, the effect here is that (a) the source expression on the right side is evaluated and then (b) the value that’s the result of that evaluation is then assigned to the target variable on the left side, with the overall result already explained.

Aside: I can’t show the foregoing assignment in SQL because SQL doesn’t directly support relational assignment. Instead, I’ve shown it (as well as the original DELETE) in a more or less self-explanatory language called Tutorial D. Tutorial D is the language Hugh Darwen and I use to illustrate relational ideas in our book Databases, Types, and the Relational Model: The Third Manifesto (see Appendix G)—and I’ll use it in the present book too, when I’m explaining relational concepts.[16] But since my intended audience is SQL practitioners, I’ll show SQL analogs as well, most of the time. Note: A BNF grammar for Tutorial D can be found in Appendix D. End of aside.

To repeat, DELETE is shorthand for a certain relational assignment—and, of course, an analogous remark applies to INSERT and UPDATE also: They too are basically just shorthand for certain relational assignments. Thus, as I mentioned in the section A REVIEW OF THE ORIGINAL MODEL, relational assignment is the fundamental update operator in the relational model; indeed it’s the only update operator we really need, logically speaking.

So there’s a logical difference between relation values and relation variables. The trouble is, the database literature has historically used the same term, relation, to stand for both, and that practice has certainly led to confusion.[17] In this book, therefore, I’ll distinguish very carefully between the two from this point forward—I’ll talk in terms of relation values when I mean relation values and relation variables when I mean relation variables. However, I’ll also abbreviate relation value, most of the time, to just relation (exactly as we abbreviate integer value most of the time to just integer). And I’ll abbreviate relation variable most of the time to relvar; for example, I’ll say the suppliers-and-parts database contains three relvars (more precisely, three base relvars).

As an exercise, you might like to go back over the text of this chapter so far and see exactly where I used the term relation when I really ought to have been using the term relvar instead (or as well).



[16] Several reviewers complained about this fact—that is, they felt I should be using SQL itself, not some nonstandard language like Tutorial D, in order to illustrate relational ideas. (One even suggested the book be renamed “Tutorial D and Relational Theory”!) But SQL as such was never intended to be a vehicle for illustrating relational ideas, while Tutorial D explicitly was; and in any case, SQL simply isn’t adequate to the task. Indeed, if it were, a book like this one wouldn’t be necessary in the first place.

[17] SQL makes the same mistake, of course, because it too has just one term, table, that has to be understood as sometimes meaning a table value and sometimes a table variable.

Get SQL and Relational Theory, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.