AM I NORMAL? BASICS OF RELATIONAL DATABASE DESIGN

Now that some ground rules have been covered, we can proceed to designing our database. A database design still remains more of an art than an exact science, but there are few simple rules to follow to organize raw data into a well-behaved relational data model. Traditionally, the process is called database normalization, and its steps are measured in forms: first normal form (1NF), second normal form (2NF), and so on up to fifth normal form (5NF).

The rules for the 1NF state that:

  • Each row has to be unique.
  • There should be no repeating groups of data.
  • All columns have to contain only atomic values.

Atomic value refers to a singular indivisible piece of data. Hint: A column that contains a list of comma-separated values won't be considered atomic. Nothing helps more to achieve uniqueness than integrity — entity integrity. Every 1NF table must have a primary key.

The 2NF builds upon the predecessor, the 1NF, and adds an additional rule:

  • No partial functional dependencies are allowed, or only columns with full dependency on the primary key are allowed.

Now, what is partial functional dependency? It's when nonkey columns do not have full allegiance to the primary key and might be dependent on others. For example, in the data model shown on Figure 3-1 the column BK_PUBLISHER in the BOOKS table does not really depend on the primary key BK_ID, and ought to be isolated into a table of its own; we leave it in the BOOKS table partially ...

Get Discovering SQL: A Hands-On Guide for Beginners now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.