Chapter 4. Database Design Principles
In Chapter 1 I tried to present a convincing case for why most databases should be modeled as relational databases rather than as single-table flat databases, and to make clear why I split the single LIBRARY_FLAT table into four separate tables: AUTHORS, BOOKS, PUBLISHERS, and BOOK/AUTHOR.
However, for large real-life databases, it is not always clear how to split the data into multiple tables. As I mentioned in Chapter 1, the goal is to minimize redundancy, without losing any information.
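To make the goal concrete, here is a minimal Python sketch of splitting a flat table so that repeated data is stored only once, and then rejoining the pieces to confirm that nothing was lost. The rows, the column layout (title, publisher name, publisher phone), and the function names split_publishers and rejoin are all hypothetical, chosen only for illustration; they are not the actual contents of LIBRARY_FLAT.

```python
# Hypothetical flat rows: (title, publisher name, publisher phone).
# The publisher's phone number is repeated for every book it publishes.
flat_rows = [
    ("Book A", "Publisher X", "555-0100"),
    ("Book B", "Publisher X", "555-0100"),
    ("Book C", "Publisher Y", "555-0199"),
]


def split_publishers(rows):
    """Split flat rows into a publishers table and a books table.

    Each distinct publisher is stored once and given an ID (an
    illustrative surrogate key); each book row refers to that ID.
    """
    publisher_ids = {}  # (name, phone) -> pub_id
    books = []
    for title, pub_name, pub_phone in rows:
        key = (pub_name, pub_phone)
        if key not in publisher_ids:
            publisher_ids[key] = len(publisher_ids) + 1
        books.append((title, publisher_ids[key]))
    pub_table = [
        (pub_id, name, phone)
        for (name, phone), pub_id in publisher_ids.items()
    ]
    return pub_table, books


def rejoin(pub_table, books):
    """Rebuild the flat rows by looking up each book's publisher."""
    lookup = {pub_id: (name, phone) for pub_id, name, phone in pub_table}
    return [(title, *lookup[pub_id]) for title, pub_id in books]


pub_table, books = split_publishers(flat_rows)
```

In this sketch each publisher's phone number is stored once rather than once per book, and rejoin(pub_table, books) reproduces the original rows, which is one informal way to check that a split has not lost information.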
The problem of effective database design is a complex one. Most people consider it an art rather than a science. This means that intuition plays a major role in good design. Nonetheless, there is a considerable theory of database design, and it can be quite complicated. My goal in this chapter is to touch upon the general ideas, without becoming involved in the details. Hopefully, this discussion will provide a helpful guide to the intuition needed for database design.
Redundancy
As we saw in Chapter 1, redundant data tends to inflate the size of a database, which can be a very serious problem for medium to large databases. Moreover, redundancy can lead to several types of anomalies, as discussed earlier. To understand the problems that can arise from redundancy, we need to take a closer look at what redundancy means.
Let us begin by observing that the attributes of a table scheme can be classified into three groups:
Attributes used strictly for identification ...