Data Design

Tables and columns present a logical view of the data in a relational database. The flexibility of a relational database gives you many options for grouping the atomic pieces of data, represented by the columns, into a set of tables. To use Oracle effectively, you must understand and follow some firmly established principles of database design.

The topic of database design is vast and deep: we won’t even pretend to offer more than a cursory overview. For more information, we recommend the book Oracle Design by Dave Ensor and Ian Stevenson (see Appendix B for details).

When E. F. Codd created the concept of a relational database in the 1960s, he also began work on the concept of normalized data design. The theory behind normalized data design is pretty straightforward: a table should contain only the information that’s directly related to the key value of the table. The process of assembling these logical units of information is called normalization of the database design.

The concept of normalized table design was tailored to the capabilities of the relational database. Because you could join data from different tables together in a query, there was no need to keep all the information associated with a particular object together in a single record. You could decompose the information into associated units and simply join the appropriate units together when you needed information that crossed table boundaries.

There are many different methodologies for normalizing data. Here’s one example. You start by defining all the data required by your application:

  1. Identify the objects your application needs to know (the entities ). Examples of entities, as shown in Figure 4-3, include employees, locations, and jobs.

  2. Identify the individual pieces of data, referred to by data modelers as attributes , for these entities. In Figure 4-3, employee name and salary are attributes. Typically, entities correspond to tables and attributes correspond to columns.

  3. As a potential last step in the process, identify relationships between the entities based on your business. These relationships are implemented in the database schema through the use of structures such as foreign keys. For example, the primary key of the DEPARTMENT NUMBER table would be a foreign key column in the EMPLOYEE NAME table used to identify the DEPARTMENT NUMBER in which an employee works.

Normalization is good because it avoids redundant data. Storing the department in every employee record not only would waste space but also would lead to an unmanageable data maintenance issue. If the department name changed, you would have to update every employee record. By normalizing the department data into a table and simply pointing to the appropriate row from the employee rows, you avoid duplication of data and the associated problems.

However, there is an even more important reason to go through the process of designing a normalized database. You can benefit from normalization because of the planning process that normalizing a data design entails. By really thinking about the way the intended applications use data, you get a much clearer picture of the needs the system is designed to serve. This understanding leads to a much more focused database and application.

Normalization also reduces the amount of data that any one row in a table contains. The less data in a row, the less I/O is needed to retrieve it, which helps to avoid this performance bottleneck. In addition, the smaller the data in a row, the more rows are retrieved per data block, which increases the likelihood that more than one desired row will be retrieved in an I/O operation. And the smaller the row, the more rows will be kept in Oracle’s system buffers, which also increases the likelihood that a row will be available in memory when it’s needed, thereby avoiding the need for any disk I/O at all.

Finally, the process of normalization includes the creation of foreign key relationships and other data constraints. These relationships add a level of data integrity to your database design. We’ll describe these relationships in more detail in Section 4.5.

Figure 4-3 shows a simple list of attributes grouped into entities and linked by a foreign key relationship.

The normalization process

Figure 4-3. The normalization process

Creating a normalized data design isn’t the only data design work you will have to do. Once you’ve completed an optimal logical database design, you must go back and consider what indexes you should add to improve the anticipated performance of the database and whether you should designate any tables as part of a cluster or hash cluster.

Since adding these types of performance-enhancing data structures doesn’t affect the logical representation of the database, you can always make these types of modifications later when you see the way an application uses the database in test mode or in production.

Get Oracle Essentials: Oracle9i, Oracle8i and Oracle8, Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.