66 Modernizing IBM Eserver iSeries Application Data Acess - A Roadmap Cornerstone
6.1 Database normalization
Normalization is the process of removing redundant data from your tables in order to improve
storage efficiency, data integrity, and scalability
. A table in a relational database is said to be
in a certain normal form if it satisfies certain constraints. Edgar Codd’s original definition
defined three such forms, but there are now other forms accepted. Each normal form
represents a stronger condition than the previous one, which means that a higher level of
normalization cannot be achieved until the previous levels have been achieved.
The
First Normal Form (or 1NF) involves removal of redundant data from horizontal rows.
We want to ensure that there is no duplication of data in a given row, and that every column
stores the least amount of information possible (making the field atomic). For example,
normalization eliminates repeating groups by putting each into a separate table and
connecting them with a primary key-foreign key relationship.
The
Second Normal Form (or 2NF) deals with redundancy of data in vertical columns.
The
Third Normal Form deals with looking for data in the tables that is not fully dependant on
the primary key, but dependant on another value in the table. This is an ideal form for OLTP
environments.
It is not within the scope of this book to explain the normalization process. The reality is that
many of the iSeries customers have not taken the time and effort to normalize their
databases. In this stage of the modernization process it would be a good idea to take some
time and revisit the database design.
The steps involved in this stage are:
1. Eliminate unnecessary columns from the SQL tables.
In the process of normalization some unnecessary columns will be eliminated or moved to
other tables. Many tables contain columns that were intended for some purpose, however,
over the course of business these columns are no longer used or were never used at all.
This is the opportunity to identify and remove these columns.
2. Establish a data dictionary.
A data dictionary was actually started in stage 1 when we were defining standard
abbreviations for table names. The following is a list of some of the contents of the data
dictionary:
Object naming conventions
Column naming conventions
Function naming conventions
Application naming conventions
Standard abbreviations
All existing database columns
Object relationships
Business rules
3. Establish data domains.
This is the process of grouping columns with like attributes into classes or domains. We
can implement the data dictionary with established domains using Field Reference files,
since the SQL CREATE TABLE statement can now reference this file.
4. Create a logical database model.
5. Implement the model.

Get Modernizing IBM eServer iSeries Application Data Access - A Roadmap Cornerstone now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.