Tuesday, December 12, 2006

Database Normalization

I was talking to a developer a while ago and when I started talking about database normalization the other person was kind of lost in a vaccum. I was wondering what happened, but then realized that he does not know much about the normalization and he finds it kind of boring. Boring because all the text books give very lengthy explanation of normalization and most of the times they are boring to read :)


Being a Developer It's important to understand the differences between 3NF, BCNF, 4NF, and 5NF along with other normal forms. Here are concise definitions of each normal form.

First normal form (1NF)
Each table must have a primary key, i.e., a minimal set of attributes that can uniquely identify a record. Eliminate repeating groups (categories of data that would seem to be required a different number of times on different records) by defining keyed and non-keyed attributes appropriately. Atomicity: Each attribute must contain a single value, not a set of values.

Second normal form (2NF)
The database must meet all the requirements of the 1NF. In addition to that, if a table has a composite key, all attributes must be related to the whole key. And, data that is redundantly duplicated across multiple rows of a table is moved out to a separate table.

Third normal form (3NF)
In Addition to all the requirements for 2NF, The data stored in a table must be dependent only on the primary key and not on any other field in the table. Any field that is dependent not only on the primary key but also on another field is moved out to a separate table.

Boyce-Codd normal form (BCNF)
There must be no non-trivial functional dependencies of attributes on something other than a superset of a candidate key (called a superkey).

Fourth normal form (4NF)
There must be no non-trivial multi-valued dependencies of attribute sets on something other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in the BCNF and multi-valued dependencies are functional dependencies. The 4NF removes unwanted data structures: multi-valued dependencies.

Fifth normal form (5NF)
There must be no non-trivial join dependencies that do not follow from the key constraints. A table is said to be in the 5NF if and only if it is in 4NF and every join dependency in it is implied by the candidate keys.

Happy Reading :)

Until Next time... :)

No comments: