Database Normalization
Database is a collection of data that is organized in an easily way to accessed, managed and update. It is basically a place where the information is stored and operations are performed to store and retrieve the data. Database normalization is a technique that is used to reduce or avoids the redundancy and dependency of data. In normalization big tables are split in small tables and built the relationship between them. It is needed to avoid update and delete anomalies.
The main objective of normalization is to create relations where every dependency is on the key.
Data integrity: data in the database is consistent and satisfy all integrity constraints.
Data redundancy: if data in database is present in two different locations.
If we remove the redundancy it prevent insertion, deletion and update errors since the data is only available in one attribute of one table in the database.
There are many types of normalization
-
First normal form (1NF)
-
Second normal form(2NF)
-
Third normal form(3NF)
-
Boyce codd normal form(BCNF)
-
Fourth normal form(4NF)
-
Fifth normal form(5NF)
First normal form: in first normal form each cell contains the single value and attributes are atomic.
Example:
First name | Last name |
Tom
Marry Ramadhan |
Smith |
Peter
Michael |
Jackson |
Normalized to 1NF
First Name | Last Name |
Tom | Smith |
Marry | Smith |
Ramadhan | Smith |
Peter | Jackson |
Michael | Jackson |
Second Normal Form: To understand the 2NF we get to know the about the Prime attribute and the non prime attribute. Prime attribute is the part of the prime key and non prime attribute is not a part of prime key. All non prime attributes are fully dependent on the prime attribute.
Student Id | Name | Project Id | Project name |
1 | Tom | 1 | Management |
2 | Marry | 1 | Management |
3 | John | 2 | ERP |
Normalized in Second normal form:
Student id | Name | Project id |
1 | Tom | 1 |
2 | Marry | 1 |
3 | John | 2 |
Project Id | Project Name |
1 | Management |
2 | ERP |
Third Normal form: If the 2NF already in the table and the no exists of transitive dependency is there.
Student | Id | Project | Marks |
Tom | 1 | A | 90 |
John | 2 | B | 70 |
Mary | 3 | A | 90 |
Normalized form in 3NF:
Student | Id | Project |
Tom | 1 | A |
John | 2 | B |
Mary | 3 | A |
Project | Marks |
A | 90 |
B | 70 |
Boyce-codd normal form: It is a higher version of 3NF. The problems that are involved in the 3NF are solved in BCNF. In third normal form table the attributes does not have the multiple overlapping candidate keys is said to be in BCNF. More than one candidate is present and some common attributes are there in the relation.
Fourth normal form: A table is in fourth normal form only when it is in BCNF and contains not more than one multi valued. It involves the many to many relations.
Student | Skills | Hobbies |
1 | A | X |
1 | A | Y |
1 | B | X |
1 | B | Y |
2 | B | X |
2 | B | Z |
2 | C | X |
2 | C | Z |
Normalized in 4NF:
Student | Skills |
1 | A |
1 | B |
2 | B |
2 | C |
Student | Hobbies |
1 | X |
1 | Y |
2 | X |
2 | Z |
Fifth normal form: fifth normal form is also known as the project join normal form. If it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables. In 5NF table should have only candidate key and its primary key should consist of only a single column.
Seller | Company | Product |
Tom | LG | Refrigerator |
Smith | Samsung | LED |
Smith | Samsung | Washing machine |
Smith | Samsung | Air conditioner |
John | Whirlpool | Television |
John | Daikin | Split ac |
Normalized in 5NF
Seller | Company |
Tom | LG |
Smith | Samsung |
John | Whirlpool |
John | Daikin |
Company | Product |
LG | Refrigerator |
Samsung | LED |
Samsung | Washing machine |
Samsung | Air conditioner |
Whirlpool | Television |
Daikin | Split ac |
Seller | Product |
Tom | Refrigerator |
Smith | LED |
Smith | Washing machine |
Smith | Air conditioner |
John | Television |
John | Split ac |
Leave A Comment