Normalization In Database
Normalization In Database: The process to convert the complex data structure into simple and stable data structure is called normalization.
The purpose of normalization is to produce a stable set of relations. Basically, a normalized relation has a simple structure, than an un-normalized one. By applying the normalization principles, we can improve the stability of the relations.
Normalization Steps
Normalization is often accomplished in steps, each of which corresponds to a normal form. A normal form is a state of a relation that can be determined by applying simple rules regarding dependencies to that relation.
A brief description of major steps are given as
First Normal Form (1st NF)
Any repeating group from the relation is removed. After applying this rule, each cell of the relation has a single value.
Second Normal Form (2nd NF)
Any partial functional dependency of the relation is removed.
Third Normal Form (3rd NF)
Any transitive dependency is removed.
Boyce-Cod Normal Form
Every determinant must be a candidate key.
Fourth Normal Form (4th NF)
Any multi-valued dependency is removed.
Fifth Normal Form (5TH NF)
Any remaining anomaly is removed.
The table will be solve by normalization is as follows
First Normal Form
For example the data of employees of an organization is shown as the following table
Employee ID | Name | Department | Salary | Course | Date Completed |
AC-201 | Baber | Business | 25000 | MS-Office | 06-02-2019 |
MK-101 | Ahmad | Marketing | 27000 | Quality Management | 09-09-2017 |
IT-301 | Ali | IT | 26000 | Visual Basic C++ Java | 01-02-2013 02-03-2014 04-05-2015 |
After the 1st Normal form the table will be show as
Employee ID | Name | Department | Salary | Course | Date Completed |
AC-201 | Baber | Business | 25000 | MS-Office | 06-02-2019 |
MK-101 | Ahmad | Marketing | 27000 | Quality Management | 09-09-2017 |
IT-301 | Ali | IT | 26000 | Visual Basic | 01-02-2013 |
IT-301 | Ali | IT | 26000 | C++ | 02-03-2014 |
IT-301 | Ali | IT | 26000 | Java | 04-05-2015 |
Second Normal Form
A relation in the second normal form if and only if it is in the first normal form and all the non key attributes are fully functionally dependent on the key attribute. It is an important step in normalization in database.
In this case
EMPLOYEE (ID, name, Department, salary, Course, Date Completed)
The functional dependencies in this relation are as follows
Emp ID Name, Dept, Salary
Emp ID, Course Date Completed
EMPLOYEE Relation
Employee ID | Name | Department | Salary |
AC-201 | Baber | Business | 25000 |
MK-101 | Ahmad | Marketing | 27000 |
IT-301 | Ali | IT | 26000 |
IT-301 | Ali | IT | 26000 |
IT-301 | Ali | IT | 26000 |
Course Relation
Employee ID | Course | Date Completed |
AC-201 | MS-Office | 06-02-2019 |
MK-101 | Quality Management | 09-09-2017 |
IT-301 | Visual Basic | 01-02-2013 |
IT-301 | C++ | 02-03-2014 |
IT-301 | Java | 04-05-2015 |
Third Normal Form
A relation in third normal form if it is in second normal form and no transitive dependency exist.
A third normal form also defined as A non-key attribute must not depend on any other non-key attribute of the relation.
The table of PATIENT-HISTORY (PatientID, Visit Date, Physician, Diagnosis, Treatment)
Patient ID | Visit Date | Physician | Diagnosis | Treatment |
P-100809 | 01-02-2104 | Dr Ali | Chest Infection | Free |
P-100809 | 02-03-2015 | Dr Ahmad | Cold | Free |
P-100809 | 03-04-2016 | Dr Jawad | Hepatitis-A | Paid |
P-200145 | 04-05-2017 | Dr Ahsan | Eye Infection | Paid |
P-200145 | 05-06-2018 | Dr Usman | Cough | Free |
P-200145 | 06-07-2018 | Dr Atif | Flu | Free |
Now the dependency
DIAGNOSIS (Diagnosis, Treatment)
Diagnosis | Treatment |
Chest Infection | Free |
Cold | Free |
Hepatitis-A | Paid |
Eye Infection | Paid |
Cough | Free |
Flu | Free |
Fourth Normal Form
A relation is in Fourth Normal Form if and only if it is in Boyce Coded Normal Form and has no multi-valued dependencies.
The table of MEDICINES relation.
Disease | Physician | Medicine |
Cough | Dr. Imran Dr. Tahir Dr. Asghar | Actifed-DM Panadol |
Headache | Dr. Imran Dr. Zahid | Paracetamol Panadol |
After the removing the multi-value dependency the table will be shown as
Disease | Physician | Medicine |
Cough | Dr. Imran | Actifed-DM |
Cough | Dr. Tahir | Actifed-DM |
Cough | Dr. Asghar | Actifed-DM |
Cough | Dr. Imran | Panadol |
Cough | Dr. Tahir | Panadol |
Cough | Dr. Asghar | Panadol |
Headache | Dr. Imran | Paracetamol |
Headache | Dr. Zahid | Paracetamol |
Headache | Dr. Imran | Panadol |
Headache | Dr. Zahid | Panadol |
Fifth Normal Form
A relation is in fifth normal form if it is in fourth normal form and does not have a join dependency. It is also known as Project-Join Normal form.
A relation that has a join dependency cannot be divided into two or more relations such that the resulting relations can be re-combined to form the original relation. It is final step of normalization in database.