Normal Forms (NF) are a set of guidelines or rules used to design relational database schemas that minimize redundancy and dependency. The goal of using normal forms is to organize the data efficiently to avoid anomalies like update, insert, and delete anomalies, which can lead to data inconsistencies.
Each normal form addresses a different type of redundancy or anomaly. The process of achieving higher normal forms involves ensuring the relation (or table) meets certain conditions. There are several normal forms, and each normal form builds on the previous one, aiming to refine the structure of the database schema.
Let's go through each normal form and the steps necessary to achieve them.
A relation (table) is in First Normal Form (1NF) if:
| EmployeeID | Name | Phone Numbers |
|---|---|---|
| 1 | Alice | 12345, 67890 |
| 2 | Bob | 23456 |
Here, Phone Numbers is not atomic because it contains multiple values for each employee.
| EmployeeID | Name | Phone Number |
|---|---|---|
| 1 | Alice | 12345 |
| 1 | Alice | 67890 |
| 2 | Bob | 23456 |
Now, each column contains only atomic values, and the table is in 1NF.
A relation is in Second Normal Form (2NF) if:
Consider a table storing information about students, courses, and instructors:
| StudentID | CourseID | Instructor | InstructorPhone |
|---|---|---|---|
| 1 | C101 | Dr. Smith | 12345 |
| 1 | C102 | Dr. Johnson | 67890 |
| 2 | C101 | Dr. Smith | 12345 |
Here, StudentID, CourseID is the primary key, but Instructor and InstructorPhone are only dependent on CourseID, not the entire composite key. This is a partial dependency.
We need to separate the data into two tables to eliminate the partial dependency:
Student-Course Table:
| StudentID | CourseID |
|---|---|
| 1 | C101 |
| 1 | C102 |
| 2 | C101 |
Course-Details Table:
| CourseID | Instructor | InstructorPhone |
|---|---|---|
| C101 | Dr. Smith | 12345 |
| C102 | Dr. Johnson | 67890 |
Now, there are no partial dependencies, and the table is in 2NF.
A relation is in Third Normal Form (3NF) if:
| StudentID | CourseID | Instructor | InstructorPhone | Department |
|---|---|---|---|---|
| 1 | C101 | Dr. Smith | 12345 | HR |
| 1 | C102 | Dr. Johnson | 67890 | IT |
Here, InstructorPhone is dependent on Instructor, and Instructor is dependent on CourseID. Thus, InstructorPhone is transitively dependent on CourseID via Instructor.
To remove the transitive dependency, we decompose the table into two:
Student-Course Table:
| StudentID | CourseID |
|---|---|
| 1 | C101 |
| 1 | C102 |
Course-Details Table:
| CourseID | Instructor | Department |
|---|---|---|
| C101 | Dr. Smith | HR |
| C102 | Dr. Johnson | IT |
Instructor-Phone Table:
| Instructor | InstructorPhone |
|---|---|
| Dr. Smith | 12345 |
| Dr. Johnson | 67890 |
Now, the table is in 3NF because there are no transitive dependencies.
A relation is in Boyce-Codd Normal Form (BCNF) if:
Consider a table storing information about courses and instructors:
| CourseID | Instructor | Room |
|---|---|---|
| C101 | Dr. Smith | R1 |
| C102 | Dr. Johnson | R2 |
If Instructor → Room, then Instructor determines Room, but Instructor is not a superkey (since CourseID is also required to uniquely identify a record). This violates BCNF.
We decompose the table to ensure all determinants are superkeys:
Instructor-Details Table:
| Instructor | Room |
|---|---|
| Dr. Smith | R1 |
| Dr. Johnson | R2 |
Course Table:
| CourseID | Instructor |
|---|---|
| C101 | Dr. Smith |
| C102 | Dr. Johnson |
Now, the relation is in BCNF because Instructor is a superkey in the first table, and no non-key attribute determines anything other than the key.
A relation is in Fourth Normal Form (4NF) if:
Consider a table storing information about employees and their skills and languages:
| EmployeeID | Skill | Language |
|---|---|---|
| 1 | Programming | English |
| 1 | Management | Spanish |
Here, EmployeeID determines both Skill and Language, but there is no direct relationship between Skill and Language. This is a multi-valued dependency.
We split the table into two:
Employee-Skill Table:
| EmployeeID | Skill |
|---|---|
| 1 | Programming |
| 1 | Management |
Employee-Language Table:
| EmployeeID | Language |
|---|---|
| 1 | English |
| 1 | Spanish |
Now, the table is in 4NF because there are no multi-valued dependencies.
By following these normal forms, we create database schemas that are efficient, minimize redundancy, and reduce the likelihood of data anomalies.
Open this section to load past papers