Normal Forms (NF) are a set of guidelines or rules used in database design to ensure that a database is structured efficiently, with minimal redundancy and maximum data integrity. The process of normalizing a database involves breaking down complex tables into simpler ones, ensuring that data dependencies are properly managed, and avoiding potential anomalies.
Normalization aims to achieve different normal forms (1NF, 2NF, 3NF, BCNF, and so on), each of which builds upon the previous one. By applying these rules step-by-step, we ensure that the database is free from unnecessary duplication and maintains logical consistency.
A table is in First Normal Form (1NF) if it satisfies the following conditions:
Consider the following table:
| StudentID | Name | Courses |
|---|---|---|
| 101 | Alice | Math, Science |
| 102 | Bob | English, Math |
This table is not in 1NF because the "Courses" column contains multiple values for a single student (e.g., "Math, Science").
To bring it into 1NF, we split the multiple values in the "Courses" column into separate rows:
| StudentID | Name | Course |
|---|---|---|
| 101 | Alice | Math |
| 101 | Alice | Science |
| 102 | Bob | English |
| 102 | Bob | Math |
Now, each field contains only atomic values, and the table is in 1NF.
A table is in Second Normal Form (2NF) if it satisfies the following conditions:
Consider the following table with a composite primary key (StudentID, CourseID):
| StudentID | CourseID | Instructor | Grade |
|---|---|---|---|
| 101 | Math | Mr. Smith | A |
| 101 | Science | Mrs. Brown | B |
| 102 | Math | Mr. Smith | A |
Here, (StudentID, CourseID) is the primary key, but Instructor depends only on CourseID, not on the entire composite key. This is a partial dependency.
To bring this table into 2NF, we separate the data into two tables: one for the student-course relationships and one for the course-instructor relationships.
Student-Course Table (2NF):
| StudentID | CourseID | Grade |
|---|---|---|
| 101 | Math | A |
| 101 | Science | B |
| 102 | Math | A |
Course Table:
| CourseID | Instructor |
|---|---|
| Math | Mr. Smith |
| Science | Mrs. Brown |
Now, there are no partial dependencies, and the tables are in 2NF.
A table is in Third Normal Form (3NF) if it satisfies the following conditions:
Consider the following table:
| EmployeeID | Name | Department | DepartmentLocation |
|---|---|---|---|
| 101 | Alice | HR | Building A |
| 102 | Bob | IT | Building B |
In this case, DepartmentLocation depends on Department, which in turn depends on EmployeeID. This is a transitive dependency.
To bring the table into 3NF, we separate the data into two tables: one for the employees and their departments, and another for department locations.
Employee Table:
| EmployeeID | Name | Department |
|---|---|---|
| 101 | Alice | HR |
| 102 | Bob | IT |
Department Table:
| Department | DepartmentLocation |
|---|---|
| HR | Building A |
| IT | Building B |
Now, there are no transitive dependencies, and the tables are in 3NF.
A table is in Boyce-Codd Normal Form (BCNF) if it satisfies the following conditions:
Consider the following table:
| StudentID | CourseID | Instructor |
|---|---|---|
| 101 | Math | Mr. Smith |
| 101 | Science | Mrs. Brown |
| 102 | Math | Mr. Smith |
In this case, (StudentID, CourseID) is the primary key. However, the functional dependency CourseID → Instructor violates BCNF because CourseID is not a superkey but still determines the Instructor.
To bring this table into BCNF, we separate it into two tables: one for the course-instructor relationship and one for the student-course relationship.
Course Table (BCNF):
| CourseID | Instructor |
|---|---|
| Math | Mr. Smith |
| Science | Mrs. Brown |
Student-Course Table:
| StudentID | CourseID |
|---|---|
| 101 | Math |
| 101 | Science |
| 102 | Math |
Now, the table is in BCNF because in each functional dependency, the determinant is a superkey.
A table is in Fourth Normal Form (4NF) if it satisfies the following conditions:
Consider the following table:
| EmployeeID | Skill | Language |
|---|---|---|
| 101 | Java | English |
| 101 | Python | Spanish |
| 102 | SQL | French |
In this case, EmployeeID determines both Skill and Language, but these attributes are independent of each other. This creates a multi-valued dependency.
To bring this into 4NF, we separate the attributes into two tables, removing the multi-valued dependency.
Employee-Skill Table (4NF):
| EmployeeID | Skill |
|---|---|
| 101 | Java |
| 101 | Python |
| 102 | SQL |
Employee-Language Table (4NF):
| EmployeeID | Language |
|---|---|
| 101 | English |
| 101 | Spanish |
| 102 | French |
Now, the tables are in 4NF as the multi-valued dependency has been eliminated.
A table is in Fifth Normal Form (5NF) if it satisfies the following conditions:
5NF is rarely encountered in practical database design and typically applies to very complex scenarios involving intricate data relationships.
Normalization through normal forms is a crucial process in database design. Each normal form addresses specific types of redundancy and ensures that the database structure is optimized for integrity, consistency, and minimal data duplication. While normalization helps achieve an efficient database, it is important to strike a balance with performance requirements since highly normalized tables may require more complex queries or join operations, which can impact query performance.
Open this section to load past papers