In relational databases, a key is a set of one or more attributes (columns) that uniquely identifies a tuple (row) in a relation (table). Keys are used to enforce uniqueness and establish relationships between different tables. Properly defining keys is essential for ensuring data integrity, consistency, and enabling efficient querying.
There are different types of keys in relational databases, each serving a specific purpose. Below is a detailed explanation of the key types and their roles:
A superkey is any set of attributes (columns) that uniquely identifies each tuple in a relation. A superkey may include extra attributes that are not strictly necessary to guarantee uniqueness, but still maintain the uniqueness property.
Students table, the set of attributes {StudentID, Name} could be a superkey if it ensures uniqueness. Even though StudentID alone is sufficient to uniquely identify a student, adding Name doesn't violate uniqueness but is unnecessary.Consider the Students table:
| StudentID | Name | Age | Major |
|---|---|---|---|
| 1 | John Doe | 20 | Computer Science |
| 2 | Jane Smith | 22 | Mathematics |
| 3 | Alice Lee | 19 | Biology |
{StudentID} (since StudentID alone can uniquely identify each row){StudentID, Name} (this also uniquely identifies the row, but it’s not minimal because StudentID alone is sufficient)A candidate key is a minimal superkey, meaning it is a superkey with no unnecessary attributes. In other words, if any attribute is removed from a candidate key, it will no longer uniquely identify the tuples in the relation.
Students table, {StudentID} is a candidate key because it uniquely identifies each student, and no attribute can be removed to make it smaller without losing uniqueness.In the Students table:
| StudentID | Name | Age | Major |
|---|---|---|---|
| 1 | John Doe | 20 | Computer Science |
| 2 | Jane Smith | 22 | Mathematics |
| 3 | Alice Lee | 19 | Biology |
{StudentID} (since StudentID alone can uniquely identify each student){Email} (if email addresses are unique for each student, this could also be a candidate key)In this case, both {StudentID} and {Email} are candidate keys (assuming email addresses are unique), but {StudentID} is minimal (no extra attributes), whereas {Email} is also minimal.
A primary key is a special candidate key that is selected to uniquely identify tuples in a relation. Each table can have only one primary key, and the attributes that make up the primary key cannot contain NULL values.
In the Students table:
| StudentID | Name | Age | Major |
|---|---|---|---|
| 1 | John Doe | 20 | Computer Science |
| 2 | Jane Smith | 22 | Mathematics |
| 3 | Alice Lee | 19 | Biology |
{StudentID} could be selected as the primary key because it uniquely identifies each student, and it is minimal (there are no unnecessary attributes). It is also non-null, meaning every student must have a valid StudentID.A foreign key is an attribute (or set of attributes) in one relation that refers to the primary key in another relation. It is used to establish and enforce a link between the data in two tables. A foreign key ensures referential integrity by making sure that the values in the foreign key column(s) match existing values in the referenced primary key.
Consider two tables, Students and Courses:
Students Table:| StudentID | Name |
|---|---|
| 1 | John Doe |
| 2 | Jane Smith |
Courses Table:| CourseID | CourseName | StudentID |
|---|---|---|
| 101 | Computer Science | 1 |
| 102 | Mathematics | 2 |
In the Courses table, StudentID is a foreign key that references the StudentID in the Students table. This ensures that every StudentID in the Courses table corresponds to an existing StudentID in the Students table.
A composite key is a key that consists of more than one attribute (column) to uniquely identify tuples in a relation. A composite key is used when no single attribute is sufficient to guarantee uniqueness, and a combination of attributes is required.
CourseRegistrations, neither StudentID nor CourseID alone might be sufficient to uniquely identify a record. A combination of both attributes can be used as a composite key.Consider a CourseRegistrations table:
| StudentID | CourseID | Semester |
|---|---|---|
| 1 | 101 | Spring 2024 |
| 2 | 102 | Fall 2024 |
Here, neither StudentID nor CourseID alone can be used as a primary key, but the combination of both StudentID and CourseID is a composite key that uniquely identifies each row.
A unique key is similar to a primary key in that it ensures the uniqueness of the data in a table. However, a unique key allows NULL values, while a primary key does not.
In the Students table, if the Email column is unique but allows NULL, it could be defined as a unique key.
| StudentID | Name | |
|---|---|---|
| 1 | John Doe | johndoe@email.com |
| 2 | Jane Smith | janesmith@email.com |
| 3 | Alice Lee | NULL |
In this case, the Email attribute is a unique key, but it allows a NULL value.
Keys are fundamental to ensuring data integrity, uniqueness, and relationships in relational databases. The main types of keys are:
Properly defining keys is essential for the structure and integrity of a relational database.
Open this section to load past papers