Hashing is a technique used to map data of arbitrary size (such as strings, integers, etc.) to fixed-size values using a hash function. It is widely used in various applications like hash tables, databases, caches, and encryption. The goal of hashing is to allow for efficient data retrieval, insertion, and deletion operations.
A hash function takes an input (key) and computes an integer hash value. The function should have the following characteristics:
For simplicity, a basic hash function for integers could be:
int hashFunction(int key, int tableSize) {
return key % tableSize;
}
In this example, the key is divided by the table size, and the remainder is the hash value. The remainder will be used as the index in the hash table.
The division method is one of the simplest hash functions, where the hash value is computed by dividing the key by the table size and using the remainder.
int hashFunction(int key, int tableSize) {
return key % tableSize;
}
In this method, the key is multiplied by a constant value and the result is then taken modulo the table size. A typical implementation might look like:
int hashFunction(int key, int tableSize) {
float A = 0.6180339887; // Constant between 0 and 1
return int(tableSize * (key * A - int(key * A)));
}
This method involves choosing a random hash function from a family of hash functions at runtime. The idea is to make it harder for an adversary to predict the hash function and cause a lot of collisions.
Since it's possible for multiple keys to produce the same hash value (a collision), collision resolution techniques are necessary to ensure efficient storage and retrieval.
In chaining, each bucket of the hash table stores a linked list (or another list structure) of elements. If multiple keys hash to the same bucket, they are added to the list at that bucket.
Advantages:
Disadvantages:
Code Example (Chaining):
#include <iostream>
#include <list>
using namespace std;
class HashTable {
int size;
list<int> *table;
public:
HashTable(int s) {
size = s;
table = new list<int>[size];
}
void insert(int key) {
int index = key % size;
table[index].push_back(key); // Insert key into the linked list
}
void remove(int key) {
int index = key % size;
table[index].remove(key); // Remove key from the linked list
}
bool search(int key) {
int index = key % size;
for (auto it = table[index].begin(); it != table[index].end(); ++it) {
if (*it == key)
return true;
}
return false; // Not found
}
};
int main() {
HashTable ht(10);
ht.insert(15);
ht.insert(25);
ht.insert(35);
if (ht.search(25)) {
cout << "Found 25" << endl;
} else {
cout << "25 not found" << endl;
}
ht.remove(25);
if (ht.search(25)) {
cout << "Found 25" << endl;
} else {
cout << "25 not found" << endl;
}
return 0;
}
Found 25
25 not found
In open addressing, when a collision occurs, the algorithm searches for the next available bucket in the hash table according to a probe sequence. The most common probing techniques are:
Linear Probing: If a collision occurs at index i, check index i+1, then i+2, and so on, until an empty slot is found.
Quadratic Probing: Instead of checking the next index, check i+1^2, i+2^2, i+3^2, etc., to resolve collisions.
Double Hashing: Use a second hash function to compute the probe sequence.
Code Example (Linear Probing):
#include <iostream>
using namespace std;
class HashTable {
int size;
int *table;
public:
HashTable(int s) {
size = s;
table = new int[size];
for (int i = 0; i < size; i++) {
table[i] = -1; // Initialize all elements as -1 (empty)
}
}
void insert(int key) {
int index = key % size;
while (table[index] != -1) {
index = (index + 1) % size; // Linear probing
}
table[index] = key;
}
void remove(int key) {
int index = key % size;
while (table[index] != -1) {
if (table[index] == key) {
table[index] = -1; // Mark the slot as empty
return;
}
index = (index + 1) % size;
}
}
bool search(int key) {
int index = key % size;
while (table[index] != -1) {
if (table[index] == key) {
return true;
}
index = (index + 1) % size;
}
return false;
}
};
int main() {
HashTable ht(10);
ht.insert(15);
ht.insert(25);
ht.insert(35);
if (ht.search(25)) {
cout << "Found 25" << endl;
} else {
cout << "25 not found" << endl;
}
ht.remove(25);
if (ht.search(25)) {
cout << "Found 25" << endl;
} else {
cout << "25 not found" << endl;
}
return 0;
}
Found 25
25 not found
Insertion:
Search:
Deletion:
Open this section to load past papers