Hashing and indexing are crucial concepts in data management that help improve the efficiency of data retrieval and storage.
Definition:
Hashing is a technique that converts input data of arbitrary size into a fixed-size value (a hash code), typically to enable fast data retrieval. A hash function maps each key to a specific location (or bucket) in a hash table, so a lookup can go straight to that bucket instead of scanning all stored items.
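As a rough illustration of the mapping described above, the sketch below turns a string key of any length into a bucket index. The function name bucket_index and the parameter bucket_count are hypothetical, chosen only for this example; std::hash is the standard library's hash function object, and the modulo step is one common (not the only) way to reduce a hash code to a bucket position.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Map a string key to a bucket index in the range [0, bucket_count).
// bucket_index and bucket_count are illustrative names, not part of the notes.
std::size_t bucket_index(const std::string& key, std::size_t bucket_count) {
    assert(bucket_count > 0 && "a hash table needs at least one bucket");
    // std::hash yields a fixed-size value (std::size_t) for input of any length.
    std::size_t hash_code = std::hash<std::string>{}(key);
    // Reduce the hash code to a valid bucket position.
    return hash_code % bucket_count;
}
```

Calling bucket_index with the same key and bucket count returns the same index every time within a run, which is what lets a later lookup find the bucket used at insertion. The exact index depends on the standard library's hash implementation, so no particular value is assumed here.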
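Key Components: a hash function, which computes an index from a key, and a hash table, which is the array of buckets that stores the data.
Common Operations: insert, search, and delete, each of which first hashes the key to find its bucket.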
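Collision Resolution: Collisions occur when two inputs hash to the same index. Common strategies to handle collisions include separate chaining, where each bucket holds a list of all keys that hash to it, and open addressing, where a colliding key is placed in another free slot found by probing.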
Example Implementation: Here’s a simple hash table implementation in C++ using separate chaining:
#include <iostream>
#include <list>
#include <vector>
using namespace std;

class HashTable {
private:
    vector<list<int>> table; // one list (chain) per bucket
    int size;                // number of buckets; assumed to be positive

public:
    HashTable(int s) : size(s) {
        table.resize(size);
    }

    // Hash function: map a key to a bucket index in [0, size)
    int hash(int key) const {
        // In C++, key % size is negative for negative keys,
        // so shift the remainder back into the valid range.
        return ((key % size) + size) % size;
    }

    // Insert an element at the end of its bucket's chain
    void insert(int key) {
        int index = hash(key);
        table[index].push_back(key);
    }

    // Search for an element by scanning only its bucket's chain
    bool search(int key) const {
        int index = hash(key);
        for (int x : table[index]) {
            if (x == key) {
                return true;
            }
        }
        return false;
    }

    // Remove every occurrence of an element from its bucket's chain
    void remove(int key) {
        int index = hash(key);
        table[index].remove(key);
    }
};

int main() {
    HashTable ht(10);
    ht.insert(5);
    ht.insert(15); // collides with 5: both hash to bucket 5
    cout << "Search 5: " << ht.search(5) << endl;   // Outputs: 1 (true)
    cout << "Search 10: " << ht.search(10) << endl; // Outputs: 0 (false)
    ht.remove(5);
    cout << "Search 5 after removal: " << ht.search(5) << endl; // Outputs: 0 (false)
    return 0;
}
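The implementation above resolves collisions with separate chaining. For comparison, another widely used strategy is open addressing, where a colliding key is stored in a different slot of the same array. The sketch below uses linear probing, the simplest probing scheme. The class name ProbingTable and its members are hypothetical, and deletion is deliberately left out, since removing keys from a probed table needs extra bookkeeping (such as tombstone markers) that would obscure the basic idea.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical open-addressing hash table for int keys (linear probing).
class ProbingTable {
public:
    explicit ProbingTable(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0 && "table needs at least one slot");
    }

    // Returns true if the key was stored or was already present,
    // false if every slot is occupied by other keys.
    bool insert(int key) {
        std::size_t start = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            std::size_t pos = (start + i) % slots_.size(); // step to the next slot
            if (!slots_[pos].used) {
                slots_[pos].used = true;
                slots_[pos].key = key;
                return true;
            }
            if (slots_[pos].key == key) {
                return true; // already stored
            }
        }
        return false; // table is full
    }

    // Follows the same probe sequence as insert.
    bool search(int key) const {
        std::size_t start = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            std::size_t pos = (start + i) % slots_.size();
            if (!slots_[pos].used) {
                return false; // an empty slot ends the probe sequence
            }
            if (slots_[pos].key == key) {
                return true;
            }
        }
        return false;
    }

private:
    struct Slot {
        bool used = false;
        int key = 0;
    };

    // Home slot of a key; the extra arithmetic keeps negative keys in range.
    std::size_t home(int key) const {
        long long n = static_cast<long long>(slots_.size());
        return static_cast<std::size_t>(((key % n) + n) % n);
    }

    std::vector<Slot> slots_;
};
```

With a capacity of 4, the keys 5 and 9 both have home slot 1, so 9 ends up in slot 2 and a search for 9 probes slot 1 and then slot 2. Unlike chaining, the table can fill up, which is why insert reports failure instead of growing; a production table would typically resize before that point.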
Definition:
Indexing is a technique for quickly locating and accessing data in a database or other data store. An index is a separate data structure that holds references to the actual data, so a search can consult the index instead of scanning every record.
Key Features:
Types of Indexes:
B-Trees and B+ Trees: Common data structures used for indexing in databases. They stay balanced as data changes, so searches, insertions, and deletions all take logarithmic time.
Benefits of Indexing:
Example Use Case: When searching for a record in a database, using an index can significantly reduce the number of records that need to be examined. Instead of checking each record one by one, the index allows direct access to the records of interest.
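The contrast in this use case can be sketched in a few lines. In the example below, the Record type and the functions find_by_scan, build_index, and find_by_index are hypothetical names invented for illustration. The index is a std::map from record id to position; std::map is a balanced binary search tree in typical implementations, not a B-tree, but it shows the same principle of a separate, ordered structure that points back at the data.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record type standing in for a database row.
struct Record {
    int id;
    std::string name;
};

// Without an index: check each record one by one (linear time).
const Record* find_by_scan(const std::vector<Record>& records, int id) {
    for (const Record& r : records) {
        if (r.id == id) {
            return &r;
        }
    }
    return nullptr;
}

// Build a separate index that maps each id to the record's position.
std::map<int, std::size_t> build_index(const std::vector<Record>& records) {
    std::map<int, std::size_t> index;
    for (std::size_t i = 0; i < records.size(); ++i) {
        index[records[i].id] = i;
    }
    return index;
}

// With an index: one logarithmic-time lookup, then direct access.
const Record* find_by_index(const std::vector<Record>& records,
                            const std::map<int, std::size_t>& index,
                            int id) {
    auto it = index.find(id);
    if (it == index.end()) {
        return nullptr; // no record with this id
    }
    assert(it->second < records.size() && "index must match the record list");
    return &records[it->second];
}
```

Both lookups return the same record, but find_by_index never touches the records it is not interested in. The cost is that the index takes extra memory and has to be updated whenever the records change.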
Hashing and indexing are powerful techniques for optimizing data retrieval. Hashing provides an efficient way to store and access data with constant time complexity on average, while indexing enhances search performance in databases. Understanding these concepts is crucial for developing efficient applications and managing large datasets effectively.