ScholarQuill logoScholarQuillUniversity Notes
  • Notes
  • Past Papers
  • Blogs
  • Todo
Login
ScholarQuill logoScholarQuillUniversity Notes
Login
NotesPast PapersBlogsTodo
More
SubjectsDiscussionCGPA CalculatorGPA CalculatorStudent PortalCourse Outline
About
About usPrivacy PolicyReportContact
Notes
Past Papers
Blogs
Todo
Analytics
    Current Subject
    🧩
    Data Structures
    COMP2117
    Progress0 / 37 topics
    Topics
    1. Abstract Data Types2. Complexity Analysis3. Big Oh Notation4. Stacks (Linked Lists and Array Implementations)5. Recursion and analyzing recursive algorithms6. Divide and Conquer Algorithms7. Sorting Algorithms8. Selection Sort9. Insertion Sort10. Merge Sort11. Quick Sort12. Bubble Sort13. Heap Sort14. Shell Sort15. Radix Sort16. Bucket Sort17. Queue18. Dequeuer19. Priority Queues (linked and array implementations of queues)20. Linked List and Its Various Types21. Sorted Linked List22. Searching an Unsorted Array23. Binary Search for Sorted Arrays24. Hashing and Indexing25. Open Addressing and Chaining26. Trees and Tree Traversals27. Binary Search Trees28. Heaps29. M-Way Trees30. Balanced Trees31. Graphs32. Breadth-First Traversal33. Depth-First Traversal34. Topological Order35. Shortest Path36. Adjacency Matrix and Adjacency List Implementations37. Memory Management and Garbage Collection
    COMP2117›Hashing and Indexing
    Data StructuresTopic 24 of 37

    Hashing and Indexing

    4 minread
    647words
    Beginnerlevel

    Hashing and Indexing

    Hashing and indexing are crucial concepts in data management that help improve the efficiency of data retrieval and storage.

    Hashing

    Definition:
    Hashing is a technique used to convert input data of arbitrary size into a fixed-size string of characters, typically for fast data retrieval. It uses a hash function to map data to a specific location (or bucket) in a hash table.

    Key Components:

    • Hash Function: A function that takes an input and produces a fixed-size hash value. Good hash functions minimize collisions and distribute data evenly.
    • Hash Table: An array or a list where the hash values are stored. Each index corresponds to a unique hash value.

    Common Operations:

    1. Insertion: Compute the hash value for the data and place it in the corresponding index.
    2. Searching: Compute the hash value for the query and check the corresponding index.
    3. Deletion: Compute the hash value for the data and remove it from the corresponding index.

    Collision Resolution: Collisions occur when two inputs hash to the same index. Common strategies to handle collisions include:

    • Chaining: Store multiple elements at the same index using a linked list.
    • Open Addressing: Find another empty index within the hash table using techniques like linear probing, quadratic probing, or double hashing.

    Example Implementation: Here’s a simple hash table implementation in C++ using separate chaining:

    #include <iostream>
    #include <list>
    #include <vector>
    using namespace std;
    
    class HashTable {
    private:
        vector<list<int>> table;
        int size;
    
    public:
        HashTable(int s) : size(s) {
            table.resize(size);
        }
    
        // Hash function
        int hash(int key) {
            return key % size;
        }
    
        // Insert an element
        void insert(int key) {
            int index = hash(key);
            table[index].push_back(key);
        }
    
        // Search for an element
        bool search(int key) {
            int index = hash(key);
            for (int x : table[index]) {
                if (x == key) {
                    return true;
                }
            }
            return false;
        }
    
        // Remove an element
        void remove(int key) {
            int index = hash(key);
            table[index].remove(key);
        }
    };
    
    int main() {
        HashTable ht(10);
        ht.insert(5);
        ht.insert(15);
        
        cout << "Search 5: " << ht.search(5) << endl; // Outputs: 1 (true)
        cout << "Search 10: " << ht.search(10) << endl; // Outputs: 0 (false)
    
        ht.remove(5);
        cout << "Search 5 after removal: " << ht.search(5) << endl; // Outputs: 0 (false)
    
        return 0;
    }
    

    Indexing

    Definition:
    Indexing is a data structure technique used to quickly locate and access the data in a database or data structure. An index is a separate data structure that maintains a reference to the actual data and allows for faster searches.

    Key Features:

    • Types of Indexes:

      • Primary Index: Built on the primary key of a table.
      • Secondary Index: Additional indexes to improve query performance on non-primary key columns.
      • Unique Index: Ensures that no duplicate values are present in the indexed column.
    • B-Trees and B+ Trees: Common data structures used for indexing in databases, providing balanced search times and efficient insertions/deletions.

    Benefits of Indexing:

    • Faster Search Operations: By avoiding full table scans, indexing speeds up query execution.
    • Efficient Sorting: Indexes can also facilitate quick sorting of records.

    Example Use Case: When searching for a record in a database, using an index can significantly reduce the number of records that need to be examined. Instead of checking each record one by one, the index allows direct access to the records of interest.

    Summary

    Hashing and indexing are powerful techniques for optimizing data retrieval. Hashing provides an efficient way to store and access data with constant time complexity on average, while indexing enhances search performance in databases. Understanding these concepts is crucial for developing efficient applications and managing large datasets effectively.

    Previous topic 23
    Binary Search for Sorted Arrays
    Next topic 25
    Open Addressing and Chaining

    Past Papers

    Open this section to load past papers

    Click on Show Past Papers to see past papers.
    On This Page
      Reading Stats
      Est. reading time4 min
      Word count647
      Code examples0
      DifficultyBeginner