ScholarQuill logoScholarQuillUniversity Notes
  • Notes
  • Past Papers
  • Blogs
  • Todo
Login
ScholarQuill logoScholarQuillUniversity Notes
Login
NotesPast PapersBlogsTodo
More
SubjectsDiscussionCGPA CalculatorGPA CalculatorStudent PortalCourse Outline
About
About usPrivacy PolicyReportContact
Notes
Past Papers
Blogs
Todo
Analytics
    Current Subject
    🧩
    Design and Analysis of Algorithms
    COMP3138
    Progress0 / 53 topics
    Topics
    1. Introduction to Algorithm Design2. Data Structures3. Efficiency in Algorithms4. Analysis of Algorithms5. Mathematical Review6. Mathematical Analysis of Algorithms7. Types of Functions8. Order of Growth9. Asymptotic Notations10. Sorting Algorithms11. Selection Sort Algorithm12. Example and Analysis of Selection Sort13. Insertion Sort Algorithm14. Divide and Conquer Algorithms15. Merge Sort Algorithm16. Quick Sort Algorithm17. Bucket Sort Algorithm18. Radix Sort Algorithm19. Counting Sort Algorithm20. Heap Sort Basics21. Heap Algorithms22. Heap Properties and Examples23. Heap Operations24. Heap Sort Algorithm Analysis25. Heap Insertion and Deletion26. Tree Based Algorithms and Hashing27. Red Black Tree Basics28. Binary Search Tree Basics29. Tree Searching Algorithms30. Analysis of Tree Searching Algorithms31. Analysis of Insertion and Deletion in BST32. Hashing Basics33. Examples of Hash Functions34. Analysis of Collision Resolution Techniques35. Dynamic Programming36. 0-1 Knapsack Problem37. Fractional Knapsack Problem38. Longest Common Subsequence39. Shortest Path Finding40. Matrix Chain Multiplication41. Assembly Line Chain Problem42. Greedy Algorithms43. Prim's Algorithm44. Kruskal's Algorithm45. Dijkstra's Algorithm46. Huffman Coding47. NP-Completeness48. Polynomial Time Verification49. Reducibility50. NP-Completeness Proofs51. Randomized Algorithms52. Particle Swarm Optimization53. Genetic Algorithms
    COMP3138›Analysis of Collision Resolution Techniques
    Design and Analysis of AlgorithmsTopic 34 of 53

    Analysis of Collision Resolution Techniques

    7 minread
    1,243words
    Intermediatelevel

    Analysis of Collision Resolution Techniques

    Collisions in hash tables occur when two different keys hash to the same index. When a collision happens, we need a mechanism to resolve it in a way that allows the hash table to still function correctly. There are several techniques for collision resolution, each with its own trade-offs in terms of time complexity, space usage, and efficiency.

    The most commonly used collision resolution techniques are:

    1. Chaining (Separate Chaining)
    2. Open Addressing (with strategies like Linear Probing, Quadratic Probing, and Double Hashing)

    Let's analyze each technique in detail, comparing them based on various factors like performance, space usage, and ease of implementation.


    1. Chaining (Separate Chaining)

    In chaining, each bucket (or slot) in the hash table holds a collection (typically a linked list or other data structure like a balanced tree or dynamic array) to store multiple key-value pairs that hash to the same index. When a collision occurs, the new element is simply added to the list at the corresponding index.

    How Chaining Works:

    1. Hash the key to find the appropriate index in the array.
    2. If the bucket at that index is empty, store the key-value pair in the bucket.
    3. If the bucket is already occupied (i.e., a collision), add the new key-value pair to the list (or another collection) at that bucket.

    Advantages of Chaining:

    • Simple and intuitive: It's relatively easy to implement, and no rehashing is required when there are collisions.
    • Handles collisions gracefully: Even if many collisions occur, the hash table size does not have to be resized as in open addressing.
    • Flexible space usage: Each bucket can grow dynamically without the need for resizing the entire hash table.

    Disadvantages of Chaining:

    • Extra space: Each bucket needs to store additional information (i.e., linked list nodes or other data structures). This leads to higher space complexity than open addressing, particularly if there are many collisions.
    • Poor performance under heavy collision: If the hash function results in many collisions (i.e., many keys end up in the same bucket), the linked list at that bucket will grow large, which leads to poor performance for search, insert, and delete operations. In the worst case, where all keys hash to the same index, the time complexity for all operations can degrade to O(n).
    • Cache locality issues: Since the elements are scattered across memory (due to the linked list), accessing them may not be cache-friendly, resulting in slower access compared to open addressing.

    Time Complexity:

    • Search: On average O(1 + α), where α is the load factor (the ratio of the number of elements to the number of buckets). In the worst case (all elements hash to the same bucket), it becomes O(n).
    • Insertion: O(1) on average. In the worst case, O(n).
    • Deletion: O(1) on average. In the worst case, O(n).

    2. Open Addressing

    In open addressing, when a collision occurs, the hash table searches for the next available slot within the array itself, without using additional data structures like linked lists. The primary methods of open addressing are:

    • Linear Probing
    • Quadratic Probing
    • Double Hashing

    How Open Addressing Works:

    1. Hash the key to find the index.
    2. If the index is occupied, search for the next available slot using a probing technique (either linear, quadratic, or double hashing).
    3. Once an empty slot is found, store the key-value pair in that slot.
    Linear Probing:
    • When a collision occurs at index i, we check the next index (i + 1) % table_size, then (i + 2) % table_size, and so on, until an empty slot is found.
    • This method is prone to clustering, where groups of contiguous slots become filled, leading to increased search times.
    Quadratic Probing:
    • Instead of checking adjacent slots in a linear fashion, quadratic probing checks slots at intervals of 1^2, 2^2, 3^2, etc. (i.e., (i + j^2) % table_size for j = 1, 2, 3, ...).
    • This reduces primary clustering but may still result in secondary clustering (collisions that happen in the same pattern for different keys).
    Double Hashing:
    • If a collision occurs, double hashing uses a second hash function to compute a new index. The formula is:
      hash2(key) = secondary_hash_function(key)
      new_index = (hash1(key) + i * hash2(key)) % table_size
      
    • This approach reduces clustering significantly but requires careful design of the second hash function.

    3. Comparison of Open Addressing Techniques

    Advantages of Open Addressing:

    • Space efficiency: Open addressing doesn't require extra space for linked lists or other structures, as it stores all elements directly in the array.
    • Better cache locality: Since all elements are stored in a single array, access to them is more cache-friendly, leading to better performance in terms of memory access.
    • Simple implementation: It is easy to implement since no extra data structures (like linked lists or trees) are required.

    Disadvantages of Open Addressing:

    • Performance degradation: When the load factor increases, the table becomes more crowded, leading to more probing and slower performance. As a result, insertion and search times can degrade significantly if the table is not resized or rehashed in time.
    • Clustering: Methods like linear probing are prone to clustering (i.e., groups of consecutive slots become filled, making further searches and insertions slow).
    • Resizing: Open addressing requires careful resizing (rehashing) of the table when the load factor exceeds a certain threshold (usually around 0.7) to maintain efficient performance.

    Time Complexity:

    • Search:

      • Linear Probing: O(1) on average. In the worst case, O(n).
      • Quadratic Probing: O(1) on average. Worst-case can still degrade to O(n) in cases of high clustering.
      • Double Hashing: O(1) on average. Worst case can degrade to O(n) but typically performs better than linear or quadratic probing under similar conditions.
    • Insertion: O(1) on average. In the worst case, O(n).

    • Deletion: O(1) on average. In the worst case, O(n).


    4. Performance Comparison

    Let's compare the performance of chaining and open addressing based on the load factor and table resizing.

    Collision Resolution Technique Time Complexity (Average) Time Complexity (Worst Case) Space Complexity
    Chaining (Separate Chaining) O(1 + α) O(n) O(n)
    Linear Probing O(1) O(n) O(n)
    Quadratic Probing O(1) O(n) O(n)
    Double Hashing O(1) O(n) O(n)
    • Chaining has the advantage that it is more resilient to high load factors and doesn't require complex probing or resizing strategies, but it uses extra space for linked lists.
    • Open addressing is more space-efficient and has better cache locality but requires resizing and can suffer from clustering if the probing strategy is not well-designed.

    Conclusion

    Each collision resolution technique has its advantages and trade-offs.

    • Chaining is easier to implement and more robust against high load factors but requires additional space to store linked lists.
    • Open addressing is space-efficient and offers better memory locality, but its performance degrades when the load factor is high or if the probing strategy results in clustering.

    The choice of collision resolution method should depend on factors like:

    • The expected load factor (how full the hash table will be).
    • The frequency of collisions.
    • The available memory.
    • The desired time complexity for operations.

    For example:

    • For a hash table with a low load factor and frequent insertions and deletions, open addressing (especially with double hashing) might be preferred.
    • For a hash table with a high load factor or when frequent rehashing isn't feasible, chaining might be a better choice.
    Previous topic 33
    Examples of Hash Functions
    Next topic 35
    Dynamic Programming

    Past Papers

    Open this section to load past papers

    Click on Show Past Papers to see past papers.
    On This Page
      Reading Stats
      Est. reading time7 min
      Word count1,243
      Code examples0
      DifficultyIntermediate