COMP3138›Bucket Sort Algorithm

Design and Analysis of AlgorithmsTopic 17 of 53

Bucket Sort Algorithm

8 minread

1,286words

Intermediatelevel

Bucket Sort Algorithm

Bucket Sort is a non-comparative sorting algorithm that works by distributing the elements of the array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm or by applying the bucket sort algorithm recursively. After sorting the buckets, the elements are combined to produce the sorted array.

Bucket Sort is particularly effective when the input is uniformly distributed over a range, and it performs well when the number of buckets is appropriately chosen relative to the number of elements.

Steps in Bucket Sort

Create Buckets: Divide the range of input values into n equally spaced intervals (buckets). The number of buckets is usually chosen to be approximately equal to the number of elements to be sorted, but this can vary depending on the specific dataset.
Distribute the Elements into Buckets: Place each element from the input array into one of the buckets. This is typically done based on the value of the element and the range of values each bucket represents.
Sort Each Bucket: Sort the elements within each bucket. This can be done using any other sorting algorithm (like Insertion Sort, Merge Sort, or Quick Sort). Often, Insertion Sort is used because it works well for small datasets and is simple to implement.
Concatenate the Sorted Buckets: Finally, after each bucket is sorted, combine the buckets into a single sorted array by concatenating them in order.

Bucket Sort Pseudocode

Here’s a simple pseudocode implementation for Bucket Sort:

// Function to perform bucket sort
void bucketSort(float arr[], int n) {
    // Create n empty buckets
    vector<float> buckets[n];

    // Put array elements in different buckets
    for (int i = 0; i < n; i++) {
        int index = n * arr[i];  // Assuming arr[i] is in the range [0, 1)
        buckets[index].push_back(arr[i]);
    }

    // Sort each bucket using a sorting algorithm (e.g., insertion sort)
    for (int i = 0; i < n; i++) {
        sort(buckets[i].begin(), buckets[i].end());  // You can use insertion sort here
    }

    // Concatenate all the sorted buckets
    int index = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < buckets[i].size(); j++) {
            arr[index++] = buckets[i][j];
        }
    }
}

Example of Bucket Sort

Let’s walk through an example of Bucket Sort with an array of floating-point numbers between 0 and 1.

Suppose we have the array:

arr = [0.42, 0.32, 0.18, 0.25, 0.63, 0.53, 0.51]

Step 1: Create Buckets

Assume we choose n = 5 buckets. We will use the range [0, 1) and divide it into 5 equal parts:

Bucket 1: [0.00, 0.20)
Bucket 2: [0.20, 0.40)
Bucket 3: [0.40, 0.60)
Bucket 4: [0.60, 0.80)
Bucket 5: [0.80, 1.00)

Step 2: Distribute Elements into Buckets

We now distribute the elements from the array into the corresponding buckets:

0.42 goes into Bucket 3.
0.32 goes into Bucket 2.
0.18 goes into Bucket 1.
0.25 goes into Bucket 2.
0.63 goes into Bucket 4.
0.53 goes into Bucket 4.
0.51 goes into Bucket 4.

After distribution, the buckets look like this:

Bucket 1: [0.18]
Bucket 2: [0.32, 0.25]
Bucket 3: [0.42]
Bucket 4: [0.63, 0.53, 0.51]
Bucket 5: [] (empty)

Step 3: Sort Each Bucket

Next, we sort each bucket.

Bucket 1: [0.18] — already sorted.
Bucket 2: [0.32, 0.25] — sorted as [0.25, 0.32].
Bucket 3: [0.42] — already sorted.
Bucket 4: [0.63, 0.53, 0.51] — sorted as [0.51, 0.53, 0.63].

Step 4: Concatenate the Sorted Buckets

Finally, concatenate all the sorted buckets to form the final sorted array:

arr = [0.18, 0.25, 0.32, 0.42, 0.51, 0.53, 0.63]

Time Complexity of Bucket Sort

The time complexity of Bucket Sort depends on the following factors:

Bucket Distribution:
- The time to distribute the elements into the buckets is O(n), where n is the number of elements in the array.
Sorting Each Bucket:
- If each bucket is sorted using a comparison-based sorting algorithm like Merge Sort or Quick Sort, the time complexity for sorting each bucket depends on the number of elements in the bucket. If the average number of elements in each bucket is k, and there are n total elements, the sorting time for each bucket is O(k log k).
- If we use Insertion Sort (which is efficient for small buckets), the sorting time for each bucket becomes O(k²) in the worst case.
- In the best case, if the elements are uniformly distributed and each bucket contains roughly n / k elements, the time complexity for sorting the buckets can be approximated as O(n).
Concatenation:
- Concatenating the buckets together is O(n), as we just need to copy each element from the buckets back into the original array.

Thus, the overall time complexity of Bucket Sort can be described as:

Best case: O(n + k), where k is the number of elements in each bucket (assuming elements are uniformly distributed).
Average case: O(n + n²/k + k log k), depending on the bucket sorting algorithm used.
Worst case: O(n²), if all elements end up in a single bucket and the bucket is sorted using a comparison-based algorithm.

Space Complexity of Bucket Sort

The space complexity of Bucket Sort is:

O(n + k), where n is the number of elements and k is the number of buckets.
We need space for the n elements, and we also need additional space for the k buckets. Typically, k is proportional to n (e.g., k ≈ n), so the space complexity is O(n).

Advantages of Bucket Sort

Efficient for Uniformly Distributed Data: If the input elements are uniformly distributed, Bucket Sort can be very efficient. Its time complexity can approach O(n) under optimal conditions.
In-Place Sorting Option: The algorithm can be adapted to sort elements in-place if necessary.
Linear Time in Best Case: Bucket Sort can achieve O(n) time complexity when the elements are uniformly distributed and each bucket is small.

Disadvantages of Bucket Sort

Assumes Uniform Distribution: The algorithm assumes that the data is uniformly distributed across a known range. If this assumption does not hold, Bucket Sort might perform poorly.
Extra Space: Bucket Sort requires additional space for the buckets, which may be a drawback if memory is limited.
Choice of Bucket Size: The performance heavily depends on the choice of the number of buckets and the distribution of the input data.

Use Cases of Bucket Sort

Uniformly Distributed Data: Bucket Sort works best when the input data is uniformly distributed over a range. For example, sorting floating-point numbers in the range [0, 1) or sorting decimal numbers can be ideal for Bucket Sort.
External Sorting: Bucket Sort can be used for external sorting algorithms, where data is too large to fit into memory and needs to be processed in chunks (buckets).

Conclusion

Bucket Sort is an efficient and specialized sorting algorithm that is useful when the input data is uniformly distributed over a known range. It works by dividing the elements into buckets, sorting each bucket individually, and then concatenating the results. Bucket Sort performs well in certain cases, achieving linear time complexity, but its performance heavily depends on the distribution of the input and the number of buckets used. It is particularly effective for sorting floating-point numbers or data with a known range.

Past Papers

Open this section to load past papers

Click on Show Past Papers to see past papers.

// Function to perform bucket sort void bucketSort(float arr[], int n) { // Create n empty buckets vector<float> buckets[n]; // Put array elements in different buckets for (int i = 0; i < n; i++) { int index = n * arr[i]; // Assuming arr[i] is in the range [0, 1) buckets[index].push_back(arr[i]); } // Sort each bucket using a sorting algorithm (e.g., insertion sort) for (int i = 0; i < n; i++) { sort(buckets[i].begin(), buckets[i].end()); // You can use insertion sort here } // Concatenate all the sorted buckets int index = 0; for (int i = 0; i < n; i++) { for (int j = 0; j < buckets[i].size(); j++) { arr[index++] = buckets[i][j]; } } }