Bucket Sort is a non-comparative sorting algorithm that works by distributing the elements of the array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm or by applying the bucket sort algorithm recursively. After sorting the buckets, the elements are combined to produce the sorted array.
Bucket Sort is particularly effective when the input is uniformly distributed over a range, and it performs well when the number of buckets is appropriately chosen relative to the number of elements.
Create Buckets: Divide the range of input values into n equally spaced intervals (buckets). The number of buckets is usually chosen to be approximately equal to the number of elements to be sorted, but this can vary depending on the specific dataset.
Distribute the Elements into Buckets: Place each element from the input array into one of the buckets. This is typically done based on the value of the element and the range of values each bucket represents.
Sort Each Bucket: Sort the elements within each bucket. This can be done using any other sorting algorithm (like Insertion Sort, Merge Sort, or Quick Sort). Often, Insertion Sort is used because it works well for small datasets and is simple to implement.
Concatenate the Sorted Buckets: Finally, after each bucket is sorted, combine the buckets into a single sorted array by concatenating them in order.
Here’s a simple pseudocode implementation for Bucket Sort:
// Function to perform bucket sort
void bucketSort(float arr[], int n) {
// Create n empty buckets
vector<float> buckets[n];
// Put array elements in different buckets
for (int i = 0; i < n; i++) {
int index = n * arr[i]; // Assuming arr[i] is in the range [0, 1)
buckets[index].push_back(arr[i]);
}
// Sort each bucket using a sorting algorithm (e.g., insertion sort)
for (int i = 0; i < n; i++) {
sort(buckets[i].begin(), buckets[i].end()); // You can use insertion sort here
}
// Concatenate all the sorted buckets
int index = 0;
for (int i = 0; i < n; i++) {
for (int j = 0; j < buckets[i].size(); j++) {
arr[index++] = buckets[i][j];
}
}
}
Let’s walk through an example of Bucket Sort with an array of floating-point numbers between 0 and 1.
Suppose we have the array:
arr = [0.42, 0.32, 0.18, 0.25, 0.63, 0.53, 0.51]
Assume we choose n = 5 buckets. We will use the range [0, 1) and divide it into 5 equal parts:
We now distribute the elements from the array into the corresponding buckets:
0.42 goes into Bucket 3.0.32 goes into Bucket 2.0.18 goes into Bucket 1.0.25 goes into Bucket 2.0.63 goes into Bucket 4.0.53 goes into Bucket 4.0.51 goes into Bucket 4.After distribution, the buckets look like this:
[0.18][0.32, 0.25][0.42][0.63, 0.53, 0.51][] (empty)Next, we sort each bucket.
[0.18] — already sorted.[0.32, 0.25] — sorted as [0.25, 0.32].[0.42] — already sorted.[0.63, 0.53, 0.51] — sorted as [0.51, 0.53, 0.63].Finally, concatenate all the sorted buckets to form the final sorted array:
arr = [0.18, 0.25, 0.32, 0.42, 0.51, 0.53, 0.63]
The time complexity of Bucket Sort depends on the following factors:
Bucket Distribution:
n is the number of elements in the array.Sorting Each Bucket:
n total elements, the sorting time for each bucket is O(k log k).Concatenation:
Thus, the overall time complexity of Bucket Sort can be described as:
k is the number of elements in each bucket (assuming elements are uniformly distributed).The space complexity of Bucket Sort is:
n is the number of elements and k is the number of buckets.n elements, and we also need additional space for the k buckets. Typically, k is proportional to n (e.g., k ≈ n), so the space complexity is O(n).Bucket Sort is an efficient and specialized sorting algorithm that is useful when the input data is uniformly distributed over a known range. It works by dividing the elements into buckets, sorting each bucket individually, and then concatenating the results. Bucket Sort performs well in certain cases, achieving linear time complexity, but its performance heavily depends on the distribution of the input and the number of buckets used. It is particularly effective for sorting floating-point numbers or data with a known range.
Open this section to load past papers