Bucket sort efficiently organizes elements by distributing them into distinct buckets, facilitating subsequent sorting using alternative algorithms. This technique, pivotal in software engineering interviews, categorizes data into buckets based on uniform distribution. Once grouped, each bucket undergoes sorting, and the sorted elements are then amalgamated into a cohesive, ordered sequence. Bucket sort’s approach streamlines sorting processes, ensuring effective data arrangement through systematic bucket allocation and subsequent organization.
Uniformly Distributed Data
Uniformly distributed data
has difference between every adjacent element almost the same.- Examples and graph to represent uniformly distributed data.
An array is said to be uniformly distributed if the difference between each adjacent element inside the array is almost the same. Or, in other terms, the elements of the array spread evenly throughout the whole span/range of the array.
For example:
- Consider this array:
[10, 21, 29, 41, 52]
The difference between each adjacent term is almost equal to10
. Hence, this array has uniformly distributed data and can be sorted using the bucket sort algorithm. - Consider another array:
[1,4,23,5,44,9,6,43]
This array is not uniformly distributed because the number of elements between the range[0-10] = 5
(i.e.,1,4,5,9, and 6
), whereas the number of elements between[10-20]
is0
, and the same for the range[30-40]
.
The data is scattered over different data ranges but is concentrated within some specific ranges, whereas sparse among the others. Hence, this array is not uniformly distributed.
This is how a uniform distribution data looks like:
The complete data is scattered equally within each range and hence depicts uniformly distributed data.
Scatter-Gather-Approach
Solving large and complex problems can be a little hard at times. Scatter-gather-approach tries to simplify such complex problems by first scattering the complete data into clusters, solving these sub-problems individually, and finally gathering them all together to get the final result.
This is how Bucket sort uses scatter-gather-approach:
Bucket Sort Algorithm
- Works on Uniformly Distributed data
- Follows Scatter-Gather-Approach
- Sorting technique with time complexity
O(n)
Bucket sort
is a sorting technique that uses the Scatter-Gather-Approach to sort the array.
It divides the unsorted array into separate groups and calls them buckets. Sort the individual buckets, and then gather them all together to form the final sorted array.
- If the elements of the array are floats ranging between
0
and1
, we primarily make10
buckets, numbered from0
to9
, and then insert elements into these buckets depending upon their most significant number.
The bucket index is calculated as:(int) (elementNumber * 10)
- If the elements of the array are integers, as shown in the figure below, we simply calculate the range:
range = (maximumNumber - minimumNumber) / noOfBuckets
and divide the whole range into buckets and then perform bucket sorting.
Graphically, this is how the bucket sort with integer elements look like:
:::
:::section{.main}
Working of Bucket Sort
Let us understand how the algorithm works with the help of the above example. Consider this unsorted array
Step 1:
Since we have the array with integer elements, we’ll calculate the range first.
maximumElement = 24
minimumElement = 1
noOfBuckets = 5 (will be given as parameter)
range = (int)(24 - 1) / 5
= 4
Hence, we’ll have buckets of the following ranges:
[1-5] [6-10] [11-15] [16-20] [21-25]
Step 2: Scatter
Now, iterate through the unsorted array and keep inserting the numbers in the bucket of their corresponding range.
For example:
array[0] = 11, Range = [11-15]
array[1] = 9, Range = [6-10]
array[2] = 21, Range = [21-25]
array[3] = 8, Range = [6-10]
array[4] = 17, Range = [16-20]
and so on..
Finally, the buckets will look like this:
Step 3:
Now sort all the elements in each of the buckets, and the sorted buckets look like this:
Step 4: Gather
At last, visit each bucket and gather all the numbers together. Merge them all, and we’ll get the sorted array.
Hence, the sorted array is:
PseudeCode
bucketSorting(arr, n):
Create n empty buckets
loop through each element of the arr and do the following
Calculate bucketIndex
Insert the element into the corresponding bucket number
3) Sort the individual buckets
4) Gather all the elements together
Hence, we have the sorted array.
Bucket Sort Algorithm for floating points numbers
Below written is the complete code for bucket sort.
At first, we create a vector, bucket and then create n buckets inside this vector. The whole array is then traversed, and elements are placed in the bucket corresponding to their range.
JAVA Code
import java.util.*;
import java.util.Collections;
class Main {
// JAVA program to perform bucket sorting
// on array of size n
static void bucketSorting(float array[], int n)
{
if (n <= 0)
return;
// 1) Create n empty buckets
Vector<Float>[] buckets = new Vector[n];
for (int i = 0; i < n; i++) {
buckets[i] = new Vector<Float>();
}
// 2) Put array elements in different buckets
for (int i = 0; i < n; i++) {
float idx = array[i] * n;
buckets[(int)idx].add(array[i]);
}
// 3) Sort individual buckets
for (int i = 0; i < n; i++) {
Collections.sort(buckets[i]);
}
// 4) Gather all buckets together into array[]
int arrayIndex = 0;
for (int i = 0; i < n; i++) {
int bucketSize = buckets[i].size();
for (int k = 0; k < bucketSize; k++) {
array[arrayIndex ++] = buckets[i].get(k);
}
}
}
// Driver code
public static void main(String args[])
{
float array[] = { (float)0.42, (float)0.32,
(float)0.35, (float)0.52,
(float)0.39, (float)0.47,
(float)0.50};
int n = array.length;
bucketSorting(array, n);
System.out.println("Sorted Array is: ");
for (float element : array) {
System.out.print(element + " ");
}
}
}
Input: [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
no. of buckets: 5
Output:
Sorted Array is: [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
Note
: This applies if the elements of the array range between 0
and 1
. What if the elements are integers? In that case, we’ll set the range of each bucket a little differently, which we will learn in the next section.
Following images show the working of code with the inputs taken above:
Bucket Sorting for Integer elements
- Find range =
(max - min) / noOfBuckets
- bucketIndex =
(array[i] - minimum) / range
- Finally, sort and gather
For integer elements, we need to follow the following algorithm:
- Calculate the maximum and the minimum element of the array
- Calculate the range:
range = (maximum - minimum) / n
,
where n is the number of buckets (Given as parameter) - Create n empty buckets and initialize them with 0
- Loop through the unsorted array and perform the following:
a) Calculate bucketIndexbucketIndex = (array[i] - minimum) / range
b) Insert the ith element of the array into the bucket[bucketIndex] - Sort the individual buckets
- Gather all the elements together
Hence, we have the sorted array with us. This is how it works:
Python code – Bucket Sort with Integer elements
# Python program for numbers having integer part
def bucketSorting(array, noOfBuckets):
maxElement = max(array)
minElement = min(array)
# Calculate Range
Range = (maxElement - minElement) / noOfBuckets
temp = []
# create n(noOfBuckets) empty buckets
for i in range(noOfBuckets):
temp.append([])
# Insert elments of array in corresponding buckets
for i in range(len(array)):
diff = (array[i] - minElement) / Range - int((array[i] - minElement) / Range)
if(diff == 0 and array[i] != minElement):
temp[int((array[i] - minElement) / Range) - 1].append(array[i])
else:
temp[int((array[i] - minElement) / Range)].append(array[i])
# Sort the buckets individually
for i in range(len(temp)):
if len(temp[i]) != 0:
temp[i].sort()
# Gather elements from all the buckets to get sorted array
k = 0
for lst in temp:
if lst:
for i in lst:
array[k] = i
k = k+1
# Driver Code
array = [11, 9, 21, 8, 17, 19, 13, 1, 24, 12]
noOfBuckets = 5
bucketSorting(array, noOfBuckets)
print("Sorted Array after Bucket Sorting is: ", array)
Output
Input: [11, 9, 21, 8, 17, 19, 13, 1, 24, 12]
no. of buckets: 5
Output:
Sorted Array after Bucket Sorting is:
1, 8, 9, 11, 12, 13, 17, 19, 21, 24
Bucket Sort time complexity
- Best Case Time Complexity:
O(n+k)
- Average Case Time Complexity:
O(n)
- Worst Case Time Complexity:
O(n^2^)
Best Case Time Complexity:
If the array elements are uniformly distributed, bucket size will almost be the same for all the buckets. Hence, this will be the best case which will take up the least amount of time.
Sorting time complexity will reduce even further if all the elements inside each bucket are already sorted.
To create n buckets and scatter each element from the array, time complexity = O(n)
. If we use Insertion sort to sort each bucket, time complexity = O(k)
. Hence, best case time complexity for bucket sort = O(n+k)
,
wheren
= number of elements, andk
= number of buckets
Worst Case Time Complexity
If the array elements are not uniformly distributed, i.e., elements are concentrated within specific ranges.
This will result in one or more buckets having more elements than other buckets, making bucket sort like any other sorting technique, where every element is compared to the other.
Time complexity increases even further if the elements in the array are present in the reverse order. If insertion sort is used, the worst-case time complexity can go up to O(n^2^).
Bucket Sort Space Complexity
- Space Complexity :
O(n+k)
Space Complexity for bucket sort is O(n+k)
, wheren
= number of elements in the array, andk
= number of buckets formed
Space taken by each bucket is O(k)
, and inside each bucket, we have n elements scattered. Hence, the space complexity becomes O(n+k)
.
Applications
- Uniformly Distributed data
- Floating point numbers between range
0.0
to `1.0
Bucket Sort
is a very different type of sorting algorithm as it does not involve direct comparison between the numbers. It can only be applied to uniformly distributed data.- Whenever we have floating-point numbers between
0
and1
, bucket sort might be the best sorting approach. Reason – if we use merge-sort, quick-sort, heap-sort, etc, the problem will take a minimum ofO(nlogn)
time complexity. Also, counting sort cannot be used because floating point numbers cannot be used as index. Hence, bucket sort is best suited for sorting the array with floating point numbers of range[0.0-1.0]
.
Supercharge Your Coding Skills! Enroll Now in Our DSA Course and Master Algorithmic Excellence.
Conclusion
- The
bucket Sort
algorithm sorts the elements of the array by first segregating the array into a number of buckets, sorting each bucket, and then gathering the elements back to form the sorted array. Bucket Sort
is used to sort an array where elements are uniformly distributed, or where the elements of the array range between0
and1
.Bucket sort
can exhibit the best case time complexity ofO(n+k)
, wheren
is the number of buckets andk
is the bucket size.- The way buckets are provided ranges differ in cases when the elements of the array are floats and integers. This is discussed in detail above.