Rashita Mehta

Bucket Sort Algorithm

Bucket sort efficiently organizes elements by distributing them into distinct buckets, facilitating subsequent sorting using alternative algorithms. This technique, pivotal in software engineering interviews, categorizes data into buckets based on uniform distribution. Once grouped, each bucket undergoes sorting, and the sorted elements are then amalgamated into a cohesive, ordered sequence. Bucket sort’s approach streamlines sorting processes, ensuring effective data arrangement through systematic bucket allocation and subsequent organization.

Uniformly Distributed Data

  • Uniformly distributed data has difference between every adjacent element almost the same.
  • Examples and graph to represent uniformly distributed data.

An array is said to be uniformly distributed if the difference between each adjacent element inside the array is almost the same. Or, in other terms, the elements of the array spread evenly throughout the whole span/range of the array.

For example:

  1. Consider this array: [10, 21, 29, 41, 52]
    The difference between each adjacent term is almost equal to 10. Hence, this array has uniformly distributed data and can be sorted using the bucket sort algorithm.
  2. Consider another array: [1,4,23,5,44,9,6,43]
    This array is not uniformly distributed because the number of elements between the range [0-10] = 5 (i.e., 1,4,5,9, and 6), whereas the number of elements between [10-20] is 0, and the same for the range [30-40].
    The data is scattered over different data ranges but is concentrated within some specific ranges, whereas sparse among the others. Hence, this array is not uniformly distributed.

This is how a uniform distribution data looks like:
The complete data is scattered equally within each range and hence depicts uniformly distributed data.

what is bucket sort

Scatter-Gather-Approach

Solving large and complex problems can be a little hard at times. Scatter-gather-approach tries to simplify such complex problems by first scattering the complete data into clusters, solving these sub-problems individually, and finally gathering them all together to get the final result.
This is how Bucket sort uses scatter-gather-approach:

Bucket sort scatter-gather-approach

Bucket Sort Algorithm

  • Works on Uniformly Distributed data
  • Follows Scatter-Gather-Approach
  • Sorting technique with time complexity O(n)

Bucket sort is a sorting technique that uses the Scatter-Gather-Approach to sort the array.

It divides the unsorted array into separate groups and calls them buckets. Sort the individual buckets, and then gather them all together to form the final sorted array.

  • If the elements of the array are floats ranging between 0 and 1, we primarily make 10 buckets, numbered from 0 to 9, and then insert elements into these buckets depending upon their most significant number.
    The bucket index is calculated as: (int) (elementNumber * 10)
  • If the elements of the array are integers, as shown in the figure below, we simply calculate the range:
    range = (maximumNumber - minimumNumber) / noOfBuckets
    and divide the whole range into buckets and then perform bucket sorting.

Graphically, this is how the bucket sort with integer elements look like:

bucket sort with integer elements
:::
:::section{.main}

Working of Bucket Sort

Let us understand how the algorithm works with the help of the above example. Consider this unsorted array

bucket sort unsorted array

Step 1:

Since we have the array with integer elements, we’ll calculate the range first.

maximumElement = 24
minimumElement = 1
noOfBuckets = 5 (will be given as parameter)
range = (int)(24 - 1) / 5
= 4

Hence, we’ll have buckets of the following ranges:
[1-5] [6-10] [11-15] [16-20] [21-25]

bucket sort range

Step 2: Scatter

Now, iterate through the unsorted array and keep inserting the numbers in the bucket of their corresponding range.

For example:

array[0] = 11, Range = [11-15]
array[1] = 9, Range = [6-10]
array[2] = 21, Range = [21-25]
array[3] = 8, Range = [6-10]
array[4] = 17, Range = [16-20]

and so on..

Finally, the buckets will look like this:

Bucket Sort with unsorted array

Step 3:

Now sort all the elements in each of the buckets, and the sorted buckets look like this:

sorted buckets

Step 4: Gather

At last, visit each bucket and gather all the numbers together. Merge them all, and we’ll get the sorted array.

Hence, the sorted array is:
Bucket sorted array

PseudeCode

bucketSorting(arr, n):
Create n empty buckets
loop through each element of the arr and do the following
Calculate bucketIndex
Insert the element into the corresponding bucket number
3) Sort the individual buckets
4) Gather all the elements together

Hence, we have the sorted array.

Bucket Sort Algorithm for floating points numbers

Below written is the complete code for bucket sort.

At first, we create a vector, bucket and then create n buckets inside this vector. The whole array is then traversed, and elements are placed in the bucket corresponding to their range.

JAVA Code

import java.util.*;
import java.util.Collections;

class Main {

    // JAVA program to perform bucket sorting
    // on array of size n
    static void bucketSorting(float array[], int n)
    {
        if (n <= 0)
            return;

        // 1) Create n empty buckets
        Vector<Float>[] buckets = new Vector[n];

        for (int i = 0; i < n; i++) {
            buckets[i] = new Vector<Float>();
        }

        // 2) Put array elements in different buckets
        for (int i = 0; i < n; i++) {
            float idx = array[i] * n;
            buckets[(int)idx].add(array[i]);
        }

        // 3) Sort individual buckets
        for (int i = 0; i < n; i++) {
            Collections.sort(buckets[i]);
        }

        // 4) Gather all buckets together into array[]
        int arrayIndex = 0;
        for (int i = 0; i < n; i++) {
            int bucketSize = buckets[i].size();
            for (int k = 0; k < bucketSize; k++) {
                array[arrayIndex ++] = buckets[i].get(k);
            }
        }
    }

    // Driver code
    public static void main(String args[])
    {
        float array[] = { (float)0.42, (float)0.32,
                        (float)0.35, (float)0.52,
                        (float)0.39, (float)0.47,
                        (float)0.50};

        int n = array.length;
        bucketSorting(array, n);

        System.out.println("Sorted Array is: ");
        for (float element : array) {
            System.out.print(element + " ");
        }
    }
}
Input: [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]

no. of buckets: 5

Output:

Sorted Array is: [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]

Note: This applies if the elements of the array range between 0 and 1. What if the elements are integers? In that case, we’ll set the range of each bucket a little differently, which we will learn in the next section.

Following images show the working of code with the inputs taken above:

working of bucket sort code
working of bucket sort coding
bucket sort working coding
bucket sort working coding inputs

Bucket Sorting for Integer elements

  • Find range = (max - min) / noOfBuckets
  • bucketIndex = (array[i] - minimum) / range
  • Finally, sort and gather

For integer elements, we need to follow the following algorithm:

  1. Calculate the maximum and the minimum element of the array
  2. Calculate the range:
    range = (maximum - minimum) / n,
    where n is the number of buckets (Given as parameter)
  3. Create n empty buckets and initialize them with 0
  4. Loop through the unsorted array and perform the following:
    a) Calculate bucketIndex
    bucketIndex = (array[i] - minimum) / range
    b) Insert the ith element of the array into the bucket[bucketIndex]
  5. Sort the individual buckets
  6. Gather all the elements together

Hence, we have the sorted array with us. This is how it works:

Python code – Bucket Sort with Integer elements

    # Python program for numbers having integer part
    def bucketSorting(array, noOfBuckets):
        maxElement = max(array)
        minElement = min(array)
    
        # Calculate Range
        Range = (maxElement - minElement) / noOfBuckets
    
        temp = []
    
        # create n(noOfBuckets) empty buckets
        for i in range(noOfBuckets):
            temp.append([])
    
        # Insert elments of array in corresponding buckets
        for i in range(len(array)):
            diff = (array[i] - minElement) / Range - int((array[i] - minElement) / Range)
    
            if(diff == 0 and array[i] != minElement):
                temp[int((array[i] - minElement) / Range) - 1].append(array[i])
    
            else:
                temp[int((array[i] - minElement) / Range)].append(array[i])
    
        # Sort the buckets individually
        for i in range(len(temp)):
            if len(temp[i]) != 0:
                temp[i].sort()
    
        # Gather elements from all the buckets to get sorted array
        k = 0
        for lst in temp:
            if lst:
                for i in lst:
                    array[k] = i
                    k = k+1
    
    
    # Driver Code
    array = [11, 9, 21, 8, 17, 19, 13, 1, 24, 12]
    noOfBuckets = 5
    bucketSorting(array, noOfBuckets)
    print("Sorted Array after Bucket Sorting is: ", array)

    Output

    Input: [11, 9, 21, 8, 17, 19, 13, 1, 24, 12]
    no. of buckets: 5
    
    Output:
    Sorted Array after Bucket Sorting is:
    1, 8, 9, 11, 12, 13, 17, 19, 21, 24

    Bucket Sort time complexity

    • Best Case Time Complexity: O(n+k)
    • Average Case Time Complexity: O(n)
    • Worst Case Time Complexity: O(n^2^)

    Best Case Time Complexity:

    If the array elements are uniformly distributed, bucket size will almost be the same for all the buckets. Hence, this will be the best case which will take up the least amount of time.

    Sorting time complexity will reduce even further if all the elements inside each bucket are already sorted.

    To create n buckets and scatter each element from the array, time complexity = O(n). If we use Insertion sort to sort each bucket, time complexity = O(k). Hence, best case time complexity for bucket sort = O(n+k),
    where
    n = number of elements, and
    k = number of buckets

    Worst Case Time Complexity

    If the array elements are not uniformly distributed, i.e., elements are concentrated within specific ranges.

    This will result in one or more buckets having more elements than other buckets, making bucket sort like any other sorting technique, where every element is compared to the other.
    Time complexity increases even further if the elements in the array are present in the reverse order. If insertion sort is used, the worst-case time complexity can go up to O(n^2^).

    Bucket Sort Space Complexity

    • Space Complexity : O(n+k)

    Space Complexity for bucket sort is O(n+k), where
    n = number of elements in the array, and
    k = number of buckets formed
    Space taken by each bucket is O(k), and inside each bucket, we have n elements scattered. Hence, the space complexity becomes O(n+k).

    Applications

    • Uniformly Distributed data
    • Floating point numbers between range 0.0 to `1.0
    1. Bucket Sort is a very different type of sorting algorithm as it does not involve direct comparison between the numbers. It can only be applied to uniformly distributed data.
    2. Whenever we have floating-point numbers between 0 and 1, bucket sort might be the best sorting approach. Reason – if we use merge-sort, quick-sort, heap-sort, etc, the problem will take a minimum of O(nlogn) time complexity. Also, counting sort cannot be used because floating point numbers cannot be used as index. Hence, bucket sort is best suited for sorting the array with floating point numbers of range [0.0-1.0].

    Supercharge Your Coding Skills! Enroll Now in Our DSA Course and Master Algorithmic Excellence.

    Conclusion

    • The bucket Sort algorithm sorts the elements of the array by first segregating the array into a number of buckets, sorting each bucket, and then gathering the elements back to form the sorted array.
    • Bucket Sort is used to sort an array where elements are uniformly distributed, or where the elements of the array range between 0 and 1.
    • Bucket sort can exhibit the best case time complexity of O(n+k), where n is the number of buckets and k is the bucket size.
    • The way buckets are provided ranges differ in cases when the elements of the array are floats and integers. This is discussed in detail above.

    Author