Aditya Trivedi

Searching in Data Structure

Searching in data structures involves finding the location of a specific element or value within a collection. It is a fundamental operation that can greatly influence data handling efficiency in applications like databases and search engines. There are various types of searching in data structure, the most common being Linear Search and Binary Search.

Types of Searches

This method searches through a collection by examining each element one by one. In a linear search, if we’re looking for an element “i” in an array “Arr” with n elements, we start with the assumption that “i” is not in “Arr” (location = -1). We compare “i” with each element of “Arr” sequentially. If “i” matches an element at position N in “Arr”, we update the location to N, indicating “i” is found. This process is straightforward but slow for large datasets, with a time complexity of O(n).

Interval Search algorithms, such as the Binary Search, are designed for searching elements in sorted collections. These methods work by repeatedly dividing the search range in half and comparing the target value to the middle element of the interval. The search is successful if the target is equal to the middle element. If it’s less, the search continues in the lower half of the interval; if it’s more, in the upper half. This process continues until the target is found or the interval becomes empty. The critical advantage of Interval Search is its efficiency, with a time complexity of O(log n), making it significantly faster than Sequential Search for large datasets.

Different types of searching Algorithms

Linear Search operates by starting at the first element of an array and sequentially moving through each element, comparing the search element with every array element until a match is found or the end of the array is reached.

Pseudocode:

function linearSearch(array, target)
    for each item in array
        if item equals target
            return the index of the item
    return not found

In this algorithm, the target is the element being searched for within the array. The process involves iterating through each element (item) in the array, comparing it with the target. If a match is found, the index of the matching element is returned. If the end of the array is reached without finding the target, a “not found” indication is returned.

Time Complexity:

  • Best-case: O(1), occurs when the target is the first element.
  • Average-case: O(n/2), assumes the target is equally likely to be at any position.
  • Worst-case: O(n), happens when the target is the last element or not present.

Space Complexity: O(1), as it requires no additional storage beyond the input array.

Binary Search starts by comparing the target value to the middle element of the sorted array. If the target is equal to the middle element, the search is complete. If the target is smaller, the search continues in the lower half of the array; if larger, in the upper half. This halving continues until the target is found or the search space is exhausted.

Pseudocode:

function binarySearch(array, target)
    left = 0
    right = length(array) - 1
    while left <= right
        mid = left + (right - left) / 2
        if array[mid] == target
            return mid
        else if array[mid] < target
            left = mid + 1
        else
            right = mid - 1
    return not found

In this approach, left and right represent the bounds of the current search interval, and mid is the index of the midpoint. The algorithm narrows down the search interval based on the comparison between the target and the value at mid. If the target matches the value at mid, the index mid is returned. Otherwise, the search interval is adjusted accordingly, and the process repeats until the target is found or the interval is empty.

Time Complexity:

  • Best-case: O(1), occurs when the target is the middle element of the array.
  • Average-case: O(log n), as the search space is halved with each step.
  • Worst-case: O(log n), happens when the target is at the ends or not present.

Space Complexity: O(1) in iterative implementations, as it operates directly on the input array without additional storage.

In Ternary Search , the array is divided into three segments using two mid points, mid_1 and mid_2, effectively narrowing down the search area to one-third of its current size in each step. This method is best applied to unimodal functions or sorted arrays where the elements are in a specific order.

Pseudocode:

function ternarySearch(array, left, right, target)
    if (right >= left)
        mid1 = left + (right - left) / 3
        mid2 = right - (right - left) / 3

        if (array[mid1] == target)
            return mid1
        if (array[mid2] == target)
            return mid2

        if (target < array[mid1])
            return ternarySearch(array, left, mid1 - 1, target)
        elseif (target > array[mid2])
            return ternarySearch(array, mid2 + 1, right, target)
        else
            return ternarySearch(array, mid1 + 1, mid2 - 1, target)

    return not found

In Ternary Search, the search begins by determining two midpoints that divide the array into three equal parts. Based on the comparison with the midpoints, the algorithm decides which segment to follow next, discarding the other two-thirds of the array in each step. This process repeats until the target element is found or the search space is exhausted.

Time Complexity:

  • Best-case: O(1), when the element is found at the first middle.
  • Average-case: O(log3 n), as the algorithm divides the search space into three parts.
  • Worst-case: O(log3 n), slightly better than Binary Search in terms of comparisons but involves more comparisons per iteration.

Space Complexity: O(1) for the iterative version, maintaining constant space utilization.

4. Jump Search:

Instead of examining each element, Jump Search progresses in fixed-size steps through the array and performs a linear search within a localized range once the potential section containing the target is identified.

Pseudocode:

function jumpSearch(array, target, step)
    prev = 0
    while array[min(step, length(array)) - 1] < target
        prev = step
        step += sqrt(length(array))
        if prev >= length(array)
            return not found
    while array[prev] < target
        prev += 1
        if prev == min(step, length(array))
            return not found
    if array[prev] == target
        return prev
    return not found

Jump Search involves jumping ahead by a fixed step step and then performing a linear search backward from the current position until the target is found or ruled out. The choice of step, typically √n where n is the array size, is crucial for optimizing the search efficiency.

Time Complexity:

  • Best-case: O(1), when the target is at the first position.
  • Average-case: O(√n), assuming a well-chosen step size.
  • Worst-case: O(√n), occurs when the target is just before the next jump or not present.

Space Complexity: O(1), as it requires a constant amount of space.

Interpolation Search calculates an estimate of the target’s position based on the value and position of the current probe point, potentially skipping over irrelevant sections of the array. This method adapts the search area dynamically, making it more efficient than Binary Search for certain distributions.

Pseudocode:

function interpolationSearch(array, target)
    low = 0
    high = length(array) - 1
    while low <= high and target >= array[low] and target <= array[high]
        pos = low + ((target - array[low]) * (high - low) / (array[high] - array[low]))
        if array[pos] == target
            return pos
        if array[pos] < target
            low = pos + 1
        else
            high = pos - 1
    return not found

Interpolation Search begins by estimating the position pos of the target based on the linear distribution of values between the current low and high indices. The search then narrows down around pos, adjusting low and high based on the comparison with the target. This process continues until the target is found or the search bounds are invalid.

Time Complexity:

  • Best-case: O(1), when the target is exactly at the estimated position.
  • Average-case: O(log(log n)), for uniformly distributed datasets.
  • Worst-case: O(n), in cases of highly skewed distributions.

Space Complexity: O(1), as it requires a constant amount of extra space.

Fibonacci Search divides the array into sections determined by Fibonacci numbers, eliminating the need for division or multiplication operations, which makes it particularly computation-friendly. The algorithm progresses by using Fibonacci numbers to determine the split points, reducing the search space in a manner similar to the divide-and-conquer strategy used in Binary Search.

Pseudocode:

function fibonacciSearch(array, target)
    fibM2 = 0  // (m-2)'th Fibonacci No.
    fibM1 = 1  // (m-1)'th Fibonacci No.
    fibM = fibM2 + fibM1  // m'th Fibonacci

    while (fibM < length(array))
        fibM2 = fibM1
        fibM1 = fibM
        fibM = fibM2 + fibM1

    offset = -1

    while (fibM > 1)
        i = min(offset + fibM2, length(array)-1)

        if (array[i] < target)
            fibM = fibM1
            fibM1 = fibM2
            fibM2 = fibM - fibM1
            offset = i
        elseif (array[i] > target)
            fibM = fibM2
            fibM1 = fibM1 - fibM2
            fibM2 = fibM - fibM1
        else 
            return i

    if(fibM1 and array[offset+1] == target)
        return offset+1

    return not found

In this approach, Fibonacci numbers are generated up to the size of the array. The search begins from the back of the array, using Fibonacci numbers to determine the next element to compare with the target. If the target is not found, the algorithm reduces the search space based on Fibonacci rules and continues the search in the new interval.

Time Complexity:

  • Best-case: O(1), when the target is found at the initial position.
  • Average-case: O(log n), due to the efficient partitioning of the search space.
  • Worst-case: O(log n), which is competitive with Binary Search, especially considering the operation costs.

Space Complexity: O(1), as it operates in constant space, making it highly space-efficient.

Exponential Search starts with an interval of 1 and exponentially increases the interval size by doubling it until the end of the interval is greater than the target value. Once the range is identified, Binary Search is applied within this range to find the target.

Pseudocode:

function exponentialSearch(array, target)
    if array[0] == target
        return 0
    i = 1
    while i < length(array) and array[i] <= target
        i = i * 2
    return binarySearch(array, i/2, min(i, length(array)), target)

In this method, i is initially set to 1 and is doubled (i = i * 2) at each step until the value at array[i] is greater than the target or i exceeds the array bounds. A Binary Search is then conducted between the indices i/2 and i (or the end of the array, whichever is smaller) to locate the target.

Time Complexity:

  • Best-case: O(1), when the target is at the beginning of the array.
  • Average-case: O(log i), where i is the position of the target value.
  • Worst-case: O(log i), ensuring efficient searching in data structure even in the worst scenarios.

Space Complexity: O(1), as it uses a constant amount of space, similar to Binary Search.

Explore Scaler Topics Data Structures Tutorial and enhance your DSA skills with Reading Tracks and Challenges.

Conclusion

  • searching algorithms in data structure involves locating a specific element or value within a collection, crucial for efficient data handling in various applications.
  • Common searching techniques include Linear Search and Binary Search, each with distinct advantages and complexities.
  • Linear Search, though straightforward, has a time complexity of O(n), making it less efficient for large datasets.
  • Binary Search, with a time complexity of O(log n), excels in efficiency for sorted collections, making it preferable for large datasets.
  • Other advanced searching techniques in data structure include Ternary Search, Jump Search, Interpolation Search, Fibonacci Search, and Exponential Search, each tailored for specific scenarios.
  • Recommendations include using Binary Search for large datasets and Linear Search for mid-sized arrays.

Author