Aditya Saxena

Segment Tree

What is a segment tree?
A flexible tree-based data structure that is used to effectively solve problems involving range query operations.

Takeaways

A Segment Tree is a data structure that allows answering range queries over an array effectively, while still being flexible enough to allow modifying the array.

What is Segment Tree?

A segment tree enters the scene, where other techniques such as prefix computation fail. Let’s understand this contrast between the two.

Suppose we need to compute the sum of the elements of an array within a range. The prefix sum array method yields an O(1) solution, while the segment tree would result in an O(log n) solution. Doesn’t sound very helpful?

What if the operation was finding the maximum/minimum element within a range, bitwise OR over a range, gcd over a range, etc.?

Suppose we have an array Arr = [0,1,2,3,4,5]. Its prefix sum array will be preSum = [0,1,3,6,10,15]. Now to calculate the sum of elements within index bounds [1,4] we use ans = preSum[4]-preSum[0]. Easy peasy.

Unfortunately, the prefix minimum element array will be preMin = [0,0,0,0,0,0]. Is there any way of calculating the minimum element within the same range [1,4]? NO! 🙁 Not with prefix technique.
Fortunately, with segment trees’ versatility, such problems can be solved efficiently.

Prefix computation does enable making updates to the array, but it needs to recompute the whole array making O(n) time complexity overhead per update. One would not only update the element at a given index in the array but will have to also make changes in prefix array.

Thus prefix computation is definitely not the smart choice to make when updation is required. The segment tree is!

Now imagine if we could update such ranges in logarithmic time as well, wouldn’t it be cool?

Let’s see how segment tree achieves it!

A segment tree is implemented in the array-based tree format { C.P. GUYS COULD RELATE }. The parent-child hierarchy is such that each array block represents a node by storing its value. For ith node, its left and right children are stored in (i * 2)th and (i * 2 + 1)th block respectively.

The entire array of elements to be operated upon is present in the segment tree in chunks of many sizes referred to as segments; hence the name segment tree 😉.

The divide and conquer strategy here dictates that during traversal along some path from top to bottom in the tree, at each level the sample space is halved. This tremendously improves efficiency during searching, updation, etc. Let’s have a look at a pictorial representation of a segment tree:

Pictorial Representation of a Segment Tree

As observed from the diagram, The root of the tree is a complete segment, i.e. carries information related to all the elements of the array. For the next level, this data in root is divided into two halves.

The first half is assigned to the left subtree (precisely the left child) and the other half to the right one. This procedure is followed by subsequent nodes, recursively.

This brings us to leaves in the tree, which hold individual elements of the array. Thus all the leaves collectively represent the entire data (array elements) upon which the segment tree is built, individually.

Implementation of Segment Tree

The ‘buildTree()’ function builds a segment tree using the array ‘v’ and stores it into the array ‘tree’.

  • At each level of recursion, the range is divided into two halves.
  • Then recursively the left and right children nodes are built for the first half and the other half of the range respectively.
  • The current node will then hold the sum of its children’s values. The base condition is the case of a leaf, which is when the size of the range is one.

Suppose an array arr = [1,2,3]. It will be stored in the tree in 3 layers of nodes/blocks:

                [[3+3=6]]
             [[1+2=3],  [3]]
         [[1],    [2],    [3]]

C++

    //arr->the original array, tree->the segment tree array
    vector<int>arr, tree;

    //s->starting index, e->end index, node->current index
    int buildTree(int node, int s int e) {
        if (s == e) {
            tree[node] = arr[s];
            return arr[s];
        }

        //The middle element index
        int m = s+ (e-s)/2;
        tree[node] = buildTree(node * 2, s, m) + buildTree((node * 2) + 1, m + 1, e);
        return tree[node];
    }

    segmentTree(const vector<int>& V) {
        arr = V;
        int max_size=4 * V.size();
        tree.resize(max_size);
        buildTree(1, 0, V.size() - 1);
    }

JAVA

    //arr->the original array, tree->the segment tree array
    Integer tree[];
    Integer arr[];

    //s->starting index, e->end index, node->current index
    Integer buildTree(Integer s, Integer e, Integer node) {
        if (s == e) {
            tree[node] = arr[s];
            return arr[s];
        }

        //The middle element index
        Integer mid = s + (e-s)/2;
        tree[node] = buildTree(arr, s, mid, node * 2) + buildTree(arr, mid + 1, se, node * 2 + 1);
        return tree[node];
    }

    SegmentTree(Integer Arr[], Integer n) {
        Integer max_size = 4*n;
        tree = new Integer[max_size];
        arr=new Integer[n];
        for(Integer i=0;i<n;++i){
            arr[i]=Arr[i];
        }
        buildTree(0, n - 1, 1);
    }

PYTHON

#s->starting index
#e->end index
#node->current index
def buildTree(arr, s, e, tree, node):
    if (s == e):
        tree[node] = arr[s]
        return arr[s]

    #The middle element index
    mid = (s+e)/2
    tree[node] = buildTree(arr, s, mid, tree, node * 2) + buildTree(arr, mid + 1, e, tree, node * 2 + 1)
    return tree[node]

def constructST(arr, n):
    max_size = 4*n;
    tree = [0] * max_size
    buildTree(arr, 0, n - 1, tree, 1)
    return tree

Operations of Segment Tree

This example is an implementation of a segment tree specifically as a solution to the problem of finding the sum of element in an array segment in range-based queries, and therefore is not a formal generic segment tree implementation as it would vary with each problem, just as the various coffees do.

Query

The ‘query()’ function takes in as parameters the boundaries of the query, ‘l’ and ‘r’, and returns the sum of the elements in this range of ‘l’ and ‘r’ in the array. Thus a query call of range [ 1,2 ] would return the sum of elements present in the array within the index bounds of 1 and 2.

the-sum-of-values-segment-tree
segment-tree-the-sum-of-values
  • Taking an example shown in the above cool gif, a query of [0,2] is easily retrieved as [0,2] happens to be present in a node already. But for a query [0,4] some extra muscle is required.
  • We don’t have [0,4] on a node, but what we have are components like [0,2], [3,5], etc. Its almost like figuring out something from pieces of a map 😁.
  • We’ve got to extract and refine information from these components and merge them to obtain the desired result, i.e. an answer to the query [0,4].
  • We get the value of [0,2] from the pre-existing node of same label beneath [0,5]. How do we get [3,4]? From [3,5]. Instead of using the value present at node [3,5], one would look at its depth and to one’s wonder, node [3,5] happens to have [3,4] as its left child.
  • This node is exactly what we need (remember [0,2] from previous case). Now we have the sum of arrays within indices range [0,2] and [3,4] as well which on addition would yield the desired output which is query(0,4). Bravo! C++
    /*
     s->current array bound start index,
     e->current array bound end index,
     qs->current query start index,
     qe->current query end index,
     node->current tree node index
    */
    int getSumUtil(int s, int e, int qs, int qe, int node) {
      if (qs <= s && qe >= e)
        return tree[node];

      if (e < qs || s > qe)
        return 0;

      int mid = s+ (e-s)/2;
      return getSumUtil(s, mid, qs, qe, 2*node) +
        getSumUtil(mid+1, e, qs, qe, 2*node+1);
    }

    int getSum(int qs, int qe) {
      if (qs < 0 || qe > n-1 || qs > qe) {
          cout<<"Invalid Input";
          return -1;
      }
      return getSumUtil(0, n-1, qs, qe, 1);
    }

JAVA

      /*
       s->current array bound start index,
       e->current array bound end index,
       qs->current query start index,
       qe->current query end index,
       node->current tree node index
       */
      Integer getSumUtil(Integer s, Integer e, Integer qs, Integer qe, Integer node) {
          if (qs <= s && qe >= e)
              return tree[node];
          if (e < qs || s > qe)
              return 0;
          Integer mid = s + (e - s)/2;
          return getSumUtil(s, mid, qs, qe, 2 * node) + getSumUtil(mid + 1, e, qs, qe, 2 * node + 1);
      }

          Integer getSum(Integer n, Integer qs, Integer qe){
          if (qs < 0 || qe > n - 1 || qs > qe) {
              System.out.println("Invalid Input");
              return -1;
          }
          return getSumUtil(0, n - 1, qs, qe, 1);
      }

PYTHON

  #s->current array bound start index
  #e->current array bound end index
  #qs->current query start index
  #qe->current query end index
  #node->current tree node index
  def getSumUtil(tree, s, e, qs, qe, index):
      if (qs <= s and qe >= e):
          return tree[node]
      if (e < qs or s > qe):
          return 0
      mid = s+(e-s)/2
      return getSumUtil(tree, s, mid, qs, qe, 2 * node) + getSumUtil(tree, mid + 1, e, qs, qe, 2 * node + 1)

  def getSum(tree, n, qs, qe):
      if (qs < 0 or qe > n - 1 or qs > qe):
          print("Invalid Input", end="")
          return -1
      return getSumUtil(tree, 0, n - 1, qs, qe, 1)

Update

The update operation takes in the new value to be set and the corresponding index. After making the update to the tree array, it needs to make similiar relevant changes to the actual segment tree. This is done by calling the ‘updateValueUtil()’ function which recursively updates the segment tree.
To understand this better let us take a look at the previous example.
Suppose we need to set the element present at index 2 to the value of 10.

  • We first update the tree array at index 2 to 10, and then call the ‘updateValueUtil()’ function to update the segment tree.
  • Now begins the real adventure. The recursive function takes a bunch of parameters- the difference this update will produce, the index of the current node in the segment tree, start index and end index of the current range in recursion.
  • If current node index is out of range, we return. If the current node index is within the range, we update the value of the node and then recursively call the function on the left and right child of the current node. Thus the new value at node [2,2] will be 10.
  • This affects a couple of other nodes in the tree, naturally the ones having ranges inclusive of 2. These are [0,2] and [0,5]. Note that the old value at node/index 2 was 7 and the value of dif is 10-7=3. At [0,5] dif will be added to set its new value as 18(=15+3). Finally comes the turn of root node [0,5]. Its value will be updated as 47=(44+3).

C++

      void updateValueUtil(vector<int>&st, int ss, int se, int i, int diff, int si){
          if (i < ss || i > se)
              return;
          st[si] = st[si] + diff;
          if (se != ss) {
              int mid = ss + ( ( se - ss ) >> 1 );
              updateValueUtil(st, ss, mid, i, diff, 2 * si + 1);
              updateValueUtil(st, mid + 1, se, i, diff, 2 * si + 2);
          }
      }

      void updateValue(vector<int>&arr, vector<int>&st, int n, int i, int new_val){
          if (i < 0 || i > n - 1) {
              cout << "Invalid Input";
              return;
          }
          int diff = new_val - arr[i];
          arr[i] = new_val;
          updateValueUtil(st, 0, n - 1, i, diff, 0);
      }

JAVA

      void updateValueUtil(Integer ss, Integer se, Integer i, Integer diff, Integer si) {
          if (i < ss || i > se)
              return;
          st[si] = st[si] + diff;
          if (se != ss) {
              Integer mid = ss + ((se - ss) >> 1);
              updateValueUtil(ss, mid, i, diff, 2 * si + 1);
              updateValueUtil(mid + 1, se, i, diff, 2 * si + 2);
          }
      }
      void updateValue(Integer arr[], Integer n, Integer i, Integer new_val) {
          if (i < 0 || i > n - 1) {
              System.out.println("Invalid Input");
              return;
          }
          Integer diff = new_val - arr[i];
          arr[i] = new_val;
          updateValueUtil(0, n - 1, i, diff, 0);
      }

PYTHON

      def updateValueUtil(st, ss, se, i, diff, si):
          if (i < ss or i > se):
              return
          st[si] = st[si] + diff
          if (se != ss):
              mid = ss+((se-ss)/2)
              updateValueUtil(st, ss, mid, i,diff, 2 * si + 1)
              updateValueUtil(st, mid + 1, se, i,diff, 2 * si + 2)

      def updateValue(arr, st, n, i, new_val):
          if (i < 0 or i > n - 1):
              print("Invalid Input", end="")
              return
          diff = new_val - arr[i]
          arr[i] = new_val
          updateValueUtil(st, 0, n - 1, i, diff, 0)

Complexities of Segment Tree

The time complexity of tree construction is O(n), n being the number of elements put in the tree, as n calls to ‘buildTree()’ will be made overall, as there are (2*n)-1 nodes in the tree and value of each node is computed only once during tree construction.
The time complexity of a query operation is $O( log_2{n})$ as a maximum of 4 nodes will be visited at any level of recursion and the height of the tree is $log_2{n}$ which corresponds to the number of levels.
The space complexity of a segment tree is O(n).

Uses Cases of Segment Tree

  • Computational geometry: A common use case is in solving geometric problems that involve querying a range of points, such as finding the count of points that lie in a specific rectangular region or determining the closest point to a given point. This is achieved by storing the points in the segment tree and using its efficient search and update properties to answer these queries. Furthermore, segment trees can be utilized in higher dimensions for problems such as rectangle or box queries in 2D or 3D space. These applications make segment trees a powerful tool in many areas of computational geometry.
  • Geographic information systems: Segment trees in Geographic Information Systems (GIS) are crucial for spatial range queries, such as identifying all geographic features within a specific area. Their efficient search and update capabilities also enable rapid nearest neighbor queries to identify the closest feature to a given point. These functions make segment trees a valuable tool for GIS data management and analysis.
  • Machine learning: In machine learning, segment trees can be utilized in reinforcement learning algorithms, specifically in prioritized experience replay, which is used to enhance the learning process. Prioritized experience replay selectively samples important experiences to learn from based on a priority value. The segment tree data structure, with its efficient range query handling and point update capabilities, is used to store and retrieve these experiences. It allows the algorithm to efficiently sample experiences based on their priority values, improving the learning efficiency and effectiveness of the model.

Other Range-Query Data Structures

  • Sqrt-decomposition – answers each query in O(√n), preprocessing done in O(n).
  • Fenwick tree – answers each query in O(logN), preprocessing done in O(NlogN).

Conclusion

  • Used for efficiently solving problems involving range query operations.
  • A divide and conquer algorithm.
  • Time Complexity for tree construction is O(n).
  • Time complexity for querying and updation are O(log n).
  • Applicable where pre-computation fails or is costly.
  • Alternatives to segment tree are Binary Indexed Tree and SQRT tree.

See More

Author