What is a segment tree?
A flexible tree-based data structure that is used to effectively solve problems involving range query operations.
Takeaways
A Segment Tree
is a data structure that allows answering range queries over an array effectively, while still being flexible enough to allow modifying the array.
What is Segment Tree?
A segment tree
enters the scene, where other techniques such as prefix computation
fail. Let’s understand this contrast between the two.
Suppose we need to compute the sum of the elements of an array within a range. The prefix sum array method yields an O(1) solution, while the segment tree would result in an O(log n) solution. Doesn’t sound very helpful?
What if the operation was finding the maximum/minimum element within a range
, bitwise OR over a range
, gcd over a range
, etc.?
Suppose we have an array Arr = [0,1,2,3,4,5]
. Its prefix sum array will be preSum = [0,1,3,6,10,15]
. Now to calculate the sum of elements within index bounds [1,4] we use ans = preSum[4]-preSum[0]
. Easy peasy.
Unfortunately, the prefix minimum element array will be preMin = [0,0,0,0,0,0]
. Is there any way of calculating the minimum element within the same range [1,4]
? NO! đ Not with prefix technique.
Fortunately, with segment trees’ versatility, such problems can be solved efficiently.
Prefix computation does enable making updates to the array, but it needs to recompute the whole array making O(n)
time complexity overhead per update. One would not only update the element at a given index in the array but will have to also make changes in prefix array.
Thus prefix computation is definitely not the smart choice to make when updation is required. The segment tree is!
Now imagine if we could update such ranges in logarithmic time as well, wouldnât it be cool?
Letâs see how segment tree achieves it!
A segment tree is implemented in the array-based tree format { C.P. GUYS COULD RELATE }. The parent-child hierarchy is such that each array block represents a node by storing its value. For ith node, its left and right children are stored in (i * 2)th
and (i * 2 + 1)th
block respectively.
The entire array of elements to be operated upon is present in the segment tree in chunks of many sizes referred to as segments; hence the name segment tree đ.
The divide and conquer strategy
here dictates that during traversal along some path from top to bottom in the tree, at each level the sample space is halved. This tremendously improves efficiency during searching, updation, etc. Letâs have a look at a pictorial representation of a segment tree:
As observed from the diagram, The root of the tree is a complete segment, i.e. carries information related to all the elements of the array. For the next level, this data in root is divided into two halves.
The first half is assigned to the left subtree (precisely the left child) and the other half to the right one. This procedure is followed by subsequent nodes, recursively.
This brings us to leaves in the tree, which hold individual elements of the array. Thus all the leaves collectively represent the entire data (array elements) upon which the segment tree is built, individually.
Implementation of Segment Tree
The âbuildTree()â
function builds a segment tree using the array âvâ
and stores it into the array âtreeâ
.
- At each level of recursion, the range is divided into two halves.
- Then recursively the left and right children nodes are built for the first half and the other half of the range respectively.
- The current node will then hold the sum of its childrenâs values. The
base condition
is the case of a leaf, which is when the size of the range is one.
Suppose an array arr = [1,2,3]
. It will be stored in the tree in 3 layers of nodes/blocks:
[[3+3=6]]
[[1+2=3], [3]]
[[1], [2], [3]]
C++
//arr->the original array, tree->the segment tree array
vector<int>arr, tree;
//s->starting index, e->end index, node->current index
int buildTree(int node, int s int e) {
if (s == e) {
tree[node] = arr[s];
return arr[s];
}
//The middle element index
int m = s+ (e-s)/2;
tree[node] = buildTree(node * 2, s, m) + buildTree((node * 2) + 1, m + 1, e);
return tree[node];
}
segmentTree(const vector<int>& V) {
arr = V;
int max_size=4 * V.size();
tree.resize(max_size);
buildTree(1, 0, V.size() - 1);
}
JAVA
//arr->the original array, tree->the segment tree array
Integer tree[];
Integer arr[];
//s->starting index, e->end index, node->current index
Integer buildTree(Integer s, Integer e, Integer node) {
if (s == e) {
tree[node] = arr[s];
return arr[s];
}
//The middle element index
Integer mid = s + (e-s)/2;
tree[node] = buildTree(arr, s, mid, node * 2) + buildTree(arr, mid + 1, se, node * 2 + 1);
return tree[node];
}
SegmentTree(Integer Arr[], Integer n) {
Integer max_size = 4*n;
tree = new Integer[max_size];
arr=new Integer[n];
for(Integer i=0;i<n;++i){
arr[i]=Arr[i];
}
buildTree(0, n - 1, 1);
}
PYTHON
#s->starting index
#e->end index
#node->current index
def buildTree(arr, s, e, tree, node):
if (s == e):
tree[node] = arr[s]
return arr[s]
#The middle element index
mid = (s+e)/2
tree[node] = buildTree(arr, s, mid, tree, node * 2) + buildTree(arr, mid + 1, e, tree, node * 2 + 1)
return tree[node]
def constructST(arr, n):
max_size = 4*n;
tree = [0] * max_size
buildTree(arr, 0, n - 1, tree, 1)
return tree
Operations of Segment Tree
This example is an implementation of a segment tree
specifically as a solution to the problem of finding the sum of element in an array segment in range-based queries, and therefore is not a formal generic segment tree implementation as it would vary with each problem, just as the various coffees do.
Query
The âquery()â
function takes in as parameters the boundaries of the query, âlâ and ârâ, and returns the sum of the elements in this range of âlâ and ârâ in the array. Thus a query call of range [ 1,2 ] would return the sum of elements present in the array within the index bounds of 1
and 2
.
- Taking an example shown in the above cool gif, a query of [0,2] is easily retrieved as [0,2] happens to be present in a node already. But for a query [0,4] some extra muscle is required.
- We don’t have [0,4] on a node, but what we have are components like [0,2], [3,5], etc. Its almost like figuring out something from pieces of a map đ.
- We’ve got to extract and refine information from these components and merge them to obtain the desired result, i.e. an answer to the query [0,4].
- We get the value of [0,2] from the pre-existing node of same label beneath [0,5]. How do we get [3,4]? From [3,5]. Instead of using the value present at node [3,5], one would look at its depth and to one’s wonder, node [3,5] happens to have [3,4] as its left child.
- This node is exactly what we need (remember [0,2] from previous case). Now we have the sum of arrays within indices range [0,2] and [3,4] as well which on addition would yield the desired output which is
query(0,4)
. Bravo! C++
/*
s->current array bound start index,
e->current array bound end index,
qs->current query start index,
qe->current query end index,
node->current tree node index
*/
int getSumUtil(int s, int e, int qs, int qe, int node) {
if (qs <= s && qe >= e)
return tree[node];
if (e < qs || s > qe)
return 0;
int mid = s+ (e-s)/2;
return getSumUtil(s, mid, qs, qe, 2*node) +
getSumUtil(mid+1, e, qs, qe, 2*node+1);
}
int getSum(int qs, int qe) {
if (qs < 0 || qe > n-1 || qs > qe) {
cout<<"Invalid Input";
return -1;
}
return getSumUtil(0, n-1, qs, qe, 1);
}
JAVA
/*
s->current array bound start index,
e->current array bound end index,
qs->current query start index,
qe->current query end index,
node->current tree node index
*/
Integer getSumUtil(Integer s, Integer e, Integer qs, Integer qe, Integer node) {
if (qs <= s && qe >= e)
return tree[node];
if (e < qs || s > qe)
return 0;
Integer mid = s + (e - s)/2;
return getSumUtil(s, mid, qs, qe, 2 * node) + getSumUtil(mid + 1, e, qs, qe, 2 * node + 1);
}
Integer getSum(Integer n, Integer qs, Integer qe){
if (qs < 0 || qe > n - 1 || qs > qe) {
System.out.println("Invalid Input");
return -1;
}
return getSumUtil(0, n - 1, qs, qe, 1);
}
PYTHON
#s->current array bound start index
#e->current array bound end index
#qs->current query start index
#qe->current query end index
#node->current tree node index
def getSumUtil(tree, s, e, qs, qe, index):
if (qs <= s and qe >= e):
return tree[node]
if (e < qs or s > qe):
return 0
mid = s+(e-s)/2
return getSumUtil(tree, s, mid, qs, qe, 2 * node) + getSumUtil(tree, mid + 1, e, qs, qe, 2 * node + 1)
def getSum(tree, n, qs, qe):
if (qs < 0 or qe > n - 1 or qs > qe):
print("Invalid Input", end="")
return -1
return getSumUtil(tree, 0, n - 1, qs, qe, 1)
Update
The update
operation takes in the new value to be set and the corresponding index. After making the update to the tree array, it needs to make similiar relevant changes to the actual segment tree. This is done by calling the âupdateValueUtil()â
function which recursively updates the segment tree.
To understand this better let us take a look at the previous example.
Suppose we need to set the element present at index 2
to the value of 10
.
- We first update the tree array at index
2
to10
, and then call theâupdateValueUtil()â
function to update the segment tree. - Now begins the real adventure. The recursive function takes a bunch of parameters- the difference this update will produce, the index of the current node in the segment tree, start index and end index of the current range in recursion.
- If current node index is out of range, we return. If the current node index is within the range, we update the value of the node and then recursively call the function on the left and right child of the current node. Thus the new value at node [2,2] will be
10
. - This affects a couple of other nodes in the tree, naturally the ones having ranges inclusive of
2
. These are [0,2] and [0,5]. Note that the old value at node/index2
was7
and the value of dif is10-7=3
. At [0,5] dif will be added to set its new value as18(=15+3)
. Finally comes the turn of root node [0,5]. Its value will be updated as47=(44+3)
.
C++
void updateValueUtil(vector<int>&st, int ss, int se, int i, int diff, int si){
if (i < ss || i > se)
return;
st[si] = st[si] + diff;
if (se != ss) {
int mid = ss + ( ( se - ss ) >> 1 );
updateValueUtil(st, ss, mid, i, diff, 2 * si + 1);
updateValueUtil(st, mid + 1, se, i, diff, 2 * si + 2);
}
}
void updateValue(vector<int>&arr, vector<int>&st, int n, int i, int new_val){
if (i < 0 || i > n - 1) {
cout << "Invalid Input";
return;
}
int diff = new_val - arr[i];
arr[i] = new_val;
updateValueUtil(st, 0, n - 1, i, diff, 0);
}
JAVA
void updateValueUtil(Integer ss, Integer se, Integer i, Integer diff, Integer si) {
if (i < ss || i > se)
return;
st[si] = st[si] + diff;
if (se != ss) {
Integer mid = ss + ((se - ss) >> 1);
updateValueUtil(ss, mid, i, diff, 2 * si + 1);
updateValueUtil(mid + 1, se, i, diff, 2 * si + 2);
}
}
void updateValue(Integer arr[], Integer n, Integer i, Integer new_val) {
if (i < 0 || i > n - 1) {
System.out.println("Invalid Input");
return;
}
Integer diff = new_val - arr[i];
arr[i] = new_val;
updateValueUtil(0, n - 1, i, diff, 0);
}
PYTHON
def updateValueUtil(st, ss, se, i, diff, si):
if (i < ss or i > se):
return
st[si] = st[si] + diff
if (se != ss):
mid = ss+((se-ss)/2)
updateValueUtil(st, ss, mid, i,diff, 2 * si + 1)
updateValueUtil(st, mid + 1, se, i,diff, 2 * si + 2)
def updateValue(arr, st, n, i, new_val):
if (i < 0 or i > n - 1):
print("Invalid Input", end="")
return
diff = new_val - arr[i]
arr[i] = new_val
updateValueUtil(st, 0, n - 1, i, diff, 0)
Complexities of Segment Tree
The time complexity of tree construction is O(n)
, n
being the number of elements put in the tree, as n
calls to âbuildTree()â will be made overall, as there are (2*n)-1
nodes in the tree and value of each node is computed only once during tree construction.
The time complexity of a query operation is $O( log_2{n})$ as a maximum of 4
nodes will be visited at any level of recursion and the height of the tree is $log_2{n}$ which corresponds to the number of levels.
The space complexity of a segment tree is O(n)
.
Uses Cases of Segment Tree
- Computational geometry: A common use case is in solving geometric problems that involve querying a range of points, such as finding the count of points that lie in a specific rectangular region or determining the closest point to a given point. This is achieved by storing the points in the segment tree and using its efficient search and update properties to answer these queries. Furthermore, segment trees can be utilized in higher dimensions for problems such as rectangle or box queries in 2D or 3D space. These applications make segment trees a powerful tool in many areas of computational geometry.
- Geographic information systems: Segment trees in Geographic Information Systems (GIS) are crucial for spatial range queries, such as identifying all geographic features within a specific area. Their efficient search and update capabilities also enable rapid nearest neighbor queries to identify the closest feature to a given point. These functions make segment trees a valuable tool for GIS data management and analysis.
- Machine learning: In machine learning, segment trees can be utilized in reinforcement learning algorithms, specifically in prioritized experience replay, which is used to enhance the learning process. Prioritized experience replay selectively samples important experiences to learn from based on a priority value. The segment tree data structure, with its efficient range query handling and point update capabilities, is used to store and retrieve these experiences. It allows the algorithm to efficiently sample experiences based on their priority values, improving the learning efficiency and effectiveness of the model.
Other Range-Query Data Structures
Sqrt-decomposition
– answers each query in O(ân), preprocessing done in O(n).Fenwick tree
– answers each query in O(logN), preprocessing done in O(NlogN).
Conclusion
- Used for efficiently solving problems involving range query operations.
- A divide and conquer algorithm.
- Time Complexity for tree construction is
O(n)
. - Time complexity for querying and updation are
O(log n)
. - Applicable where pre-computation fails or is costly.
- Alternatives to segment tree are Binary Indexed Tree and SQRT tree.