728 views

Suppose there is a balanced binary search tree with $n$ nodes, where at each node, in addition to the key, we store the number of elements in the sub tree rooted at that node.

Now, given two elements $a$ and $b$, such that $a < b$, we want to find the number of elements $x$ in the tree that lie between $a$ and $b$, that is, $a \leq x \leq b$. This can be done with (choose the best solution).

1. $O(\log n)$  comparisons and $O(\log n)$ additions.
2. $O(\log n)$ comparisons but no further additions.
3. $O \left (\sqrt{n} \right )$ comparisons but $O(\log n)$ additions.
4. $O(\log n)$ comparisons but a constant number of additions.
5. $O(n)$ comparisons and $O(n)$ additions, using depth-first- search.
edited | 728 views
Well explained learnt something new

We first find $a$ in the BST in $O(\log n)$ time. Now there are two possibilities, b can be in the right subtree of $a$ or b can be in the right subtree of any of the parents of $a$. For the first case, we simply search for $b$, in the right subtree of $a$ and at each step we add the number of elements in the left subtree +1 (BST being balanced, this can be retrieved from the depth of the node without any finding method), if we are moving right and simply 1 if we are moving left. When we find $b$, this sum will give as the required number of elements. This requires $O(\log n)$ additions but we can do smarter by doing

$$N(a) - N(\text{LEFT}(a)) - N(\text{RIGHT}(b))$$

where, $N(x)$ denote the no. of elements in the subtree rooted at $x$ and if LEFT(a), RIGHT(b) returning 0 for NULL.

For the second case also we do the same method. But first we find the common ancestor of $a$ and $b$ (possible in $O( \log n)$- say $p$ and also count the no. of nodes in the right subtree of each node from $a$ to $p$ excluding $p$. Now, from $p$ to $a$ we proceed the counting as in the earlier case when $b$ was in the subtree at $a$. So, in the worst case we have to do $O(\log n)$ additions. Here, also we can reduce the no. of additions by doing

$$N(p) - N(\text{LEFT}(a)) - N(\text{RIGHT}(b))$$

where, $N(x)$ denote the no. of elements in the subtree rooted at $x$ and if LEFT(a), RIGHT(b) returning 0 for NULL.

edited by

^ @srestha..why to count all nodes... just count the node which are there in the path of P to a and P to b...and use formula accordingly.. which can be done in O(log n).. right ??

And here in your example ... @arjun sir's formula is working well but .. try to use this formula for the example I hv posted..

For your 1st example , when both the node are in the same path from P any leaf..b ( like 12 and 17 are in the same path from root(10) to leaf (20)..

then, do following steps -

Calculate N(Right(a) = 3  // node - 16 , 17 , 20 ....O(log n) time

calculate N(Right(b)) = 1 // node - 20 ...O( log n) time.

So, N ( a< N < b) = N(Right(a)) - N( Right(b) - 1 = 3 - 1 -1 = 1  // node 16 only.

For your second case example -

Do following steps -

1. First find common ancestor of node ...here 16  ... // O(log n) time

2. take count variable = 0.

3. do add all right subent nodes from P to a ..

here - count =count + N(Right ( 14 )) + 1 (counting node 14 as well) =0+ 1 + 1 =2.

count = count + N(Right (12 )) = 2 + 1 = 3.

4.  do add all left subent nodes from P to b ..

here - count = count + N(left (17)) = 3 + 0.

5. finaly, total N(a<N<b) = count + 1 ( counting node 16 ..).

Again , total addition -- O(log n)..

@arjun sir

@Arjun Sir,   number of comparisons is O(log n).
Number of addition is also O(log n)..
Even if we will follow the second method by you , that will also lead to O(log n) additions in worst case.

Here a is 30 & b is any number greater than 50... Lets only focus on 'a'
20 is the first number which did not satisfy. Fine. I ignored it's left side & noted it's number of children.
I moved to right side of 20.
34 satisfied. I moved to left of 34. Then I found 30.

It seems I have to do only a constant additions & one subtraction. But it is not true.

This zigzag pattern can continue till leaf, where a node will be less than a, so I will ignore it's left side but I have to go to it's right side.
I hope the diagram below clearly describes what I want to say.

(I AM ONLY SHOWING THE PATH IN A PARTICULAR BST. OTHER NODES ARE THERE TOO. IT IS A PATH IN BBST)

30 is a .

If 'a' is near root, then it is simple that left side of 'a' does not satisfy, right side will always satisfy. That is BEST CASE.
But if 'a' is near to leaf then this rule cant be applied  easily. We have to do O(log n) additions & one subtraction.

Thanks @Bala for the long discussion & giving right arguments.

@Vijay, @Ahwan Yes, my answer does not seem correct. But the zigzag pattern is not a proof -- we need to show that at least log n additions are required in the worst case for any possible method.
@Arjun Sir I guess only these two methods exist.
1) Add the nodes whose value lies between a & b
OR
2) Subtract those nodes from total nodes which does not fall in range a & b.

For both methods it is proved that  number of additions in worst case is O(log n)
$O(\log n)$ comparisons and $O(\log n)$ additions. The algorithm is :

1. Find $a$ and $b$ : This will take $O(\log n)$ comparisons as tree is balanced BST.

2. Follow path from $a$ to $b$, and along the path, keep adding the required number of nodes to result by looking at number stored at each node. Path length is $O(\log n)$, hence number of additions will also be $O(\log n)$.
If b is not in the subtree of a, we cant get the path using normal BST procedure rt?
We actually look at the number of nodes in the right child of each node in the path on our way up, and the number of nodes in the left child of each node in the path on our way down, but yeah, those are $O(\log n)$ additions and $O(\log n)$ comparisons.
@Arjun, yes we need parent pointer to go up the tree, but we can store parents during searching only.
The parent will be the lowest common ancestor for (a, b) - which will be the rightmost common node in the search paths for a and b rt?
@Arjun Sir,what if it is a binary search tree

Will it be o(n) addition and o(n) comparisons
O(n) comparisons. Addition can be constant only - see the answer now.
+1 vote
// res1 ,res2 and count are global variables initialized to 0

//res_link is a global pointer of type struct node* initialized to NULL

//t->data is the value present in a node

//t->elem is the number of elements in the left and right subtrees of t excluding t

//output is the number of elements between a and b including a and b

//Address of the root node is passed as an argument to fun() and we execute fun1() from fun()

//the function fun() is associated with finding 'a' and function fun1() is associated with finding b

void fun(struct node *t)
{
if(a==(t->data))
{
if(t->right)
{
if(t!=root)
{
res2=t->right->elem+2;
}
else
{
fun1(root->right);
printf("%d",res2+1);
}
}
else
res2=1;
}
else if(a<(t->data))
{
fun(t->left);
if(b>(t->data))
{
res1+=res2;
if(t->right)
res2=t->right->elem+2;
else
res2=1;
if(t==root)
{
res2=1;
if(t->right)
{
fun1(t->right);
printf("%d",res1+res2);
}
else
printf("%d",res1+res2);
}
}
else
{
if(count==0)
{
count=1;
if(res2>1)
{
res2=1;
printf("%d",res1+res2);
}
else
printf("%d",res1+1);
}
}
}
else
fun(t->right);
}

fun1(struct node *t)
{

if(b==(t->data))
{

if(t->left)
res2+=t->left->elem +2;
else
res2+=1;
}
else if(b>(t->data))
{
fun1(t->right);
if(t->left)
res2+=t->left->elem +2;
else
res2+=1;
}
else
fun1(t->left);

}

@arjun sir i have constructed the above code to prove that number of additions and number of comparisions will not exceed log n in the worst case. The time complexity of the above code is O(log n). I request you to take a look at this code and  and verify that option a) is correct