1.4k views

In a balanced binary search tree with $n$ elements, what is the worst case time complexity of reporting all elements in range $[a,b]$? Assume that the number of reported elements is $k$.

1. $\Theta (\log n)$
2. $\Theta (\log n +k)$
3. $\Theta (k \log n)$
4. $\Theta ( n \log k)$
in DS
recategorized | 1.4k views

First, you'll have to check if the elements $a$ and $b$ are present in the BST or not. Since the BST is balanced, it will take $\theta(log_2(n))$ time for that. Traversing the $k$ elements would take additional $\theta(k)$ time.

Hence $\theta(log_2(n) + k)$
by Active (1.5k points)
0

"reporting " can meaning counting or printing or informing

So, why donot we use that logic in comment section there?

For your 1st example , when both the node are in the same path from P any leaf..b ( like 12 and 17 are in the same path from root(10) to leaf (20)..

then, do following steps -

Calculate N(Right(a) = 3  // node - 16 , 17 , 20 ....O(log n) time

calculate N(Right(b)) = 1 // node - 20 ...O( log n) time.

So, N ( a< N < b) = N(Right(a)) - N( Right(b) - 1 = 3 - 1 -1 = 1  // node 16 only.

For your second case example -

Do following steps -

1. First find common ancestor of node ...here 16  ... // O(log n) time

2. take count variable = 0.

3. do add all right subent nodes from P to a ..

here - count =count + N(Right ( 14 )) + 1 (counting node 14 as well) =0+ 1 + 1 =2.

count = count + N(Right (12 )) = 2 + 1 = 3.

4.  do add all left subent nodes from P to b ..

here - count = count + N(left (17)) = 3 + 0.

5. finaly, total N(a<N<b) = count + 1 ( counting node 16 ..).

Again , total addition -- O(log n)..

0

"then why in TIFR question is logn is ans, when they asked "we want to find the number of elements x in the tree that lie between a and b,"

Because they have done augmentation.

+1
Update: Below is wrong, I did find an algorithm which works in $O(k + lg n)$ . Accepted explanation is handwavy but atleast the answer is correct. Correct explanation is doing recursive calls to left subtree if k1 is less than left child, print root if in range, recursive call to right if k2 is greater than right child. Runs in O(k + lg n)

None of the options seems to be the answer. The best algorithm seems to be :

1. Do $in\ order$ traversal and whenever you encounter a value in range $[a,b]$ just $print$ it.

Total complexity = $O(n)$

$OR$

1. $Save$ keys in an $array$ via $in\ order$ traversal: $O(n)$

2. $Binary\ search\ a$ (say found at $i$), $Binary\ Search\ b$ (say found at $j$), : $O(lgn)$

3. Traverse the $array$ from index $i$ to $j$: $O(k)$

So. $O(n + k + lgn)$ which is nothing but $O(n)$

So, the best algorithm is $O(n)$

Lets check the options:

$A. O(lg\ n)\ is\ too\ small$

$B. O(lg\ n + k)\ is\ too\ small\ as\ well$ (since our algo does not depend on k)

$C. O(k \lg\ n)\ is\ too\ big$ (since we can do in $O(n)$ even when $k = n$)

$D. O(n \lg\ n)\ is\ too\ big$ (since we can do in $O(n)$ )

So, $none$ of the options are correct.
0
But why not depend on k?

According to previous ans, it either be k or log n

right?
0

@ankitgupta.1729

augmentation means?

0
I couldn't find any better algorithm which depended on k.

There is a worse time complexity algorithm though which runs in $O(klgn)$ (mentioned in ErickDemains video)
0
So, ur ans k log n?
0
No, why go with worse algorithm when better is already known having $O(n)$
0

worst case time complexity of reporting all elements in range [a,b]

then why r u not going for worst case?

0
Because that's not worst of the worst algorithm, if you want you can find even worse algorithm than $O(k lg n)$ . For example you can do a O(n^2) sort and then linear search. That is worse than $O(k lg n)$.

I hope that is clear.

Anyway just now I found an algorithm which really works in $O(k + lg n)$ in balance BST. My answer is wrong then, above accepted answer is correct. I will delete my answer tomorrow.
0
when you add some extra information to the existing data structure then that existing data structure becomes augmented.
This extra information for a node can be anything like no. of nodes in the left sub-tree or right sub-tree or height of that node or total no. of black-colored nodes in the left and right sub-trees etc.
0

@mkagenius

Do we need to challenge this??

0

@srestha No, the answer is correct O(k + lg n ) (only the explanation is handwavy)

0

@mkagenius

What algorithm u got?

@ankitgupta.1729

Then which line do u think augmented here?

0

Here a video explanation at 7:45

Algorithm: Do recursive calls to left subtree if k1 is less than left child, print root if in range, recursive call to right if k2 is greater than right child. Runs in O(k + lg n) .

0

Can we not do like this?

Suppose in range there are (n-2) elements and outside of range there are only 2 elements

Step 1) We could found common ancestor of range query in $log n$ time.

Step 2) just eliminate those two elements, which are not in range. Can be done in O(1) time.

Step 3) And we can tell other than those two elements all are inside the range.

So, isnot it in O(log n) only??

0
Yes sure. Then if the range is only 1 element and if it is at root, then we have O(1) algorithm, nice. Solve for general case, please.
0

@mkagenius

See

Step 1) We could found common ancestor of range query in logn time.

Step 2) Then from root , go to those nodes which are satisfying query. So, here root to leaf walk will take log n time and it could be log n (or) 2 log n (or) 3 log n.

Step 3) If those taking more time, subtract those node, which are not in range, which will also take O(1) time.

right??

+1

@srestha can you tell me how are you gonna report all K elements in O(1) time ?

As for the tifr question you have mentioned , Do read the question clearly,It is given that at each node, in addition to the key, we store the number of elements in the sub tree rooted at that node  ,which plays a key role here.When you want to know number of elements you just find the lca of the smallest and largest element in the range and travel through that path and add the number of elements in subtree of each node,which is stored at each node as per the question.SO O(logn) comparisons and O(logn) additions.You clearly didn't understand that question and just saw O(logn) and applied it here too.

0
To write the answer on the computer screen, you have to invoke print() statement atleast k times if your answer has k elements (as given in the question)

So no one, not even god can invent any algorithm which is better than O(k)
+1 vote
The idea is to traverse the given binary search tree starting from root. For every node being visited, check if this node lies in range, if yes, then add 1 to result and recur for both of its children. If the current node is smaller than the low value of range, then recur for right child, else recur for left child.
Time complexity of the above program is O(h + k) where h is the height of BST and k is the number of nodes in a given range.
Here h is logn, hence O(logn+k)
by Active (2.9k points)
The question asks about worst case time complexity. The worst case can be achieved by implementing the BST using singly linked list. and here it will take log(N) time complexity as the tree is balanced and for k numbers the time complexity becomes O(kLogN). This is what was in my mind while attempting the question in the exam hall.

Option (C).
by (189 points)
0
The question is not asking for worst algorithm you can devise. The question is asking for best alrogithm that works for the worst test case in the best possible time.
0

@mkagenius

where are u getting this " best alrogithm that works for the worst test case in the best possible time."

Question only told "worst case time complexity"

Can u tell me, what that means?

+1
It means what I wrote there which you have quoted. I would suggest you to go through some course lectures, as you seem to be missing the point of why we even use algorithms, Erik Demain's course is good one, just search on youtube. Go through complexity lecture too (its part of his course) that also seems completely haywire.

Also, a course on English grammar if you want to improve on that area, personally I do not care but people might not understand what you are trying to say. "Question is only told" is not right.
0
Corrected..

yes, I already gone through Erik Demaine MIT course.

Can u  tell me, which lecture specifically??
0
Well, go through it once more if you can, and try to answer to yourself why do people use algorithms, do not worry about exact algorithm, just that one question why do we even try to find algorithm. What happens when someone finds a better algorithm. Why do we even invent more complex data structures? Life is already hard so why to invent more complex algorithm. Is it only to test students in exams like GATE or does it have any real life implications. How do we even decide which one is the better algorithm among given two algorithms. Then go on to ask what does complexity mean, does O(lg n) mean it always takes lg n time, or sometimes takes lg n time?

What is best case or worst case algorithm?

Whenever we study algorithm we try to find out what all cases will slow our algorithm the most, thats called worst test case. Similarly what all situation will cause our algorithm to finish fast, those are best test cases.

Also get some real feel by writing code for some of the algorithm - write bubble sort, run it on a million randomly generated numbers, and just wait for it to finish. Then write quicksort and run it on a million random numbers and just wait for it to finish. Then write a $O(n!)$ to sort random 10 and 20 numbers. See the difference, feel the need for a better algorithm. And then you will really understand why would one even ask for "Worst algorithm". Since the worst algorithm has no limits, I can always write a worst algorithm by just putting while(true); before running any algorithm. No one asks for worse algorithm, that is against the whole point of having algorithm.
0

@mkagenius

yes true.

In question, it is told "the number of reported elements is k."

what "reported" means?

It means inform those elements , which are between a and b.

right??

0

Again, "It means inform those elements " is completely wrong (grammatically). Nobody informs the elements, they are not living beings.

report = print() on the screen

0
I think report is telling about existence, means if there exists some node or not.

It doesnot wants it's value. When u r telling, that print something, that means we work on it's values. But report doesnot require it.

right?
0

Again, Can u check this code for this tree

If we want to print the nodes between 30 and 70

Then check this code

while(root!=30){
if(root<30){
root=root->right;
}
while(root>=30){
root=root->left;
if(root>30 AND root<=70){
root=root->left;
Count++;
}
else{
break;
}
}

0

why aren't you doing print() instead of count++ ?

Also, I don't think this code will ever enter the section where Count++ happens, for the given tree.

A much easier way to check would be to create a tree, and check it yourself on your computer.

Here I have created one for you for the tree you described : https://ideone.com/75Q3We (your code gives runtime error, maybe because null check is not there, you will have to debug, play with it for a few days, if possible write a recursive code)

0

@mkagenius

ok trying

do I need to challenge this question?

Should it give marks to all?

0
No, why challenge?
0
because they havenot considered O(log n) or O(k) as ans
0
Because either of them are wrong individually, they are correct only if they are together as O(lg n + k).

for example: lg n is wrong when there are n/2 elements to report

and o(k) is wrong when there is only 1 element to be reported and thats present at some leaf node.
0

@mkagenius

Then , is TIFR question ans wrong? It is telling only logn comparison and logn addition?

logn addition means T.C. also log n ,right?

0
No more answers, please complete the above exercise first. I want you to figure your own questions yourselves, otherwise its of no use.
0
which one?program?

We need to first find the indices of a & b, which takes time of log n + log n (Since it is Balanced BST). Once we find the indices, we can directly print the k elements.

As k can be significantly larger than log n, time complexity is O(log n + k)

by (67 points)

+1 vote