"then why in TIFR question is log*n* is ans, when they asked "`we want to find the number of elements x in the tree that lie between a and b,`

"

Because they have done augmentation.

The Gateway to Computer Science Excellence

+6 votes

In a balanced binary search tree with $n$ elements, what is the worst case time complexity of reporting all elements in range $[a,b]$? Assume that the number of reported elements is $k$.

- $\Theta (\log n)$
- $\Theta (\log n +k)$
- $\Theta (k \log n)$
- $\Theta ( n \log k)$

+7 votes

First, you have to search for the boundary elements $a$ and $b$ in the BST. Since the BST is balanced, that takes $\Theta(\log n)$ time. Traversing the $k$ elements in the range takes an additional $\Theta(k)$ time.

Hence $\Theta(\log n + k)$.

0

"then why in TIFR question is log*n* is ans, when they asked "`we want to find the number of elements x in the tree that lie between a and b,`

"

Because they have done augmentation.

+1

Update: What is below is wrong; I did find an algorithm which works in $O(k + \lg n)$. The accepted explanation is handwavy, but at least the answer is correct. The correct approach: make a recursive call into the left subtree if $k_1$ is less than the current node's key, print the node if its key is in range, and make a recursive call into the right subtree if $k_2$ is greater than the current node's key. This runs in $O(k + \lg n)$.

None of the options seems to be the answer. The best algorithm seems to be:

1. Do an in-order traversal and, whenever you encounter a value in the range $[a,b]$, print it.

Total complexity = $O(n)$

OR

1. Save the keys in an array via an in-order traversal: $O(n)$

2. Binary search for $a$ (say found at index $i$) and for $b$ (say found at index $j$): $O(\lg n)$

3. Traverse the array from index $i$ to $j$: $O(k)$

So $O(n + k + \lg n)$, which is nothing but $O(n)$.

So, the best algorithm is $O(n)$.

Let's check the options:

A. $O(\lg n)$ is too small.

B. $O(\lg n + k)$ is too small as well (since our algorithm does not depend on $k$).

C. $O(k \lg n)$ is too big (since we can do it in $O(n)$ even when $k = n$).

D. $O(n \lg k)$ is too big (since we can do it in $O(n)$).

So, none of the options is correct.

0

I couldn't find any better algorithm that depends on k.

The answer above is wrong.

There is a worse-time-complexity algorithm, though, which runs in $O(k \lg n)$ (mentioned in Erik Demaine's video).

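One way to get such an $O(k \lg n)$ bound is to answer the range query with $k$ successive successor lookups, each costing $O(\lg n)$. A hypothetical Python sketch (the names are mine, not from the thread, and it assumes integer keys so the next query can start at $s + 1$):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def successor_ge(root, x):
    """Smallest key >= x, or None if no such key: O(log n) on a balanced BST."""
    best = None
    while root is not None:
        if root.key >= x:
            best = root.key      # candidate; try to find a smaller one on the left
            root = root.left
        else:
            root = root.right
    return best

def report_range_slow(root, a, b):
    """Report every key in [a, b] via successive successor queries.

    Each of the k reported keys costs one O(log n) search: O(k log n) total.
    """
    out, x = [], a
    while True:
        s = successor_ge(root, x)
        if s is None or s > b:
            return out
        out.append(s)
        x = s + 1                # assumes integer keys (illustration only)
```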
0

The question is asking for the

worst case time complexity of reporting all elements in range [a,b]

so why are you not going for the worst case?

0

Because that's not the worst of the worst algorithms; if you want, you can find an even worse algorithm than $O(k \lg n)$. For example, you can do an $O(n^2)$ sort and then a linear search. That is worse than $O(k \lg n)$.

I hope that is clear.

Anyway, just now I found an algorithm which really works in $O(k + \lg n)$ on a balanced BST. My answer is wrong then; the accepted answer above is correct. I will delete my answer tomorrow.

0

When you add some extra information to an existing data structure, that data structure becomes augmented.

This extra information at a node can be anything: the number of nodes in its left or right subtree, the height of that node, the total number of black-colored nodes in its left and right subtrees, etc.

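As a minimal illustration (the class and field names are my own, not from the comment, and rebalancing is omitted), a BST node augmented with its subtree size could look like this:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None
        self.size = 1            # augmentation: number of nodes in this subtree

def size(node):
    return node.size if node else 0

def insert(root, key):
    """Plain BST insert that also maintains the size field on the way back up."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    root.size = 1 + size(root.left) + size(root.right)
    return root
```

A real balanced tree (red-black, AVL) would update `size` in its rotations as well; the point is only that each node carries one extra integer.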
0

Here is a video explanation, at 7:45.

Algorithm: recurse into the left subtree if $k_1$ is less than the current node's key, print the node if its key is in range, and recurse into the right subtree if $k_2$ is greater than the current node's key. Runs in $O(k + \lg n)$.
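That algorithm can be sketched as follows (a minimal Python sketch; the `Node` layout and names are assumptions, not from the thread):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def report_range(root, k1, k2, out):
    """Collect every key in [k1, k2], in sorted order.

    Work is O(k) for the in-range nodes plus O(log n) for the two
    boundary search paths, i.e. O(k + log n) on a balanced tree.
    """
    if root is None:
        return
    if k1 < root.key:            # left subtree may hold in-range keys
        report_range(root.left, k1, k2, out)
    if k1 <= root.key <= k2:     # the node itself is in range
        out.append(root.key)
    if k2 > root.key:            # right subtree may hold in-range keys
        report_range(root.right, k1, k2, out)
```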

0

Can we not do it like this?

Suppose $(n-2)$ elements lie in the range and only 2 elements lie outside it.

Step 1) We can find the common ancestor of the range query in $\log n$ time.

Step 2) Just eliminate those two elements which are not in the range. This can be done in $O(1)$ time.

Step 3) Then we can say that all elements other than those two are inside the range.

So, isn't it $O(\log n)$ only?

0

Yes, sure. Then if the range contains only one element and it is at the root, we have an $O(1)$ algorithm, nice. Now solve the general case, please.

0

See:

Step 1) We can find the common ancestor of the range query in $\log n$ time.

Step 2) Then, from that node, go to the nodes which satisfy the query. A root-to-leaf walk takes $\log n$ time, so in total it could be $\log n$, or $2\log n$, or $3\log n$.

Step 3) If that takes more time, subtract those nodes which are not in range, which will also take $O(1)$ time.

Right?

+1

@srestha can you tell me how you are going to report all $k$ elements in $O(1)$ time?

As for the TIFR question you mentioned, do read the question carefully. It is given that *at each node, in addition to the key, we store the number of elements in the subtree rooted at that node*, which plays the key role there. When you want to know only the *number* of elements, you find the LCA of the smallest and largest elements in the range, travel along the two boundary paths, and add up the subtree counts stored at each node, as per the question. So: $O(\log n)$ comparisons and $O(\log n)$ additions. You clearly didn't understand that question and just saw $O(\log n)$ and applied it here too.
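A hedged sketch of that counting idea (names are my own; it assumes a size-augmented BST as described, and counts keys in $[a,b]$ with two rank-style boundary walks rather than an explicit LCA step, which gives the same $O(\log n)$ bound):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        # augmentation: number of nodes in this subtree
        self.size = 1 + (left.size if left else 0) + (right.size if right else 0)

def count_le(node, x):
    """Number of keys <= x, using stored subtree sizes: O(log n) when balanced."""
    c = 0
    while node is not None:
        if node.key <= x:
            c += 1 + (node.left.size if node.left else 0)
            node = node.right
        else:
            node = node.left
    return c

def count_lt(node, x):
    """Number of keys strictly < x."""
    c = 0
    while node is not None:
        if node.key < x:
            c += 1 + (node.left.size if node.left else 0)
            node = node.right
        else:
            node = node.left
    return c

def count_in_range(root, a, b):
    """Count keys in [a, b] with two O(log n) walks: no per-element work."""
    return count_le(root, b) - count_lt(root, a)
```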

0 votes

The question asks about the worst case time complexity. The worst case can be achieved by implementing the BST as a singly linked list; here a search takes O(log N) time as the tree is balanced, and for k numbers the time complexity becomes O(k log N). This is what was in my mind while attempting the question in the exam hall.

Option (C).

0

The question is not asking for the worst algorithm you can devise. The question is asking for the best algorithm that works for the worst test case in the best possible time.

0

Where are you getting this "best algorithm that works for the worst test case in the best possible time"?

Question only told "worst case time complexity"

Can u tell me, what that means?

+1

It means what I wrote there, which you have quoted. I would suggest you go through some course lectures, as you seem to be missing the point of why we even use algorithms. Erik Demaine's course is a good one; just search on YouTube. Go through the complexity lecture too (it is part of his course); that also seems completely haywire.

Also, a course on English grammar, if you want to improve in that area. Personally I do not care, but people might not understand what you are trying to say. "Question is only told" is not right.

0

Corrected..

Yes, I have already gone through Erik Demaine's MIT course.

Can you tell me which lecture specifically?

0

Well, go through it once more if you can, and try to answer for yourself why people use algorithms. Do not worry about the exact algorithm, just that one question: why do we even try to find algorithms? What happens when someone finds a better algorithm? Why do we even invent more complex data structures? Life is already hard, so why invent more complex algorithms? Is it only to test students in exams like GATE, or does it have real-life implications? How do we even decide which of two given algorithms is better? Then go on to ask what complexity means: does $O(\lg n)$ mean it always takes $\lg n$ time, or sometimes takes $\lg n$ time?

What is a best case or worst case for an algorithm?

Whenever we study an algorithm, we try to find out which inputs slow it down the most; those are the worst test cases. Similarly, the inputs on which it finishes fastest are the best test cases.

Also, get some real feel by writing code for some of these algorithms: write bubble sort, run it on a million randomly generated numbers, and wait for it to finish. Then write quicksort, run it on a million random numbers, and wait for it to finish. Then write an $O(n!)$ sort and run it on 10 and then 20 random numbers. See the difference; feel the need for a better algorithm. Then you will really understand why no one would ever ask for the "worst algorithm". The worst algorithm has no limits: I can always make any algorithm worse by just putting `while(true);` before it. No one asks for a worse algorithm; that is against the whole point of having an algorithm.

0

Yes, true.

In the question it says "the number of `reported` elements is k."

What does "reported" mean?

It means `inform` those elements , which are between a and b.

right??

0

Again, "It means `inform`

those elements " is completely wrong (grammatically). Nobody informs the elements, they are not living beings.

report = print() on the screen

0

I think "report" is telling about existence, i.e., whether some node exists or not.

It does not want the value. When you say print something, that means we work on its values; but report does not require that.

Right?

0

Again, can you check this code for this tree?

If we want to print the nodes between 30 and 70,

then check this code:

```
while(root != 30) {
    if(root < 30) {
        root = root->right;
    }
    while(root >= 30) {
        root = root->left;
        if(root > 30 AND root <= 70) {
            root = root->left;
            Count++;
        }
        else {
            break;
        }
    }
}
```

0

Why aren't you doing print() instead of Count++?

Also, I don't think this code will ever enter the section where Count++ happens, for the given tree.

A much easier way to check would be to create a tree and test it yourself on your computer.

Here I have created one for you, for the tree you described: https://ideone.com/75Q3We (your code gives a runtime error, maybe because a null check is missing; you will have to debug it. Play with it for a few days, and if possible write a recursive version.)

0

Because either of them is wrong individually; they are correct only when taken together, as $O(\lg n + k)$.

For example: $\lg n$ alone is wrong when there are $n/2$ elements to report,

and $O(k)$ alone is wrong when there is only one element to be reported and it is present at some leaf node.

0

Then is the TIFR question's answer wrong? It says only $\log n$ comparisons and $\log n$ additions.

$\log n$ additions means the time complexity is also $\log n$, right?

0 votes

The idea is to traverse the given binary search tree starting from the root. For every node visited, check whether it lies in the range: if yes, add 1 to the result and recur for both of its children. If the current node is smaller than the low end of the range, recur for the right child; otherwise recur for the left child.

The time complexity of the above program is O(h + k), where h is the height of the BST and k is the number of nodes in the given range.

Here h is log n; hence O(log n + k).

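The traversal described above can be sketched as follows (a minimal Python sketch; the `Node` layout and function name are assumptions, not from the answer):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def count_range(root, low, high):
    """Count keys in [low, high]: O(h + k), h = height, k = the count."""
    if root is None:
        return 0
    if low <= root.key <= high:              # in range: count it, look both ways
        return (1 + count_range(root.left, low, high)
                  + count_range(root.right, low, high))
    if root.key < low:                       # whole left subtree is too small
        return count_range(root.right, low, high)
    return count_range(root.left, low, high)  # whole right subtree is too big
```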