2.8k views

A program takes as input a balanced binary search tree with $n$ leaf nodes and computes the value of a function $g(x)$ for each node $x$. If the cost of computing $g(x)$ is: $$\Large \min \left ( \substack{\text{number of leaf-nodes}\\\text{in left-subtree of x}}\;,\; \substack{\text{number of leaf-nodes}\\\text{in right-subtree of x}}\right )$$

Then the worst-case time complexity of the program is?

1. $\Theta (n)$

2. $\Theta (n \log n)$

3. $\Theta(n^2)$

4. $\Theta (n^2\log n)$

retagged | 2.8k views
This question is giving me a severe headache :P

Can anyone answer it in detail along with a few examples ?

It is balanced binary search tree given .

In simple words

we have total of logn levels (because tree is balanced) and cost at every level is O(n).

Hence total cost is= n*logn =O(nlogn)

how is cost of each level n?

if we use simple divide and conquer technique(without using extra space)

we first need to count no. of leaf in left sub-tree, then no. of leaf in right sub-tree + compare them to find minimum.

T(n)=2T(n/2)+1   that gives complexity as O(n)

but we need to compute this for every node.

so, cost at 0 th level=O(n)

cost at 1st level=n/2+n/2=O(n)

cost at 2nd level=n/4+n/4+n/4+n/4=O(n)

we have total of logn levels (because tree is balanced) and cost at every level is O(n).

Hence total cost is  O( n log n)

@ Bikram
Thank you
It could be solved in O(N) with O(N) extra memory. But O(NlogN) without extra memory.
@ chhotu

Can you explain bit more

B. At the root node (first level) the cost would be $\Theta\left(\frac{n}{2}\right)$ as the tree is balanced.

At next level, we have 2 nodes and for each of them cost of computing $g(x)$ will be $\Theta\left(\frac{n}{4}\right)$. So, total cost at second level $= \Theta\left(\frac{n}{2}\right).$ Similarly at each level (total cost per level and not the cost per node in a level) the cost would be $\Theta\left(\frac{n}{2}\right)$ and so for $\log n$ levels it would be $\Theta({n} \log n).$

PS: Even if we change $\min$ to $\max$ in the defintion of $g(x)$ we get the same answer.

selected by
Shouldn't the cost would be n/2 then n/4 and so on ....
thats the cost per level- not the cost per node in a level.
clear now?
how is the cost for root node n/2 and the nodes at the next level n/4 per node and so on.. ?
* In the given answer it is considered as the tree has n-nodes(for genrality) instead 2n-1 nodes as given in the original question, so play accordingly.
//just a remark as it took a quite a time for me to find out how the first level has theta(n/2) cost while having

2n-1 nodes.
The number of levels in a balanced tree having n leaf nodes is $log(n)+1$ right? Then multiplying it with $n/2$ will give us $\mathit{\Theta (nlogn + n)}$. But $nlogn>n$, so answer is $\mathit{\Theta (nlogn)}$

Do a post order traversal and store and return $\min \left( \text{g}(x \rightarrow left), \text{g}(x \rightarrow right ) \right )$ for all non leaf nodes and store $0$ for all leaf nodes and return 1. BST being balanced, with n leaf nodes we can have total 2n nodes and complexity of tree traversal is linear in number of nodes- $\Theta(n)$.

But this is just computing the time complexity of $g(x)$ for each node- not exactly what is asked in question.

Actually the procedure I gave is computing the COST of computing the value of $g(x)$ which would have been correct had the question been defined as

$g(x) = \Large \min \left ( \substack{\text{number of leaf-nodes}\\ \text{in left-subtree of x}}\;,\; \substack{\text{number of leaf-nodes}\\\text{in right-subtree of x}}\right )$.

The correct answer for this would be $\Theta(n \log n)$ as at each level, the cost is $\Theta(n)$ and we have $\log n$ levels since the tree is balanced.

what is the reason for returning 1 for leaf nodes plz explain Sir
because we are counting the no. of leaf nodes- each of them add 1 to this count.
Sir, in the question it is mentioned

A program takes as input a balanced binary search tree with n leaf nodes and computes the value of a function g(x) for each node x.
here g(x) is mentioned as a function which takes the time complexity which you answered as Θ(n) to calculate..
we need to apply the function at every node so, we start with node which would take Θ(n), then two child which would take Θ(n/2)=Θ(n).. and so on until the leaves
here we have n leave so, total log n levels.. so we get Θ(n log n). Shouldnt, it be?
@Arjun, here g(x) is computed for every node, It seems like O(nlogn) is more correct. Not sure why you are saying O(n).

$n$ = number of leaf nodes

@Akash, abhilash:

The complexity for computing $g(x)$ for any given node is $O(1)$ because we already have the required values for the left and right subtree.

(provided we apply the algorithm in a way that a parent is never evaluated before its child - a post-order fashion as Arjun described would do.)

Summing up those $O(1)$ computations for $O(n)$ nodes gives us a net complexity of $O(n)$ for the overall algorithm.

The question is still ambiguous as it just mentions "worst case" without mentioning the "optimal implementation"

If I were to answer this in GATE, I would mark $O(n)$, assuming that the question is talking about the "optimal implementation", simply because there is no upperbound for a sub-optimal implementation!

@ Pragy, i got the point.. thanks :)

x

@Abhilash:

No, the complexity of computing $g(x)$ is $O(1)$ and not $O(n)$. Read the answer and my comment again.

The complexity of applying $g(x)$ to $1$ node is $O(1)$, and we have $O(n)$ nodes, so in total, from start to finish, adding $g(x)$ for every node, full and final, no hidden conditions, we get $O(n)$ time.

@Pragy, little bit confused. What if algorithm is written in a way that for every node it always takes min(no of leaf nodes in Left Subtree, min of nodes in Right subtree ) and does not depend on what value we computed for it's left and right child already ! Can you prove that such algorithm doesn't exists ?

min(number of leaf-nodesin left-subtree of x,number of leaf-nodesin right-subtree of x)

@Akash: That's totally possible! But what if the algorithm tries to solve the Halting problem before computing the number of leaves in the left subtree. What if it tried to become the PM of India before finding the leaves in the right subtree?

There is no upperbound to the time complexity of a sub-optimal algorithm. If you're given an exact algorithm, you compute the time for it. If you're left with the choice to implement the algorithm as you like, you always consider the optimal implementation.

PS: this question isn't about finding the time complexity at all. This question tests whether the student can use the bottom-up/dynamic programming  approach to implement the optimal algorithm.
@Pragy Actually the procedure I gave is computing the COST of computing the value of $g(x)$ which would have been correct had the question been defined as

$g(x) = \Large \min \left ( \substack{\text{number of leaf-nodes}\\ \text{in left-subtree of x}}\;,\; \substack{\text{number of leaf-nodes}\\\text{in right-subtree of x}}\right )$

But here it is given the cost of computing $g(x)$.
Oh. I totally missed that! That was a blunder indeed. The answer will be $O(n\log n)$ then?
yes..
@Arjun sir confused between O(n) and O(nlogn) ???

One postorder travesal on 2n nodes sud take O(n) is n't it??

And they metioned in question that to find g(x) for each node..then why we r finding level by level...help sir..
Here

$g(x) = \Large \min \left ( \substack{\text{number of leaf-nodes}\\ \text{in left-subtree of x}}\;,\; \substack{\text{number of leaf-nodes}\\\text{in right-subtree of x}}\right )$

if it will be cost of computing VALUE of g(x) then answer will be O(N), because here min() is calculating value among right subtree and left subtree.

But here we are calculating g(x) , i.e. height of the tree. Complexity to measure height of a tree is O(log N), and N such node are there

So, it will be $\Theta (n log n)$

Am I rt?
@srestha i think g(x) is not calculating here height.

And we can compute g(x) while doing some modified Postorder traversal .
We are doing post order traversal because here we are finding leaves. But if we do only post order traversal here then time taken will be $\Theta (n)$ because then we keep track of all nodes.

i.e. why @Arjun sir told it is not VALUE of g(x) what we are not calculating here.

We are calculating g(x) here.

Then what this g(x) is?

It is a comparison between two values. Other than height how we could get $log n$ complexity in comparison of two values
I think (logn) they meant is:-

Total n leaf nodes given in question so max logn levels possible.
The above discussion telling that to calculate g(x) we need O(n) time.

And logn levels are there so O(nlogn) time.

But this is not digested by me??

Reason1.:--why calculating level by level bcz they ask to calculate @ each node.

Reason2:- why not possible to calculate  g(x) simultaneously while  doing postorder traversal.?? So ans may be O(n).
Well we are free to do in any way. But in the end, we must calculate, $g(x)$ for each node. And we don't know what is $g(x)$ but know the cost of computation. So, all we can do is to add up this cost for the entire tree.

1.if we use simple divide and conquer technique(without using extra space)

we first need to count no. of leaf in left subtree,then no. of leaf in right subtree+compare them to find minimum.

T(n)=2T(n/2)+1----->O(n)

but we need to compute this for every node.

so, cost at 0 th level=O(n)

cost at 1st level=n/2+n/2=O(n)

cost at 2nd level=n/4+n/4+n/4+n/4=O(n)\

we have total of logn levels (because tree is balanced) and cost at every level is O(n).

Hence total cost is=n*logn=O(nlogn)

2.if we use extra memory to store no of leaf at left subtree and at right subtree for every node

we don't need to go to last level for every node.

1.compute for every leaf node.

2.then use those values to compute at all nodes at (h-1) level till root.

so, time complexity of this method is equal to O(n),since we are calculating value for each node and doing only one comparison.

space complexity is also O(n),since we are using extra memory for every node.

since question is asking about worst case time complexity,so we have O(nlogn) .

Let us assume a balanced binary tree containing n leaf nodes. We have to compute g(x) corresponding to every node x and the cost of computing g(x) is

min(number of leaf-nodes in left-subtree of x,number of leaf-nodes in right-subtree of x).

we start from root and compute g(x) corresponding to every node.g(root) = min(n/2 , n/2) = Θ(n/2) =Θ(n).

Cost at level 1 :Θ(n).

Cost corresponding to second level :

g(left_node)= min(n/4,n/4)=n/4

g(right_node)=min(n/4,n/4)=n/4

Total cost at level 2 :Θ(n/4 + n/4) =Θ(n/2)=Θ(n)

Similarly total cost at level 3 : Θ(n/8 + n/8 +n/8 + n/8)=Θ(n/2)=Θ(n)

...There are totally log n levels since it is a balanced binary search tree and cost at every level is Θ(n).

There fore time complexity  of the program is Θ(n) . log n =Θ(n logn).

edited

we have total of logn levels (because tree is balanced) and cost at every level is O(n).

Hence total cost is= n*logn =O(nlogn)

thank you sir . analysed my mistake and got your point.

ok sir