@Sachin Mittal 1 great !

Dark Mode

55 votes

Best answer

For $n$ leaves we have $n-1$ keys in the internal node. (see 'part a' of this question)

Total keys in internal nodes $= n-1,$ each node can have keys between d and 2d.

For $n-1$ keys there will be minimum $\left \lceil \frac{n-1}{2d} \right \rceil$ internal nodes, and maximum $\left \lceil \frac{n-1}{d} \right \rceil$ internal nodes.

To calculate Big-Omega I am taking maximum everywhere.

If every node contains $d+1$ pointers (d keys) then height will be maximum, because number of nodes to be accommodated are fixed $\left (\left \lceil \frac{n-1}{d} \right \rceil \right )$.

If height is $h$ then equation becomes

$\large 1+(d+1)+(d+1)^2+(d+1)^3 +\ldots+(d+1)^{h-1} = \frac{n-1}{d}$

$\implies \frac{(d+1)^h-1}{(d+1)-1} = \frac{n-1}{d}$

$\implies (d+1)^h = n$

$\implies h =\log_{(d+1)}n$

This is the maximum height possible or says the maximum number of levels possible.

Now using $h$ traverse we can get to the leaf node :

Answer is $O(h)$ i.e., $O(\log_{(d+1)}n)\\=O(\log_dn)$

Now using h traverse we can get to leaf node, and then we may need to traverse 'd' more keys if our desired record is present in rightmost of leaf.

Answer is O(h+d) i.e.

O(log(d+1)n+d)=O(logdn+d)

We generally assume that one tree node is stored in one disk block.And hence I suppose the question is asking the number of disk block accesses.

Now, after reaching the last level ,I don't think that we need 'd' more accesses because our required key will be in the last block accessed as we do Binary search here.

So, Access cost (Random Access)= height of tree

Access cost (Sequential Access or range query)= height of tree + no. of leaves (because in worst case we may need to return all the nodes )

9

0

edited
Dec 15, 2017
by Niraj Singh 2

last step is not correct because in question , number of node access is asked so answer will be order of height only.

there is no need of such complex calculation

search complexity will be maximum when height is maximum and for maximum height , each internal node should have minimum children

so starting with root ,

root can have minimum 2 children ( as it is common for all b+ tree )

so at height 0 , we have 1 node

at height 1 , we will have 2 nodes

at height 2 we will have 2*(d+1) nodes

at height 3 we will have 2*(d+1)^{2 }nodes

similarly at height h , we will have 2*(d+1)^{h-1 }children

therefore number of leaves >= 2*(d+1)^{h-1}

i.e n >= 2*(d+1)^{h-1}

^{after simplification }

^{h= O(logd+1(n))}

19

@MiNiPanda how at height 2 we will have 2*(d+1) nodes ????

help me little here. AS EVERY INTERNAL NODE HAS MINIMUM d/2 CHILDREN.

SO FOR TWO INTERNAL NODE d/2+d/2=d

AM I DOING SOME MISTAKE????

0

Here order of B+ tree is defined in a different way

A

B+ - tree of orderdcontains betweendand 2dkeys in each node

Now this "minimum" constraint is not applicable for only the root node. It can have minimum of 1 key value also. That means it can have min. of 2 block pointers.

These 2 block pointers will point to 2 nodes in the next level.

From this level the minimum constraint is to be strictly followed. So each node should have min. of (d+1) block pointers. So 2 nodes each having (d+1) block pointers can point to 2(d+1) nodes in the next level and so on..

3

Worst case will occur when a sequential query is given which covers all keys present at leaf.

In this case we have to first come at the leftmost node at leaf level, and then read all keys present in all n leaf nodes in a sequential manner thereby requiring n block accesses of n leaf nodes

So, total Access Cost becomes $O(log_{d+1}n+n)=O(n)$

In this case we have to first come at the leftmost node at leaf level, and then read all keys present in all n leaf nodes in a sequential manner thereby requiring n block accesses of n leaf nodes

So, total Access Cost becomes $O(log_{d+1}n+n)=O(n)$

9

0

0

0

I think we should take floor value $\left \lfloor \frac{n-1}{d} \right \rfloor$ instead of ceil value $\left \lceil \frac{n-1}{d} \right \rceil$

suppose, d=3 & n=11

=> $\left \lceil \frac{11-1}{3} \right \rceil$ = $\left \lceil \frac{10}{3} \right \rceil$ = 4 nodes

=> nodes are :- __3key__ __3key__ __3key__ __1key__ → this 1key node is not allowed as min. d=3 keys per node allowed.

=> $\left \lfloor \frac{11-1}{3} \right \rfloor$ = $\left \lfloor \frac{10}{3} \right \rfloor$ = 3 nodes

=> nodes are :- __3key__ __3key__ 4__key__ → this is allowed as d to 2d keys per node allowed.

1

0

0

@Pranavpurkar B+ Tree grows in breadth not in depth unlike BST. Hence, at any given condition it will be $O(log_dn)$

1

edited
Sep 29
by Pranavpurkar

yes ! so at each level all the nodes will be filled and thus at each level **2^l [l=level no.] **nodes right?

and while searching for any particular node we will go on searching all the way down to one of the leaf as in the last level all the nodes are in a list so then constant time to get the desired node.

is it correct ?

0

0

While searching, from top only, we compare and go down. We know that in a B+ tree to search a key value $K$:-

- We start from the root, look for the largest key value $(K_i)$ in the node $<=$ $K$
- Follow the pointer $P_{i+1}$ to next value until we reach the leaf node.
- If $K$ is found = $K_i$ in the leaf, follow $P_{ri}$ to search record.

Now at every node since the elements inside it are sorted, we can perform binary search.

1

6 votes

if u think i am wrong at some point plz comment.

i think the first part. they have given their own definition of b+ tree.

no considering their definition . if d=1

then minimum no of nodes should be 1 and maximum should be 2 . which we can see is not the case as 4 keys are their in one node.

d=2

minimum =2 max =4

satisfy the condition .

d=3

one underflow case exists which means the value of d is 2. now insert it according order 5 tree.

now the second part. number of acess will be order of level . as i have to access every level once till the leafs . as the record pointer is always present in leafs.

so to find worst case complexity the tree should be of maximum height so that the number of level can be full.

By putting maximum.

Height nodes

0 1

1 2d+1

2 (2d+1)^2

k (2d+1)^k

n = (2d+1)^k

as n are leaf nodes which will result in k = log n base (2d+1)

i think the first part. they have given their own definition of b+ tree.

no considering their definition . if d=1

then minimum no of nodes should be 1 and maximum should be 2 . which we can see is not the case as 4 keys are their in one node.

d=2

minimum =2 max =4

satisfy the condition .

d=3

one underflow case exists which means the value of d is 2. now insert it according order 5 tree.

now the second part. number of acess will be order of level . as i have to access every level once till the leafs . as the record pointer is always present in leafs.

so to find worst case complexity the tree should be of maximum height so that the number of level can be full.

By putting maximum.

Height nodes

0 1

1 2d+1

2 (2d+1)^2

k (2d+1)^k

n = (2d+1)^k

as n are leaf nodes which will result in k = log n base (2d+1)

2 votes

as given **min d keys **at each node so assuming **atleast d+1 child for each ( root too) **

as they have given their own definition of degree

so (d+1)^ h = n

thus,** h = log n / log (d+1) **

as given 2d keys at each node so assuming 2d+1 child for each( root too).

if 2d+1 child at each then at height h (2d+1)^h = n

thus, **h = log n / log (2d+1 ) **

in both cases - o(log n ) ( as d will be small )

am i correct ?

1 vote

For n leaves, there must be exactly n-1 keys distributed in the internal nodes.

As per the question, each node can have keys between d and 2d. Therefore for n-1 keys there can be minimum $\lceil$$\dfrac{n-1}{2d}$$\rceil$ internal nodes and maximum $\lceil$$\dfrac{n-1}{d}$$\rceil$ internal nodes.

To determine upper bound on the number of nodes accessed during the search, let us take the maximum case.

In this $B^+$-tree every level will contain nodes in the powers of (d+1), so the equation of height is,

1 + (d+1) + $(d+1)^2$ + $(d+1)^2$ +.....+ $(d+1)^{h-1}$ = $\dfrac{n-1}{d}$

$\implies$ $\dfrac{(d+1)^{h}-1}{(d+1)-1}$ = $\dfrac{n-1}{d}$

$\implies$ $(d+1)^h$ = n

$\implies$ h = $log_{(d+1)}n$

Now using h, leaf node containing key being searched can be reached. Hence the number of nodes accessed during the search is O($log_{(d+1)}n$)

As per the question, each node can have keys between d and 2d. Therefore for n-1 keys there can be minimum $\lceil$$\dfrac{n-1}{2d}$$\rceil$ internal nodes and maximum $\lceil$$\dfrac{n-1}{d}$$\rceil$ internal nodes.

To determine upper bound on the number of nodes accessed during the search, let us take the maximum case.

In this $B^+$-tree every level will contain nodes in the powers of (d+1), so the equation of height is,

1 + (d+1) + $(d+1)^2$ + $(d+1)^2$ +.....+ $(d+1)^{h-1}$ = $\dfrac{n-1}{d}$

$\implies$ $\dfrac{(d+1)^{h}-1}{(d+1)-1}$ = $\dfrac{n-1}{d}$

$\implies$ $(d+1)^h$ = n

$\implies$ h = $log_{(d+1)}n$

Now using h, leaf node containing key being searched can be reached. Hence the number of nodes accessed during the search is O($log_{(d+1)}n$)