1,755 views
For a $B^+$ - tree of order $d$ with $n$ leaf nodes, the number of nodes accessed during a search is $O(\_)$.

For $n$ leaves we have $n-1$ keys in the internal node. (see 'part a' of this question)

Total keys in internal nodes $= n-1,$ each node can have keys between d and 2d.

For $n-1$ keys there will be minimum $\left \lceil \frac{n-1}{2d} \right \rceil$ internal nodes, and maximum $\left \lceil \frac{n-1}{d} \right \rceil$ internal nodes.

To calculate Big-Omega I am taking maximum everywhere.

If every node contains $d+1$ pointers (d keys) then height will be maximum, because number of nodes to be accommodated are fixed $\left (\left \lceil \frac{n-1}{d} \right \rceil \right )$.

If height is $h$ then equation becomes

$\large 1+(d+1)+(d+1)^2+(d+1)^3 +\ldots+(d+1)^{h-1} = \frac{n-1}{d}$
$\implies \frac{(d+1)^h-1}{(d+1)-1} = \frac{n-1}{d}$
$\implies (d+1)^h = n$
$\implies h =\log_{(d+1)}n$

This is the maximum height possible or says the maximum number of levels possible.

Now using $h$ traverse we can get to the leaf node :

Answer is $O(h)$ i.e., $O(\log_{(d+1)}n)\\=O(\log_dn)$

@Sachin Mittal 1 great !

Now using h traverse we can get to leaf node, and then we may need to traverse 'd' more keys if our desired record is present in rightmost of leaf.

We generally assume that one tree node is stored in one disk block.And hence I suppose the question is asking the number of disk block accesses.

Now, after reaching the last level ,I don't think that we need 'd' more accesses because our required key will be in the last block accessed as we do Binary search here.

So, Access cost (Random Access)= height of tree

Access cost (Sequential Access or range query)= height of tree + no. of leaves (because in worst case we may need to return all the nodes )

(A B+B+ - tree of order dd contains between dd and 2d2d keys in each node) please explain this part
@vs

I am also thinking the same.

@sachin sir

edited

last step is not correct because in question , number of node access is asked so answer will be order of height only.

there is no need of such complex calculation

search complexity will be maximum when height is maximum and for maximum height , each internal node should have minimum children

so starting with root ,

root can have minimum 2 children ( as it is common for all b+ tree )

so at height 0 , we have 1 node

at height 1 , we will have 2 nodes

at height 2 we will have 2*(d+1) nodes

at height 3 we will have 2*(d+1)2   nodes

similarly at height h , we will have 2*(d+1)h-1  children

therefore number of leaves >= 2*(d+1)h-1

i.e n >= 2*(d+1)h-1

after simplification

h= O(logd+1(n))

Niraj Singh 2 Yes i am also getting the same :/

@MiNiPanda how  at height 2 we will have 2*(d+1) nodes ????

help me little here. AS EVERY INTERNAL NODE HAS MINIMUM d/2 CHILDREN.

SO FOR TWO INTERNAL NODE d/2+d/2=d

AM I DOING SOME MISTAKE????

Here order of B+ tree is defined in a different way

B+ - tree of order d contains between d and 2d keys in each node

Now this "minimum" constraint is not applicable for only the root node. It can have minimum of 1 key value also. That means it can have min. of 2 block pointers.

These 2 block pointers will point to 2 nodes in the next level.

From this level the minimum constraint is to be strictly followed. So each node should have min. of (d+1) block pointers. So 2 nodes each having (d+1) block pointers can point to 2(d+1) nodes in the next level and so on..

@MiNiPanda thanku :)

Nice.explanation @sachin mittal sir
Worst case will occur when a sequential query is given which covers all keys present at leaf.

In this case we have to first come at the leftmost node at leaf level, and then read all keys present in all n leaf nodes in a sequential manner thereby requiring n block accesses of n leaf nodes

So, total Access Cost becomes $O(log_{d+1}n+n)=O(n)$
sir as we are taking the maximum so the max key in B+ tree is 2d  than we can have 2d+1 pointer why d+1?
For considering maximum height and thus maximum access cost you have to fill nodes as minimum as possible
"For n−1 keys there will be minimum ⌈n−1/2d⌉ internal nodes, and maximum ⌈n−1/d⌉ internal nodes."

I think we should take floor value $\left \lfloor \frac{n-1}{d} \right \rfloor$ instead of ceil value $\left \lceil \frac{n-1}{d} \right \rceil$

suppose, d=3 & n=11

=>  $\left \lceil \frac{11-1}{3} \right \rceil$ = $\left \lceil \frac{10}{3} \right \rceil$ = 4 nodes

=> nodes are :-  3key  3key  3key  1key → this 1key node is not allowed as min. d=3 keys per node allowed.

=>  $\left \lfloor \frac{11-1}{3} \right \rfloor$ =  $\left \lfloor \frac{10}{3} \right \rfloor$ = 3 nodes

=> nodes are :-  3key  3key  4key → this is allowed as d to 2d keys per node allowed.

What if you apply binary search instead of sequential search because at leaf level data is in sorted form then complexity must be goes down to

O( log(d)n + log(2)n )

@Default its node i.e linked list. Binary search not possible
why we are not considering the worst case here? i.e O(n) ?

pls explain!

@Pranavpurkar B+ Tree grows in breadth not in depth unlike BST. Hence, at any given condition it will be $O(log_dn)$

edited

Abhrajyoti00

yes ! so at each level all the nodes will be filled and thus at each level 2^l [l=level no.] nodes right?

and while searching for any particular node we will go on searching all the way down to one of the leaf as in the last level all the nodes are in a list so then constant time to get the desired node.

is it correct ?

if this is correct then my doubt is:

when we reach one of the leaf nodes and the element we are searching for is the last element in the list so ultimately we have to traverse all the elements which takes O(n) time as total n elements in leaf.

While searching, from top only, we compare and go down. We know that in a B+ tree to search a key value $K$:-

1. We start from the root, look for the largest key value $(K_i)$ in the node $<=$ $K$
2. Follow the pointer $P_{i+1}$ to next value until we reach the leaf node.
3. If $K$ is found = $K_i$ in the leaf, follow $P_{ri}$ to search record.

Now at every node since the elements inside it are sorted, we can perform binary search.

got it brother . Thanks  :)

if u think i am wrong at some point plz comment.

i think the first part. they have given their own definition of b+ tree.

no considering their definition . if d=1

then minimum no of nodes should be 1 and maximum should be 2 . which we can see is not the case as 4 keys are their in one node.

d=2

minimum =2 max =4

satisfy the condition .

d=3

one underflow case exists which means the value of d is 2. now insert it according order 5 tree.

now the second part.  number of acess will be order of level . as i have to access every level once till the leafs . as the record pointer is always present in leafs.

so to find worst case complexity the tree should be of maximum height so that the number of level can be full.

By putting maximum.

Height        nodes
0                 1

1                2d+1

2               (2d+1)^2

k               (2d+1)^k

n = (2d+1)^k

as n are leaf nodes which will result in k = log n base (2d+1)
by

### 1 comment

that finding 'd' part is really helpful..thanks:)
as given min d keys at each node so assuming atleast d+1 child for each ( root too)
as they have given their own definition of degree
so (d+1)^ h = n
thus, h = log n / log (d+1)

or (if considering max child at each node )

as given 2d keys at each node so assuming 2d+1 child for each( root too).
if 2d+1 child at each then at height h (2d+1)^h = n
thus, h = log n / log (2d+1 )
in both cases - o(log n ) ( as d will be small )
am i correct ?
For n leaves, there must be exactly n-1 keys distributed in the internal nodes.
As per the question, each node can have keys between d and 2d. Therefore for n-1 keys there can be minimum $\lceil$$\dfrac{n-1}{2d}$$\rceil$ internal nodes and maximum $\lceil$$\dfrac{n-1}{d}$$\rceil$ internal nodes.

To determine upper bound on the number of nodes accessed during the search, let us take the maximum case.

In this $B^+$-tree every level will contain nodes in the powers of (d+1), so the equation of height is,
1 + (d+1) + $(d+1)^2$ + $(d+1)^2$ +.....+ $(d+1)^{h-1}$ = $\dfrac{n-1}{d}$
$\implies$ $\dfrac{(d+1)^{h}-1}{(d+1)-1}$ = $\dfrac{n-1}{d}$
$\implies$ $(d+1)^h$ = n
$\implies$ h = $log_{(d+1)}n$

Now using h, leaf node containing key being searched can be reached. Hence the number of nodes accessed during the search is O($log_{(d+1)}n$)

wat is the difference between u and sachin's answer ??