2.5k views

Consider the $B^+$-tree of order $d$ shown in the figure. (A $B^+$-tree of order $d$ contains between $d$ and $2d$ keys in each node.)

1. Draw the resulting $B^+$-tree after $100$ is inserted in the figure below.
2. For a $B^+$-tree of order $d$ with $n$ leaf nodes, the number of nodes accessed during a search is $O(-)$.

+4 Replace with this one.

+2
I think the answer to part (b) is O(n).
0
It will be O(n) because in a B tree we have to search for the node starting from the leaf nodes, right?
0

We can correlate with this question https://gateoverflow.in/8052/gate2015-2-6

+1
Is the given B+ tree correct? Every key in a non-leaf node should also be present in a leaf, right? I mean, under the right pointer of 93, the key 93 should be present in a leaf, as it is a B+ tree.
0

@ even 38 is also not there

For $n$ leaves we have $n-1$ keys in internal node. (see 'part a' of this question)

Total keys in internal nodes $= n-1,$ each node can have keys between d and 2d.

For $n-1$ keys there will be minimum $\left \lceil \frac{n-1}{2d} \right \rceil$ internal nodes, and maximum $\left \lceil \frac{n-1}{d} \right \rceil$ internal nodes.
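
As a quick numeric check of these two bounds, assume for illustration $n = 13$ leaves and $d = 2$ (values chosen here, not taken from the question's figure):

$$n - 1 = 12, \qquad \left\lceil \frac{12}{2d} \right\rceil = \left\lceil \frac{12}{4} \right\rceil = 3, \qquad \left\lceil \frac{12}{d} \right\rceil = \left\lceil \frac{12}{2} \right\rceil = 6$$

So, by the formulas above, such a tree would have between 3 and 6 internal nodes.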

To calculate the upper bound (Big-O), I am taking the maximum everywhere.

If every node contains $d+1$ pointers ($d$ keys), then the height will be maximum, because the number of nodes to be accommodated is fixed at $\left\lceil \frac{n-1}{d} \right\rceil$.

If height is $h$ then equation becomes

$\large 1+(d+1)+(d+1)^2+(d+1)^3 +\ldots+(d+1)^{h-1} = \frac{n-1}{d}$
$\implies \frac{(d+1)^h-1}{(d+1)-1} = \frac{n-1}{d}$
$\implies (d+1)^h = n$
$\implies h =\log_{(d+1)}n$

This is the maximum height possible or say maximum number of levels possible.

Now, traversing $h$ levels gets us to the leaf node, and then we may need to scan $d$ more keys if the desired record is at the rightmost position of the leaf.

Answer is $O(h+d)$, i.e., $O(\log_{(d+1)}n+d) = O(\log_d n+d)$.
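
The geometric-sum step in this derivation can be checked numerically. The sketch below is only an illustration under the same assumption the answer makes (every internal node, root included, holds exactly $d$ keys and $d+1$ pointers); the function name `leaves_from_min_fill` is made up for this example.

```python
def leaves_from_min_fill(d, h):
    """Leaf count of a tree with h internal levels, every internal node holding d keys.

    Assumption (same as the derivation above): the root also has d + 1 pointers.
    """
    # Internal nodes per level: 1, (d+1), (d+1)^2, ..., (d+1)^(h-1)
    internal_nodes = sum((d + 1) ** level for level in range(h))
    internal_keys = d * internal_nodes
    # n leaves are separated by exactly n - 1 keys in the internal nodes
    return internal_keys + 1


# The closed form (d+1)^h = n should hold for every d and h tried here.
for d in (1, 2, 5):
    for h in range(1, 6):
        assert leaves_from_min_fill(d, h) == (d + 1) ** h
```

If the loop finishes without an assertion error, the identity $(d+1)^h = n$ holds for the sampled values, which is what the step from the sum to $h = \log_{(d+1)} n$ relies on.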

0

@Sachin Mittal 1 great !

+4

Now using h traverse we can get to leaf node, and then we may need to traverse 'd' more keys if our desired record is present in rightmost of leaf.

We generally assume that one tree node is stored in one disk block. Hence I suppose the question is asking for the number of disk block accesses.

Now, after reaching the last level, I don't think we need $d$ more accesses, because the required key will be in the last block accessed, and we do a binary search within it.

So, access cost (random access) = height of tree

Access cost (sequential access or range query) = height of tree + number of leaves (because in the worst case we may need to return all the leaf nodes)
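
A small sketch of the two access patterns described above, counting one access per node (block) touched. Everything here is hypothetical illustration: the `Node` class, the helper names, and the tiny demo tree are assumptions, not the tree from the question's figure.

```python
from bisect import bisect_left, bisect_right


class Node:
    """One tree node, assumed to occupy one disk block."""

    def __init__(self, keys, children=None):
        self.keys = keys            # sorted keys stored in this node
        self.children = children    # list of child nodes, or None for a leaf
        self.next_leaf = None       # right-sibling pointer (used by leaves only)


def descend(root, key):
    """Walk from the root to the leaf that may hold key; count nodes touched."""
    node, accessed = root, 1
    while node.children is not None:
        # Convention assumed here: keys >= separator go to the right child.
        node = node.children[bisect_right(node.keys, key)]
        accessed += 1
    return node, accessed


def point_lookup(root, key):
    """Random access: only the nodes on one root-to-leaf path are touched."""
    leaf, accessed = descend(root, key)
    i = bisect_left(leaf.keys, key)  # binary search inside the already-read block
    return i < len(leaf.keys) and leaf.keys[i] == key, accessed


def range_scan(root, lo, hi):
    """Range query: descend once, then follow leaf sibling pointers."""
    leaf, accessed = descend(root, lo)
    out = []
    while leaf is not None:
        out.extend(k for k in leaf.keys if lo <= k <= hi)
        if leaf.keys and leaf.keys[-1] > hi:
            break                    # past the end of the range
        leaf = leaf.next_leaf
        if leaf is not None:
            accessed += 1
    return out, accessed


def build_demo_tree():
    """Hypothetical two-level tree: one root over four linked leaves."""
    leaves = [Node([1, 2]), Node([3, 4]), Node([5, 6]), Node([7, 8])]
    for left, right in zip(leaves, leaves[1:]):
        left.next_leaf = right
    return Node([3, 5, 7], leaves)


root = build_demo_tree()
print(point_lookup(root, 6))
print(range_scan(root, 1, 8))
```

On this demo tree the point lookup touches 2 nodes (root and one leaf), while the scan over the whole key range touches 5 (the root plus all 4 leaves), matching "height" versus "height + number of leaves".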

0
(A $B^+$-tree of order $d$ contains between $d$ and $2d$ keys in each node) Please explain this part.
0
@vs

I am also thinking the same.

@sachin sir

+4

The last step is not correct: the question asks for the number of nodes accessed, so the answer will be of the order of the height only.

There is no need for such a complex calculation.

The search cost is maximum when the height is maximum, and for maximum height each internal node should have the minimum number of children.

So, starting with the root:

the root can have a minimum of 2 children (this is common to all B+ trees).

So at height 0, we have $1$ node.

At height 1, we will have $2$ nodes.

At height 2, we will have $2(d+1)$ nodes.

At height 3, we will have $2(d+1)^2$ nodes.

Similarly, at height $h$, we will have $2(d+1)^{h-1}$ nodes.

Therefore, number of leaves $\geq 2(d+1)^{h-1}$,

i.e., $n \geq 2(d+1)^{h-1}$.

After simplification,

$h = O(\log_{d+1} n)$
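
For reference, the simplification can be spelled out as follows (assuming $d \geq 1$ and $n \geq 2$):

$$n \geq 2(d+1)^{h-1} \implies (d+1)^{h-1} \leq \frac{n}{2} \implies h \leq 1 + \log_{d+1}\frac{n}{2} = O(\log_{d+1} n)$$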

+1

Niraj Singh 2 Yes i am also getting the same :/

0

@MiNiPanda how do we have 2*(d+1) nodes at height 2?

Help me a little here. As every internal node has a minimum of d/2 children,

for two internal nodes we get d/2 + d/2 = d.

Am I making a mistake?

+1

Here the order of the B+ tree is defined in a different way:

a B+ tree of order d contains between d and 2d keys in each node.

Now this "minimum" constraint applies to every node except the root. The root can have as few as 1 key, which means it can have a minimum of 2 block pointers.

These 2 block pointers will point to 2 nodes in the next level.

From this level on, the minimum constraint has to be strictly followed, so each node should have a minimum of (d+1) block pointers. So 2 nodes, each having (d+1) block pointers, can point to 2(d+1) nodes in the next level, and so on.
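
The level-by-level counting explained above can be sketched in a few lines. This is only an illustration of the minimum-fill argument (root with 2 pointers, every other internal node with $d+1$ pointers); the function names are invented for this example.

```python
def min_nodes_per_level(d, h):
    """Minimum node count at heights 0..h: 1, 2, 2(d+1), 2(d+1)^2, ..."""
    counts = [1]                     # height 0: the root alone
    for level in range(1, h + 1):
        if level == 1:
            counts.append(2)         # root may hold 1 key, i.e. 2 pointers
        else:
            counts.append(counts[-1] * (d + 1))  # every other node: at least d+1 pointers
    return counts


def max_height(n_leaves, d):
    """Largest h such that 2(d+1)^(h-1) <= n_leaves (0 if the root is itself the leaf)."""
    if n_leaves < 2:
        return 0
    h = 1
    while 2 * (d + 1) ** h <= n_leaves:
        h += 1
    return h


print(min_nodes_per_level(2, 3))
print(max_height(18, 2), max_height(17, 2))
```

With $d = 2$, the minimum counts per level come out as 1, 2, 6, 18, so 18 leaves allow height 3 while 17 leaves allow only height 2.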

0

@MiNiPanda thanku :)

0
Nice.explanation @sachin mittal sir
+1
Worst case will occur when a sequential query is given which covers all keys present at leaf.

In this case we first have to reach the leftmost node at the leaf level, and then read all keys present in all n leaf nodes sequentially, thereby requiring n block accesses for the n leaf nodes.

So, the total access cost becomes $O(\log_{d+1}n+n)=O(n)$
0
Sir, since we are taking the maximum and the maximum number of keys in a B+ tree node is 2d, we can have 2d+1 pointers. Why d+1?
0
To get the maximum height, and thus the maximum access cost, you have to fill the nodes as little as possible.
0
Have you noticed that the given B+ tree is not correct?
+1
How is 2d = 4 here?
0
Here 2d is the maximum number of keys, and in the figure the maximum number of keys present in a node is 4, so 2d = 4, I guess... maybe I'm sure. :P
0
In the given B+ tree, keys 38 and 93 are not present in the last level (leaf nodes). Why is that so?
0

Bikram sir has missed some values in this B+ tree.

If you think I am wrong at some point, please comment.

I think, for the first part, they have given their own definition of a B+ tree.

Now, considering their definition: if d = 1,

then the minimum number of keys per node should be 1 and the maximum should be 2, which we can see is not the case, as there are 4 keys in one node.

d = 2:

minimum = 2, maximum = 4,

which satisfies the condition.

d = 3:

an underflow case exists, which means the value of d is 2. Now insert the key as in an order-5 tree.

Now the second part. The number of accesses will be of the order of the number of levels, as I have to access every level once down to the leaves, since the record pointers are always present in the leaves.

So, to find the worst-case complexity, the tree should be of maximum height, so that the number of levels is as large as possible.

Taking the maximum number of children per node:

Height    Nodes
0         $1$
1         $2d+1$
2         $(2d+1)^2$
k         $(2d+1)^k$

$n = (2d+1)^k$

as $n$ is the number of leaf nodes, which results in $k = \log_{(2d+1)} n$
edited by
As a minimum of d keys is given for each node, assume at least d+1 children for each node (root too),
as they have given their own definition of degree.
So $(d+1)^h = n$,
thus $h = \log n / \log (d+1)$.

Or (if considering the maximum number of children at each node):

as 2d keys are given for each node, assume 2d+1 children for each node (root too).
With 2d+1 children at each node, at height h we get $(2d+1)^h = n$,
thus $h = \log n / \log (2d+1)$.
In both cases it is $O(\log n)$ (as d will be small).
Am I correct?
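
The two cases above (minimum fan-out $d+1$ versus maximum fan-out $2d+1$) can be compared numerically. This is a rough sketch under the comment's own simplifying assumption that the root follows the same fan-out as every other node; `levels_needed` and `height_bounds` are hypothetical names.

```python
def levels_needed(n_leaves, fanout):
    """Smallest h with fanout**h >= n_leaves, using integer arithmetic only."""
    h, reach = 0, 1
    while reach < n_leaves:
        reach *= fanout
        h += 1
    return h


def height_bounds(n_leaves, d):
    """(height with fullest nodes, height with emptiest nodes) for order d."""
    shortest = levels_needed(n_leaves, 2 * d + 1)  # every node has 2d+1 children
    tallest = levels_needed(n_leaves, d + 1)       # every node has d+1 children
    return shortest, tallest


print(height_bounds(625, 2))
```

For 625 leaves and $d = 2$ the heights are 4 and 6: both grow logarithmically in $n$, but the worst case for search is the taller, minimally filled tree.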
+1 vote
For n leaves, there must be exactly n-1 keys distributed in the internal nodes.
As per the question, each node can have between d and 2d keys. Therefore, for n-1 keys there can be a minimum of $\left\lceil \dfrac{n-1}{2d} \right\rceil$ internal nodes and a maximum of $\left\lceil \dfrac{n-1}{d} \right\rceil$ internal nodes.

To determine upper bound on the number of nodes accessed during the search, let us take the maximum case.

In this $B^+$-tree every level will contain nodes in the powers of (d+1), so the equation of height is,
$1 + (d+1) + (d+1)^2 + (d+1)^3 + \ldots + (d+1)^{h-1} = \dfrac{n-1}{d}$
$\implies \dfrac{(d+1)^{h}-1}{(d+1)-1} = \dfrac{n-1}{d}$
$\implies (d+1)^h = n$
$\implies h = \log_{(d+1)}n$

Now, by traversing $h$ levels, the leaf node containing the key being searched for can be reached. Hence the number of nodes accessed during the search is $O(\log_{(d+1)}n)$.
0
What is the difference between your answer and Sachin's answer?
0

For part $(b)$ we can argue in the following way:

In the worst case, we can consider a complete $d$-ary B+ tree ($\because$ a B+ tree of order $d$ is given) in which the key to be searched for is present in the last level of the tree.

$\therefore$ Number of nodes accessed = $O ( h + d )$ (as given in @Sachin Mittal 1 's answer)

For height of B+ tree,

Let the total number of nodes in tree be $N$.

Level 0 - $d^{0}$ nodes

Level 1 - $d^{1}$ nodes

Level 2 - $d^{2}$ nodes

Level 3 - $d^{3}$ nodes

and so on till

Level h - $d^{h}$ nodes (last level)

$\therefore$        N = $d^{0}$ + $d^{1}$+ $d^{2}$+ $...$ + $d^{h}$ = $\frac{d^{h+1} -1}{d-1}$ = $\frac{nd -1}{d-1}$          ( $\because$ last level has leaf nodes= $n$ )

height of B+ tree $= \log_{d}N = \log_{d}\frac{nd-1}{d-1} = O(\log_{d}n)$

Number of nodes accessed = $O ( h + d )$ = $O ( \log _{d}n + d)$
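
As a small worked check of the node-count formula, assume for illustration $d = 3$ and $h = 2$ (values chosen here, not from the question), so the last level has $n = d^{h} = 9$ leaves:

$$N = 3^0 + 3^1 + 3^2 = 13 = \frac{3^{3}-1}{3-1}, \qquad \frac{nd-1}{d-1} = \frac{9 \cdot 3 - 1}{3 - 1} = \frac{26}{2} = 13$$

Both expressions agree, and $h = \log_{3} 9 = 2$.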

2