# Indexing

1.4k views

Database file consist 50000 records with record size 100 bytes, block size 512 bytes. If sparse B+ tree index build over given database file with search key size 20 bytes both block pointer and record pointer sizes 12 bytes each. How many maximum index blocks required if node order P is defined as between ⎡P/2⎤ to P pointers per node?

Record Size = 100 B
Block Size = 512 B

Hence, We can say, $\left \lfloor \frac{512}{100} \right \rfloor$ = $5$ records per block (Unspanned organisation assumed by default)

So, Number of block required for Database table = $\left \lceil \frac{50000}{5} \right \rceil = 10,000$ Blocks

Now, Let's find the Order $P$ of B+ tree.

Search Key size = 20 B
Pointer size = 12 B
And Order is defined as usual convention.

So, $(P-1)20 B + (P \times 12 B) \leq 512B$

Hence, $P = 16$

So, In any node of B+ tree, Minimum $\left \lceil \frac{P}{2} \right \rceil -1 = 7$ Keys should present.

Since, We are asked to make the Maximum index blocks in the B+ tree index, So, for that, We will try to fill each Index block as less as possible. Hence, We will try to put only $7$ keys per block.

Here, For B+ trees we can apply Bulk-loading concept. Since there are 10,000 blocks in the database relation.... For Sparse B+ tree index, We need to store 10,000 keys in the Index.

So, At the Bottom level (I am naming  it 1st level)  of Index, Putting 7 keys per node.. We need $\left \lfloor \frac{10000}{7} \right \rfloor = 1428$ Nodes.

Now, at next upper level (2nd level), We need $1428$ Node Pointers. And Since, We can put minimum 8 pointers per node, We would need $\left \lfloor \frac{1428}{8} \right \rfloor = 178$ Nodes

So, Now, at next upper level (3rd level), We need 178 Node pointers. Hence, $\left \lfloor \frac{178}{8} \right \rfloor = 22$ Nodes.

Going on same way, On next upper level (4th level), We need 22 Node pointers. Hence, $\left \lfloor \frac{22}{8} \right \rfloor = 2$ Nodes.

And finally, at the root level, We can have 2 Node pointers, Hence, 1 Block.

So, Maximum number of Index Blocks required = $1428 + 178 + 22 + 2 + 1 = 1631$ Blocks.

selected by
0
Why'd you take floor why not ceil because sir here it's asked about nodes when one node we can store 7 keys and 1428*7 = 9996 not 10000 so 4 keys we missed so it  should be ceil isn't it ?
0
yes it should be ceil not floor

ceil(1000/7)=1429
till 1428th node all the nodes will be filled. now minimum occupancy at leaf is ceil(p/2)-1=7

the 9996 keys are already distributed leaving 4 keys left which will be in 1429th node

4<7
so keys will be re-distributed b/w 1428th and 1429th node
1

1428*7 = 9996 not 10000 so 4 keys we missed so it  should be ceil isn't it ?

These 4 keys will be put in some of those 1428 Index blocks. You can't make new separate Index block for them because it'd violate the minimum occupancy criteria.

so keys will be re-distributed b/w 1428th and 1429th node

Say it were correct. Now you have 7 keys in 1428th node and 4 keys in 1429th node. Now you are saying that we re-distribute these 11 keys in these two nodes without violating the minimum occupancy criteria.. I don't see that possible.

0

ah! got it now.... 11 keys cannot be re-distributed among two nodes such that both nodes will have min 7 keys

yes you're correct @Deepakk Poonia (Dee)

0

$(P-1)20 B + (P \times 12 B) \leq 512B$

this formula isnot for order of leaf nodes of B+ tree

right??

0

@Deepakk Poonia (Dee)

Moreover in nptel min no of keys is $\left \lceil \frac{p}{2} \right \rceil$

https://nptel.ac.in/courses/106106095/pdf/5_Data_Storage_and_Indexing.pdf

and even wiki too https://en.wikipedia.org/wiki/B%2B_tree

maximum index blocks

means max nodes, right??

0
Another doubt from wiki

Is branching factor and order means same for $B+$ tree??

Then max no. of keys will be $(p-1)$ and min be $\left \lceil \frac{p}{2} \right \rceil$

right??
0

for B+ Tree

order of leaves=max number of keys

order of internal nodes=max number of block pointers or what you call branching factor

for both internal and leaf nodes, max number of keys is p-1 and minimum number of keys=$\left \lceil \frac{p}{2} \right \rceil-1$

p is the overall order of tree  see this text from Korth

0
for leaf: (p-1)(k+r)+b<=block size
for internal nodes: (p-1)k+p*b<=block size
0

U have written wrong formula for leaf

It should be  p(k+r)+b<=block size

And this formula should be used in this question

right??

because, first of all we need order of leaf level node.

0

@srestha
that is the correct formula

p is the max order of a B+ tree

the leaf nodes will have max p-1 keys and thus p-1 record pointers

according to your formula, a leaf of B+ tree will have p keys and p record pointers which is incorrect

0
you can use any formula. you'll get the same result
0

Donot use any wrong formula

Otherwise, u will get -ve marks in GATE for sure.

Because in that exam 0.00xx matters

0

@srestha

https://gateoverflow.in/35594/b-tree-find-order

https://gateoverflow.in/3605/gate2006-it-61#viewbutton,

see this
in any solution you'll find these formulas are used

for leaf: (p-1)(k+r)+b<=block size
for internal nodes: (p-1)k+p*b<=block size

0

I am telling about leaf node :(

0

@srestha
ok assume your formula is correct

It should be  p(k+r)+b<=block size

now a leaf node can have max p-1 key pointers. correct?
here p is the max order of the B+ tree
but according to your formula the leaf can have max p key pointers
use the formula and try to draw the leaf node
you'll understand

0
In B+ tree in it's leaf node, there should be p key pointers

Because it has totally different representation from other node

It would be like this

$\left \langle \left \langle K_{1},R_{1} \right \rangle,\left \langle K^{2},R_{2} \right \rangle,\left \langle K_{3},R_{3} \right \rangle,.......{\color{Red} {R_{next}}} \right \rangle$

where $R_{1},R_{2},......$ are key pointers

and $R_{next}$ is block pointer.

So, for eack block size, the size of $p$ key pointer and 1 block pointer should be included in it
0

here p is order of leaf only. overall order of tree will be p+1

because that one block pointer is not counted but all block pointers of internal nodes are counted

order of B+ Tree leaf is max number of keys and order of internal nodes is max number of block pointers

if the order of B+ Tree is p then formula is (p-1)(k+r)+b<=block size

0
No, there order of leaf is $p$ and overall order also be $p.$ Because block pointer is not used to count order in $B+$ tree leaf nodes. Only record pointer count it
0
are u saying the internal nodes will have p block pointers and leaves will have p keys?
0
yes.

So, what will be $p$ for above question??
0

@srestha
https://gateoverflow.in/163103/b-tree-with-sparse-dense-indexing
see this solution
formula for leaf is given

if order of B+ Tree is p
then leaf node will have p-1 keys not p keys

0

That is the correction needed for that question.

right??

0
Nope. it's correct actually
I found out later
0

Plz correct the formula first. U r going wrong :(

0

@srestha

in that question n is the max no. of keys in the leaf

lets say all nodes are completely filled in all levels

if order of B+ tree is p then all internal nodes will have p-1 keys... right? and all internal nodes will have p block pointers

in case of leaves it'll also have p-1 keys(which happens to be the order of leaf)
now leaf also has p pointers but one of them points to the right sibling that is why it is not counted
[there are p-1 record pointers in leaves and 1 block pointer]

P.S. leaves don't have block pointer(that one 1 block pointer to right sibling is not counted]. leaves have record pointers

I may be wrong. then pls provide me some authentic resource for the formula

## Related questions

1 vote
1
335 views
The minimum number of nodes (both leaf and non-leaf) of $B^{+}$ tree index required for storing $5500$ keys and order of $B^{+}$ tree is $8$________________(order is max pointers a node can have) See here first level should be divide by $7$ $2nd$ levelshould divide by ... each $7$ pointer of 1st level has $8$ pointer in 2nd level. Am I missing something?? But in ans they divided by only $8$ :(
1 vote
Consider the following statement below: $A)$ A clustered index may be either sparse or dense. $B)$ Every $B^{+}$ tree index is dense. Which of the above statement is true? Is clustering Index can be dense. Dense means non-ordering field, but clustering field should be ordering field right??