edited by
6,430 views
26 votes
26 votes
One giga bytes of data are to be organized as an indexed-sequential file with a uniform blocking factor $8.$ Assuming a block size of $1$ Kilo bytes and a block refrencing pointer size of $32$ bits, find out the number of levels of indexing that would be required and the size of the index at each level. Determine also the size of the master index. The referencing capability (fanout ratio) per block of index storage may be considered to be $32$.
edited by

4 Answers

Best answer
45 votes
45 votes

First we can understand the terms given in the question:

  •  Uniform blocking factor $= 8$

This is the no. of records which can be held in a data block. 

This information is required for DENSE index which is mandatory when the index is unclustered - data records not ordered by the search key (there is an index entry for each record) as compared to fully sparse (which has an index entry for each data block). Since in the question we do not have any information about "record pointer size" we can assume that the index is sparse. (Solution considering dense index is given at end)

  • Block size $= 1\; \text{KB}$

This is the size of data block (file block containing records) as well as index block (file block containing index entries). Since file size is given as 1 giga bytes, we get no. of data blocks $ = \frac{1 \; \text{GB}}{1\; \text{KB}} = 1 \; \text{M} = 2^{20}$

  • Block referencing pointer size $= 32 \; \text{bits} = 4 \; \text{B}$

This is the pointer size required to point to a block. 

  • The referencing capability (fanout ratio) per block of index storage may be considered to be $32.$

This means that an index block can refer to $32$ blocks (either data or index blocks). i.e., even though we have $1024$ bytes in a block, and each block pointer size is $4$ bytes, it can refer to only $32$ blocks. This might be due to large search key size which must be present for each index entry. 

Now, coming to the solution:

No. of entries in first level index (which indexes to the data blocks) (in case of page tables in virtual memory, this will be the total no. of entries in last level page table) $=$ no. of data blocks (assuming sparse index) $ = 2^{20}$

No. of index blocks in level $1 = \frac{2^{20}}{32} = 2^{15}$ as each index block can refer to $32$ blocks (given fanout) which means size of level $1$ index $ = 2^{15} \times 1\; \text{KB} = 32 \; \text{MB}$

Since the fanout is $32,$ no. of index blocks in second level $ = \dfrac{2^{15}}{32} = 2^{10}$. 
Size of second level index $ = 2^{10} \times 1\; \text{KB} = 1 \; \text{MB}$

No. of index blocks in third level $ = \frac{2^{10}}{32} = 32$.
Size of third level index $ = 32 \times 1\; \text{KB} = 32 \text{ KB}$

No. of index blocks in fourth level $ = \frac{32}{32} = 1$ and it occupies $1\; \text{KB}.$ Since only $1$ index block is there we do not need further level of indexing. 

Searching starts in the last level (this will be level $1$ page table in case of virtual memory in OS).

Master Index -- not sure exactly what this means but I assume this is the complete index whose size will be $32 \text{ MB} + 1 \text{ MB} + 32 \text{ KB} + 1 \text{ KB}= 33.033 \text{ MB}$

 


Now assuming dense index. 

Block pointer size $= 32 \text{ bits}$. Since, we have $8$ records in a block, we need at least $3$ more extra bits for a record pointer. So, we need to assume $5$ bytes for a record pointer. As fanout is given in the question it is not changing when the record pointer size changes. If fanout was not given, we could have calculated it as $\frac{\text{block size}}{\text{search_key_size}+\text{record pointer size}}$

Here, we need an index entry for each record. So, we need $ = \frac{1 \; \text{GB}}{1\; \text{KB}} \times 8 = 8 \; \text{M} = 2^{23}$ entries in first level index. 

No. of index blocks in first level $ = \frac{2^{23}}{32} = 2^{18}$
Size of first level index $ = 2^{18} \times 1 \text{ KB} = 256 \text{MB}$

No. of index blocks in second level $ = \frac{2^{18}}{32} = 2^{13}$
Size of second level index $ = 2^{13} \times 1 \text{ KB} = 8 \text{MB}$

No. of index blocks in third level $ = \frac{2^{13}}{32} = 2^{8}$
Size of third level index $ = 2^{8} \times 1 \text{ KB} = 256 \text{KB}$

No. of index blocks in fourth level $ = \frac{2^{8}}{32} = 8$
Size of fourth level index $ = 8 \times 1 \text{ KB} = 8 \text{KB}$

No. of index blocks in fifth level $ = \lceil \frac{8}{32} \rceil = 1$
Size of fifth level index $ = 1 \times 1 \text{ KB} = 1\text{ KB}$

Master Index size $ = 256 \text{ MB} + 8 \text{ MB} + 256 \text{ KB} + 8 \text{ KB} + 1 \text{ KB} = 264.265 \text{ MB}$

edited by
22 votes
22 votes

Indexed sequential file is that which uses one pointer in index to point to a data block

Now data size = 1 GB =  230 B

Data block size = 1 KB =  210 B

So, no. of data disk blocks = 230 / 210 = 220 blocks

So, no. of pointers in 1st level index = 220 blocks

Now, fan out of 1 index block (or block factor of 1 index block) = 32 = 25 pointers / index block

So, no. of index blocks required in $1^{st}$ level = 220 / 25 = 215 blocks

So, we will have these much pointers in the $2^{nd}$ level  

No of index blocks required in the $2^{nd}$ level = 215 / 25 = 210 blocks

Similarly no. of index blocks in the $3^{rd}$ level =  210 / 25 = 25 blocks

And finally these $32$ blocks of the $3^{rd}$ level will be pointed by $1$ block of $4^{th}$ level as one block can contain $32$ pointers. Hence, we stop here being reached the final level of indexing..

Hence,

a) number of levels required for indexing =  4

b) Index size at $1^{st}$ level =  No of 1st level index blocks * Block size

                                            =  215 * 210 B

                                            = 32 MB

  Index size at  2nd level  = 210 * 210 B

                                          = 1 MB

  Index size at 3rd level = 25 * 210 B

                                          = 32 KB

  Index size at 4th (last level) = 1 KB

1 votes
1 votes

Data size = $2^{30}$ bytes

Block size = $2^{10}$ bytes

$\therefore$ No. of blocks required = $\frac{2^{30}}{2^{10}} = 2^{20}$ blocks

$1^{st}$ level index entries = No. of blocks = $2^{20}$
each entry size is 32 bits (given)
size of $1^{st}$ level index = $2^{20} \times 32$ bits = 4 MB
as, index blocking factor = 32
$\therefore$ No. of blocks in $1^{st}$ level = $\frac{2^{20}}{2^{5}} = 2^{15}$ blocks

$2^{nd}$ level index entries = No. of blocks in $1^{st}$ level = $2^{15}$
size of $2^{nd}$ level index = $2^{15} \times 32$ bits = 128 KB
Now, No. of blocks in $2^{nd}$ level = $\frac{2^{15}}{2^{5}} = 2^{10}$ blocks

$3^{rd}$ level index entries = No. of blocks in $2^{nd}$ level = $2^{10}$
size of $3^{rd}$ level index = $2^{10} \times 32$ bits = 4 KB
Now, No. of blocks in $3^{rd}$ level = $\frac{2^{10}}{2^{5}} = 2^{5}$ blocks

$4^{th}$ level index entries = No. of blocks in $3^{rd}$ level = $2^{5}$ (which can fit in 1 block, so no more level)
size of $4^{th}$ level index = $2^{5} \times 32$ bits = 128 B

​​​​​​​

Hence, we require 4 level of indexing with above sizes

1 votes
1 votes
                                      INDEXING IN DATABASES
Single level Indexing Multilevel Indexing
  1. PRIMARY INDEXING
  • Index is created on primary key for each data block.
  • No. of index entries = No. of data blocks.
  • Index is created for first record of each block( called block anchor).
  • A kind of sparse indexing.

         2. SECONDARY INDEXING

  • Index is created on data file whose records are unordered on non-key
  • Index is created for each record in data file.
  • No. of index entries = No. of records
  • A kind of dense indexing.
  • B-Trees
  • B+Trees

 

Primary Indexing:

  • Here, we do not need blocking factor as index entry is created for each block.
  • Block Size = 1 KB = 1024 B = $2^{10}B$

No. of blocks in given data file = $\frac{filesize}{Block Size}=\frac{1GB}{1KB}=2^{20}$

Pointer size = 32 bits = 4B.

The referencing capability (fanout ratio) per block of index storage may be considered to be 32.
 

It means an index block can refer to 32 blocks ( either data blocks or index blocks).

  • No. of index blocks required for 1st level = $\frac{2^{20}}{32}=2^{15}$

           Size of 1st level Index = $2^{15}*BS=2^{15}*2^{10}=32 MB$

  • Since, 32MB > BS, so again divide into index blocks.

            No. of index blocks required for 2nd level = $\frac{2^{15}}{32}=2^{10}$

            Size of 2nd level Index = $2^{10}*BS=2^{10}*2^{10}=1 MB$

  • Since 1 MB > BS, so again divide into index blocks.

            No. of index blocks required for 3rd level = $\frac{2^{10}}{32}=2^{5}$

            Size of 3rd level Index = $2^{5}*BS=2^{5}*2^{10}=32 KB$

  • Since, 32 KB > BS, so again divide into index blocks.

            No. of index blocks required for 4th level = $\frac{2^{5}}{32}=1$

            Size of 4th level Index = $1*BS=1*2^{10}=1 KB$

Since, 1 KB = BS so, no further indexing is required. Hence, 4-level indexing is required if Primary indexing is used.


Secondary Indexing

  • Here, we need blocking factor to know how many records are present per block since index is created for each record in data file.
  • No. of entries per block = blocking factor = 8.
  • No. of blocks in data file = $\frac{FileSize}{Block Size}=\frac{1GB}{1KB}=2^{20}$
  • No. of index entries =$ no. of blocks * Blocking factor = 2^{20}*8=2^{23}$
  • No. of index blocks for 1st level = $\frac{2^{32}}{32}=2^{18}$

           Size of 1st level index = $2^{18}*BS=2^{18}*2^{10}=2^{28}=256 MB$

  • Since 256 MB > BS, so divide it into index blocks.

           No. of index blocks for 2nd level = $\frac{2^{18}}{32}=2^{13}$

           Size of 2nd level index = $2^{13}*BS=2^{13}*2^{10}=2^{23}=8 MB$

  • Since, 8 MB > BS, so divide it into index blocks.

           No. of index blocks for 3rd level = $\frac{2^{13}}{32}=2^{8}$

           Size of 3rd level index = $2^{8}*BS=2^{8}*2^{10}=256 KB$

  • Since, 256 KB > BS,  so divide it into index blocks.

           No. of index blocks for 4th level = $\frac{2^{8}}{32}=2^{3}$

           Size of 4th level index = $2^{3}*BS=2^{3}*2^{10}=8 KB$

  • Since, 8 KB > BS, so divide it into index blocks.

            No. of index blocks for 4th level = $\left \lceil \frac{2^{3}}{2^5} \right \rceil=1$

            Size of 4th level index = $1*BS=1*2^{10}=1 KB$

Hence, 5-level of indexing is required.

Related questions

10 votes
10 votes
2 answers
3
makhdoom ghaya asked Nov 19, 2016
2,600 views
Match the pairs in the following questions:$$\begin{array}{|ll|ll|}\hline (a) & \text{Secondary index} & (p) & \text{Function dependency} \\\hline (b) & \text{Non-proced...
9 votes
9 votes
2 answers
4
makhdoom ghaya asked Nov 26, 2016
1,982 views
Show that the elements of the lattice $(N, \leq)$, where $N$ is the set of positive intergers and $a \leq b$ if and only if $a$ divides $b$, satisfy the distributive prop...