3.2k views

Consider a join (relation algebra) between relations $r(R)$ and $s(S)$ using the nested loop method. There are 3 buffers each of size equal to disk block size, out of which one buffer is reserved for intermediate results. Assuming size(r(R))<size(s(S)) , the join will have fewer number of disk block accesses if

1. relation $r(R)$ is in the outer loop.
2. relation $s(S)$ is in the outer loop.
3. join selection factor between $r(R)$ and $s(S)$ is more than 0.5.
4. join selection factor between $r(R)$ and $s(S)$ is less than 0.5.
edited | 3.2k views
+6
A join between r(R) and s (S) using nested loop method will be as follows.
For each tuple r in R do
For each tuple s in S do
If r and s satisfy the join condition then output the tuple <r,s>
This algorithm will involve Nr*Bs+Br block transfers and Nr+Br seeks, where Br and Bs are
number of blocks in relations R and S respectively and Nr is number of tuple in relation R.
Now to have less block accesses, Nr should be less and it is already given that |R|<|S|. Relation
r(R) should be in the outer loop to have fewer number of disk block accesses.

0
are join algorithms in syllabus?
0
Yes.
0

Quoting from wiki

join selection factor is"the percentage (or fraction) of records in one file that will be joined with records of another file". This can be calculated when two database tables are to be joined. It is primarily concerned with query optimisation.

0

First solve https://gateoverflow.in/3847/gate2005-it-82a

first .Then it becomes very easy to understand

In joining B and B using nested loop method, with A in outer loop two factors are involved.

i>No. of  blocks containing all rows in A should be fetched

ii> No. of Rows A times no of Blocks containing all Rows of B

(in worst case all rows of B are matched with all rows of A).

In above ques, $|R|<|S|$

i> will be less when number of rows in outer table is less since less no of rows will take less no. of blocks

ii> if we keep $R$ in outer loop, no. of rows in $R$ are less and no. of blocks in $S$ are more

If we keep $S$ in outer loop, no of rows in $S$ are more and no. of blocks in $R$ are less.

In ii> block accesses will be multiplication and will come same in both cases.

So, i> will determine no of block accesses

edited
0
Is "join" in question, the only reason for option A?
+1
It is  due to join using nested loop  that a> is correct.Is this u were asking or something else?
+3
+5

size(r(R)) < size(r(S))  means blocks occupied by R are less?
And if yes then what about the case if blocks are less but rows are more

Consider

r(R) #blocks (br) = 10 and #tuples (nr) = 100

r(S) #blocks (bs) = 20 and #tuples (ns) = 50

Now

1 If R is in outer loop #block Accesss = nr * bs + br = 100*20 + 10 = 2010

2 If S is in outer loop #block Accesss = ns * br + bs = 50*10 + 20 = 520

0
Sir, is there any significance of the 3rd buffer which is for holding intermediate results...
0

Hi @ਜਗਮੀ ji, Very good Point.

ping @Krish__,  @Anu007,  @Ashwin Kulkarni and @reena_kandari ji.

0
II) is only true if we are considering both relations to have same record size.We can't say that expression 2 is equal in general case.
–1 vote

(D) join selection factor between $r(R)$ and $s(S)$ is less than 0.5.

it does not matter if 10 x 20 or 20 x 10 answer is still the same. If the jon selection factor is less that means we will fewer record in the result relation