Dark Mode

6,222 views

14 votes

A relation $r(A, B)$ in a relational database has $1200$ tuples. The attribute $A$ has integer values ranging from $6$ to $20$, and the attribute $B$ has integer values ranging from $1$ to $20$. Assume that the attributes $A$ and $B$ are independently distributed.

The estimated number of tuples in the output of $\sigma _{(A>10)\vee(B=18)}(r)$ is ____________.

The estimated number of tuples in the output of $\sigma _{(A>10)\vee(B=18)}(r)$ is ____________.

25 votes

Best answer

- $P(A>10) = \frac{10}{15} = \frac{2}{3}$
- $P(B=18) = \frac{1}{20}$
- $P(A>10 \land B=18) = \frac{2}{3}\times\frac{1}{20} = \frac{1}{30}$

$P(A>10 \lor B=18) = P(A>10) + P(B=18) – P(A>10 \land B=18)$

$\qquad = \frac{2}{3} + \frac{1}{20} – \frac{1}{30} = \frac{40 + 3 – 2}{60} = \frac{41}{60}$

$\text{Estimated number of tuples} = \frac{41}{60}\times1200 = 820$

The above answer is TRUE for SQL SELECT but not for Relational Algebra as by theory relational algebra operates on a set which means all the elements must be distinct. Since we have $15$ distinct possible values for $A$ and $20$ distinct possible values for $B,$ in strict relational algebra we’ll get

$\text{Estimated number of tuples} = \frac{41}{60}\times (15 \times 20) = 205.$

Official Answer: $205$ OR $820.$

0

7 votes

To all those answering 820

**Remember relational algebra operator select is also distinct**

By principal of mutual inclusion exclusion

For each **A>10 (11-20)**

there are **20 distinct B(1-20)**

So *current total = 200 (20*10)*

For each **B = 18 **

There are **15 distinct A**

*current total = 200 + 15 = 215*

Now excluding the tuples which we have counted twice i.e B=18 and A>10

There are 10 such tuples

So *current total = 200 +15 -10*

*Final Answer should be 205 according to me*

*Sadly I made a silly mistake and entered 210 :)*