2k views

Consider a hash function that distributes keys uniformly. The hash table size is 20. After hashing of how many keys will the probability that any new key hashed collides with an existing one exceed 0.5.

1. 5
2. 6
3. 7
4. 10
asked in DS | 2k views

The question is a bit ambiguous.

After hashing of how many keys, will the probability that any new key hashed collides with an existing one exceed 0.5.

Here, 'new key hashed' is the ambiguity. It can mean the probability of a collision in the next 'hash', or the probability of a collision in any of the hashes of the 'new keys' starting from the first insertion. For the first case answer must be 10 to get probability equal to 0.5, and so 11 must be the answer for probability > 0.5. Thus we can conclude from given choices, it is the second case.

So, we need to find $n$ such that after $n$ hashes, probability of collision (in any of the $n$ hashes)  > 0.5.

Probability that there will be a collision after $n$ hashes = 1 - Probability that there was no collision in the first $n$ hashes

$= 1 - 1. \frac{19}{20} . \frac{18}{20} \dots \frac{20-n+1}{20}$.

So, we need,

$0.5 < 1 - 1. \frac{19}{20} . \frac{18}{20} \dots \frac{20-n+1}{20}$.

$\implies \frac{19}{20} . \frac{18}{20} \dots \frac{20-n+1}{20} < 0.5$.

For $n=5$, we get, $0.5814$ and for $n=6$, we get $0.43605$. So, answer should be $n = 6$.

selected by
@Shraddha Please read the given explanation. Had there been no choice given 11 is the best answer. But given the choices a candidate is expected to rethink what the question setter meant.
@ arjun sir, i am not able to understand the difference between the two cases. please elaborate a bit. ?
@Arjun Sir I could not understand the explanation given by you.. Can u please explain again in bit more detail. I could not understand the diference in two cases made by you
I hashed 5 keys - meaning I inserted 5 elements to the hashtable. What is the probability that there was a (at least one) collision during these insertions?

Now, I have 5 values in my hashtable. What is the probability that the next insertion causes a collision?
Thanx sir. Got ur point. :)

He has asked "after how many insertion" or in which insertion probability will be >= 0.5

So after 10 keys has been inserted probability of collision will be 0.5

We can simply think like this :there are 20 slots out of which 10 are full. So there is half probability of collision. if there were less than 10 values prob. of collision would never exceed 1/2 rather it would decrease.

So answer less than 10 is not possible at all

Also probability in 'k' th insertion is  (k-1)/20

(k-1)/20 >=0.5

which gives k>=11

means in 11 th insertion we get required prob .  OR after inserting 10 values prob will exceed 0.5

+1 vote
After the 10th insertion, probability of collision of new key hashed= 10/20= 0.5

After 11th insertion, probability of collision of new key hashed= 11/20 > 0.5

We are asked after which insertion, probability of collision of new key hashed exceeds 0.5. So it should be 11.
it is asking about exceed 0.5 which is 11 because it is not given in the option then we can mark 10 (thats all).
ans 10
I think it's answer is (A) that is after insertion of 5 keys, the insertion of new key will exceed 0.5...

and (10) not be the answer.. even if I think in that way that insert 10 keys.. then we have 10 slots vacat.. probability of collision = 10 / 20 = 0.5.. He used the word "exceed".. exceed means greater than.. It doesn't mean "greater and equal", but it is equal.. !!

explaination:

probability of inserting a new key is not independent they depends on previous insertions. So basically it is a conditional probability.. p("insetion of new key" | "insertion of m keys")

We are asking to insert the new key with probablity exceeds 0.5

0.5 < probabiltity of collision when (m + 1) keys are inserted!!

=> 0.5 < 1 - p(all are hashed to a different location)

after insertion of 5 keys

p(all are...) = 1 * 19/20 * 18/20 * 17/18 * 16/20 = 0.581  => 1 - 0.581 = 0.418 < 0.5..

after insertion of 6 th keys

it exeedes 0.5..

so, after hashing of 5 keys new key hashsed collides with an existing one exceed 0.5..

1 * 19/20 * 18/20 * 17/18 * 16/20 = 0.581  => 1 - 0.581 = 0.418 < 0.5.

it should be 17/20 right??