2.9k views

Consider a hash function that distributes keys uniformly. The hash table size is $20$. After hashing of how many keys will the probability that any new key hashed collides with an existing one exceed $0.5$.

1. $5$
2. $6$
3. $7$
4. $10$
edited | 2.9k views

The question is a bit ambiguous.

After hashing of how many keys, will the probability that any new key hashed collides with an existing one exceed 0.5.

Here, 'new key hashed' is the ambiguity. It can mean the probability of a collision in the next 'hash', or the probability of a collision in any of the hashes of the 'new keys' starting from the first insertion. For the first case answer must be $10$ to get probability equal to $0.5$, and so $11$ must be the answer for probability $> 0.5$. Thus we can conclude from given choices, it is the second case.

So, we need to find $n$ such that after $n$ hashes, probability of collision (in any of the $n$ hashes)  $> 0.5$.

Probability that there will be a collision after $n$ hashes (a collision happened in at least one of those $n$ hashes) $= 1 -$Probability that there was no collision in the first $n$ hashes

$= 1 - 1. \frac{19}{20} . \frac{18}{20} \dots \frac{20-n+1}{20}$.

So, we need,

$0.5 < 1 - 1. \frac{19}{20} . \frac{18}{20} \dots \frac{20-n+1}{20}$.

$\implies \frac{19}{20} . \frac{18}{20} \dots \frac{20-n+1}{20} < 0.5$.

For $n=5$, we get, $0.5814$ and for $n=6$, we get $0.43605$. So, answer should be $n = 6$.

edited by

@Rahul

I donot think so. What is meaning of " at least collision in first n hashes "? First n hashes not all making collision

@Arjun Sir

$= 1 - 1. \frac{19}{20} . \frac{18}{20} \dots \frac{20-n+1}{20}$.

why u searching from 19, 18, 17................upto n th hashing to find hashing exceeds 0.5 or not

here " Probability that there will be a collision after n hashes = 1 - Probability that there was no collision in the first n hashes "

So, equation could be $= 1 - 1. \frac{1}{20} . \frac{2}{20} \dots \frac{20-n+1}{20}$.  like this too

@Rahul, yes you are correct. I just made it clear in the answer.

@Srestha How $1,2,\dots$ comes?
I removed the link -- what else can be done? :O
I mean probability of 1st hast where no collision should be 1/20

for 2nd hash 2/20

but why u started from 19/20?
For the first hash there wont be any collision -- so 20/20, for the second one one can collide, remaining 19/20 and so on.
@Arjun

If it is a fill in the blank then how will you solve it ? like n=5, n=6 or n=10..... etc

then 11 would be best answer @Raju Kalagoni

My worry is not about the answer @ rahul sharma 5. I'm more biased towards the procedure of tackling this kind of problems. Please explain if you know....

Probability that there will be a collision after n hashes (a collision happened in at least one of those n hashes)

even if there is no collision in those n hashes.....probablity of collision after n can occur right ?

why is it necessary to have atleast one collision there

i actually dint get this statement properly

can someone help

He has asked "after how many insertion" or in which insertion probability will be >= 0.5

So after 10 keys has been inserted probability of collision will be 0.5

We can simply think like this :there are 20 slots out of which 10 are full. So there is half probability of collision. if there were less than 10 values prob. of collision would never exceed 1/2 rather it would decrease.

So answer less than 10 is not possible at all

Also probability in 'k' th insertion is  (k-1)/20

(k-1)/20 >=0.5

which gives k>=11

means in 11 th insertion we get required prob .  OR after inserting 10 values prob will exceed 0.5

After the 10th insertion, probability of collision of new key hashed= 10/20= 0.5

After 11th insertion, probability of collision of new key hashed= 11/20 > 0.5

We are asked after which insertion, probability of collision of new key hashed exceeds 0.5. So it should be 11.
+1 vote
it is asking about exceed 0.5 which is 11 because it is not given in the option then we can mark 10 (thats all).
ans 10
I think it's answer is (A) that is after insertion of 5 keys, the insertion of new key will exceed 0.5...

and (10) not be the answer.. even if I think in that way that insert 10 keys.. then we have 10 slots vacat.. probability of collision = 10 / 20 = 0.5.. He used the word "exceed".. exceed means greater than.. It doesn't mean "greater and equal", but it is equal.. !!

explaination:

probability of inserting a new key is not independent they depends on previous insertions. So basically it is a conditional probability.. p("insetion of new key" | "insertion of m keys")

We are asking to insert the new key with probablity exceeds 0.5

0.5 < probabiltity of collision when (m + 1) keys are inserted!!

=> 0.5 < 1 - p(all are hashed to a different location)

after insertion of 5 keys

p(all are...) = 1 * 19/20 * 18/20 * 17/18 * 16/20 = 0.581  => 1 - 0.581 = 0.418 < 0.5..

after insertion of 6 th keys

it exeedes 0.5..

so, after hashing of 5 keys new key hashsed collides with an existing one exceed 0.5..

1 * 19/20 * 18/20 * 17/18 * 16/20 = 0.581  => 1 - 0.581 = 0.418 < 0.5.

it should be 17/20 right??