4,712 views

​​​Suppose we are given $n$ keys, $m$ hash table slots, and two simple uniform hash functions $h_{1}$ and $h_{2}.$ Further suppose our hashing scheme uses $h_{1}$ for the odd keys and  $h_{2}$ for the even keys. What is the expected number of keys in a slot?

1. $\frac{m}{n}$
2. $\frac{n}{m}$
3. $\frac{2n}{m}$
4. $\frac{n}{2m}$

edited

The correct, formal reasoning is given by @Sachin Mittal 1 in the answer below, But I just want to point out that in the question, $h_1,h_2,$ two simple uniform hash functions are given only to confuse students.

Both $h_1,h_2$ are Simple Uniform Hash Functions, So, using any of them, for any key $k,$ each slot is equally likely. $i.e.$ Whatever key $k$ we take, on this key, Whatever hash function $(h_1\,\,or\,\,h_2)$ we apply, each bucket/slot is equally likely to be occupied.

So, for each slot, the probability that it will be occupied for any key is $1/m.$

Hence, the expected number of keys in a slot is $n/m.$

$\color{blue}{\text {NOTE 1 : }}$  The statement “suppose our hashing scheme uses $h_1$ for the odd keys and  $h_2$ for the even keys” is irrelevant for uniform hash functions. Even if our hashing scheme uses $h_1$ for the $r$ keys and  $h_2$ for the $n-r$ keys, we still have $1/m$ probability for any slot to be occupied for any key.

So, Don’t think that answer is $n/m$ because we apply $h_1$ for odd keys and $h_2$ for even keys. Assume that $n=100, m=20,$ and that these $n$ keys are such that $5$ of them are odd, and remaining all are even, then also expected number of keys in a slot will be $n/m = 100/20 = 5.$

$\color{blue}{\text {Theorem : }}$

If $H = \{ h_1,h_2,h_3,..,h_r \}$ is any collection of uniform hash functions, such that our hashing scheme uses $h_1$ for $n_1$ keys,  $h_2$ for $n_2$ keys, and so on, $h_r$ for $n_r$ keys, where $n_1+n_2+..+n_r = n,$ then the expected number of keys in a hash table slot is $n/m.$

Some Theory :

We want to store a set of $n$ keys. A hash table is an array $T[1 .. m],$ where $m$ is another positive integer,
which we call the table size, represents number of hash table slots. Let $x$ be any key, and let $i, where \,(1<=i<=m)$, be any slot.

The definition of a “simple uniform hash function $h$” is as following :

$\color{red} {\text{Uniform hash function definition : } } Pr[h(x) = i] = 1/m$

Let me emphasize that this condition must hold for EVERY key $x$ and EVERY hash index $i.$

We say that key $x$ hashes to the slot $T[h(x)].$

If we have a Set $H$ of such simple uniform hash functions, then whatever hash function $h \in H$ we choose for any key $x$, it makes any slot $i$ equally likely to get occupied i.e.

$\text{for any } h \in H; Pr[h(x) = i] = 1/m, \text{for all } x \text{ and for all } i$

https://courses.cs.duke.edu/cps102/spring09/Lectures/L-18.pdf

refer items per slots and empty slots section.

$\color{blue} {\text{Uniform hash function definition : } } Pr[h(x) = i] = 1/m$

which means for every $x$, we have equal probability of mapping to any of slot $i$.

Quick Question-

$\color{red} {\text{Hash function given in question is Uniform ? } }$

Take any slot $i$ and calculate probability of mapping some arbitrary $x$ to $i$?
i.e. $\color{blue} {Pr[h(x) = i] = } \color{red}{?}$

Let the probability of choosing $h_1$ is $p$ and choosing $h_2$ is $1-p$ then

$\Pr[h(x) = i] = \color{red} {p}Pr[h_1(x) = i] + \color{red} {(1-p)} Pr[h_2(x) = i]$
$\Rightarrow Pr[h(x) = i] = \frac{\color{red} {p}}{m} + \frac{\color{red} {1-p}}{m} \text{ (since }h_1\text{ and }h_2\text{ both are uniform hash functions)}$
$= \frac{1}{m}$

As you see, value of $p$ does not matter but we can calclulate $p$ as $= \frac{\text{number of even keys}}{\text{Total keys}}$

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Since $h$ is uniform, we can say it will distribute all keys uniformly. This means if there are 50 keys and 10 slots then each slot will get 5 keys. i.e. $\frac{n}{m}$ is answer.

But we can do this using probability.

let $X$ = Number of items in slot $\color{red}{1}$. (Note that I am. only talking about some random slot, say slot $\color{red}{1}$ )

$X_i = \left\{ \begin{array}{ll} 1 & \text{if } i^{th} \text{ item maps to slot 1 } \\ 0& \text{otherwise} \end{array} \right.$

Very easily, we can say $X = X_1 + X_2 + \dots X_n$

They are asking $E[X]$

$E[X] = E[X_1 + X_2 + \dots X_n]$

$E[X] = E[X_1] + E[X_2] + \dots E[X_n]$

How to calulate $E[X_i]$ ? – There could be 2 methods, one given below and the other is in first comment.

Once we calculate $P(X_i = 1) \text{ and } P(X_i = 0)$ then finding $E[X_i]$ is straight forward and so $E[X]$

$P(X_i = 1) = \frac{1}{m}$ because whatever happens within black box ( $h1 or h2$), the overall hash function will be uniform.

$E[X_i] = 1 P(X_i = 1) + 0 P(X_i = 0) = \frac{1}{m}$

Hence,

$E[X] = \frac{n}{m}$

edited

We can calculate E[X] without knowing anything about h(x).
Using  Law of Total expectation.

$E = P(A) E_1 + P(A^c) E_2$
(where $E_1$ is expectation if event $A$ happens and $E_2$ is expectation if event $A$ does not happens.)

In our case,

$\begin{array} {rrr} & \text{prob of choosing }h_1\times\text{Expectation of }X_i\text{ when }h_1\text{ is choosen}\\ E[X_i] = & +\quad\text{ }\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad & &\\ & \text{prob of not choosing }h_1\text{(or choosing }h_2\text{)}\times\text{Expectation of }X_i\text{ when }h_2\text{ is choosen} \\ \end{array}$

$\text{Expectation of }X_i\text{ when }h_1\text{ is choosen} = \frac{1}{m}$

$E[X_i] = \frac{\color{red} {p}}{m} + \frac{\color{red} {1-p}}{m} = \frac{1}{m}$

Hence, $E[X] = \frac{n}{m}$

Here we do not need to know if $h$ is uniform or not. We know only about $h_1$ and $h_2$.

Thankyou @Sachin Mittal 1 Sir , @Deepak Poonia Sir, for such an awesome explanation using different methods.