$128$ main memory blocks are mapped to $4$ sets in cache. So, each set maps $32$ blocks each. And in each set there is place for two blocks ($2$-way set).
Now, we have $4$ sets meaning $2$ index bits. Also, $32$ blocks going to one set means $5$ tag bits.
Now, these $7$ bits identify a memory block and tag bits are placed before index bits. (otherwise adjacent memory references- spatial locality- will hamper cache performance)
So, based on the two index bits (lower $2$ bits) blocks will be going to sets as follows:
$$\begin{array}{|c|c|} \hline \textbf {Set Number} & \textbf{Block Numbers} \\\hline \text{$0$} & \text{$0,16$} \\\hline\text{$1$} & \text{$5,9$} \\\hline \text{$2$} & \\\hline \text{$3$} & \text{ $\require{cancel} \cancel{3}$,$7,55$} \\\hline \end{array}$$
Since, each set has only $2$ places, $3$ will be thrown out as it is the least recently used block. So, final content of cache will be
$0$ $5$ $7$ $9$ $16$ $55$
(C) choice.