Consider a discrete memoryless channel, where $H(x)$ is the amount of information per symbol at the channel input and $H(y)$ is the amount of information per symbol at the channel output. $H(x \mid y)$ is the uncertainty remaining about $x$ once $y$ is known, and $I(x;y)$ is the transmitted (mutual) information. ... $-H(y \mid x)]$ (maximum taken over $p(x)$); $\max_{p(x)} [H(x) - H(x \mid y)]$; $\max_{p(x)} H(x \mid y)$
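
The quantities above can be checked numerically. The following is a minimal sketch, not part of the original question: it assumes a binary symmetric channel with crossover probability 0.1 (the matrix `BSC` is a hypothetical example), and the helper names `entropy`, `mutual_information`, and `max_over_binary_inputs` are illustrative. It computes $I(x;y)$ both as $H(x) - H(x \mid y)$ and as $H(y) - H(y \mid x)$, then approximates the maximum over $p(x)$ with a simple grid search.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(px, channel):
    """Return (H(x) - H(x|y), H(y) - H(y|x)) in bits.

    px[i] is p(x=i); channel[i][j] is p(y=j | x=i).
    """
    nx, ny = len(px), len(channel[0])
    joint = [[px[i] * channel[i][j] for j in range(ny)] for i in range(nx)]
    py = [sum(joint[i][j] for i in range(nx)) for j in range(ny)]
    h_xy = entropy([p for row in joint for p in row])  # joint entropy H(x, y)
    h_x, h_y = entropy(px), entropy(py)
    h_x_given_y = h_xy - h_y  # chain rule: H(x|y) = H(x, y) - H(y)
    h_y_given_x = h_xy - h_x  # chain rule: H(y|x) = H(x, y) - H(x)
    return h_x - h_x_given_y, h_y - h_y_given_x


def max_over_binary_inputs(channel, steps=1000):
    """Grid-search p(x=0) for a two-input channel; returns (best_I, best_p0)."""
    best = (0.0, 0.0)
    for k in range(steps + 1):
        p0 = k / steps
        i_xy, _ = mutual_information([p0, 1 - p0], channel)
        if i_xy > best[0]:
            best = (i_xy, p0)
    return best


# Assumed example channel: binary symmetric, crossover probability 0.1.
BSC = [[0.9, 0.1], [0.1, 0.9]]

if __name__ == "__main__":
    print(mutual_information([0.3, 0.7], BSC))
    print(max_over_binary_inputs(BSC))
```

Both returned values of `mutual_information` agree up to rounding for any input distribution, which reflects the identity $H(x) - H(x \mid y) = H(y) - H(y \mid x)$. The grid search is only a coarse stand-in for the maximization over $p(x)$; it works here because the channel has two inputs.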