The Gateway to Computer Science Excellence
+3 votes
A language uses an alphabet of six letters, $\left\{a, b, c, d, e, f\right\}$. The relative frequency of use of each letter of the alphabet in the language is as given below: $$\begin{array}{|c|c|}\hline \text{LETTER} & \text{RELATIVE FREQUENCY OF USE}\\\hline a & 0.19  \\\hline b & 0.05  \\\hline c & 0.17  \\\hline d & 0.08  \\\hline e & 0.40 \\\hline f & 0.11  \\\hline \end{array}$$Design a prefix binary code for the language which would minimize the average length of the encoded words of the language.
in Algorithms by Boss (30.8k points)
edited by | 450 views

2 Answers

+4 votes
Best answer

$a-0.19,b-0.05, c-0.17,d-0.08,e-0.40,f-0.11$

Since it is relative frequency we can multiply each with $100$ and result remains the same.

$a-19,b-5, c-17,d-8,e-40,f-11$

For Assigning prefix binary code lets first create the Huffman tree

$(1) \ a-19,{\color{Red} {b-5} },c-17,{\color{Red} {d-8} },e-40,f-11$

$(2) \ a-19,{\color{Red} {(b,d)-13} },c-17,e-40,{\color{Red} {f-11} }$


$(3) \ {\color{Red} {a-19} },(b,d,f)-24,{\color{Red} {c-17} },e-40$


$(4) \ {\color{Red} {(a,c)-36,(b,d,f)-24}},e-40$

$(5) \ {\color{Red} {(a,b,c,d,f)-60,e-40}}$

Now put $0$ on each of the left edges and $1$ on each of the right edges as shown below

$\text{Prefix code}$ = Traverse from the root node to the leaf node and write the symbol $(1 \  or \ 0)$ present on each edge.

  • For $a$ the prefix code is $111$  i.e. $3$ bits
  • For $b$ the prefix code is $1010$ i.e. $4$ bits
  • For $c$ the prefix code is $110$ i.e. $3$ bits
  • For $d$ the prefix code is $1011$ i.e. $4$ bits
  • For $e$ the prefix code is $0$ i.e. $1$ bits
  • For $f$ the prefix code is $100$ i.e. $3$ bits

And the average length of encoded words $=\dfrac{(19\times 3)+(5\times 4)+(17\times 3)+(8\times 4)+(40\times 1)+(11\times 3)}{100}$

$\quad =\dfrac{57+20+51+32+40+33}{100}$

$\quad =\dfrac{233}{100}$

$\quad =2.33$

by Boss (23.8k points)
selected by
What if I directly take the probability and multiply with the no. of bits??

Then I don't need to divide by $\mathbf{100}$ right?
It may be a silly doubt but I sometimes feel that why aren't be dividing by $\mathbf 6$, that is, the number of symbols?

Then I don't need to divide by 100 right?


why aren't be dividing by 6

It is asking average length of a word , not of a letter/symbol.

yes, right point.
+3 votes

Huffman coding is to find prefix of letters,is to compress the message size over ASCII code.

To find length of encoded message find Huffman tree and sum of all internal nodes gives the message length.

First find minimum tress in the forest and remove it add to huffman tree.Add two minimum trees and add in the forest.

Repeat this procedure until forest become single tree.

Length of message=1+0.60+0.24+0.36+0.13


by Boss (12.8k points)
moved by

Related questions

Quick search syntax
tags tag:apple
author user:martin
title title:apple
content content:apple
exclude -tag:apple
force match +apple
views views:100
score score:10
answers answers:2
is accepted isaccepted:true
is closed isclosed:true
50,737 questions
57,292 answers
104,909 users