The Gateway to Computer Science Excellence
+23 votes
A message is made up entirely of characters from the set $X=\{P, Q, R, S, T\}$. The table of probabilities for each of the characters is shown below:$$\begin{array}{|c|c|}\hline \textbf{Character}  &  \textbf{Probability } \\\hline  \text{$P$} & \text{$0.22$} \\\hline  \text{$Q$} & \text{$0.34$} \\\hline  \text{$R$} & \text{$0.17$} \\\hline \text{$S$} & \text{$0.19$} \\\hline  \text{$T$} & \text{$0.08$} \\\hline \text{Total} & \text{$1.00$} \\\hline  \end{array}$$If a message of $100$ characters over $X$ is encoded using Huffman coding, then the expected length of the encoded message in bits is ______.
in Algorithms by Veteran (431k points)
edited by | 4.8k views

@Bikram sir whts wrong with this m just multiplying 100 .



Your Tree is Not correct ..

After 25 become root node ..there nodes are at that point --> 25, 19, 22, 34 and 100 .

Rearrange them in increasing order, it become 19, 22, 25, 34 , 100 .

so 19 and 22 make another sub tree . 25 and 34 make another sub tree ....

hope you get where you did mistake, you forget to rearrange them once again.. after each subtree make we need to rearrange each avaailable nodes .

Why this question is not in GO book? I downloaded the latest version but did not find it.
this is in network ..vol 3
ROOT node should be of 100 because sum of all character i.e P,Q,R,S,T is 100 and it totally wrong that you r getting 200..

4 Answers

+28 votes
Best answer

$X = \{ P, Q, R, S, T\}$

$∴ \text{Expected length of an encoded character}$
$\qquad \qquad= (0.22 \times 2) + (0.34 \times 2) + (0.17 \times 3) + (0.19 \times 2) + (0.08 \times 3) \hspace{0.1cm} \text{ bits }$
$\qquad \qquad= 0.44 + 0.68 + 0.51+ 0.38 + 0.24 \hspace{0.1cm} \text{ bits}$
$ \qquad \qquad= 2.25 \hspace{0.1cm} \text{ bits } $

$\therefore \text{Expected length of a encoded message of $100$ characters in bits} = 100 \times 2.25 = 225$

by Boss (42.4k points)
edited by
ur tree structure is different from the best answer !!! is it ok ??

When constructing huffman tree , everytime we delete two shortest elements from the min in that way convention is like choose shortest and then next shortest and now add those and insert this as a new node in our min heap.So we can observe that every time we are choosing the shortest element so hence always shortest element will become left sibling og next shortest node..

You can see the code here

During huffman tree construction the calls are like

left = extractMin(minHeap);
right = extractMin(minHeap);
So the tree constructed by "2018" is wrong , as it can't happen that sometime your algorithm is making shortest node as left child and sometime right.Although the answer won't be effected as hight of nodes won't effect in that manner and in that question only hight of nodes matter.
Hoffman tree constructed by akash is right for me.
why is not the expected length 225/100 = 2.25 bits. expected means we need to calculate average length of code, right?

 the expected length here does not mean the avg no of bits required to encode 1 character, it is total bits required to encode 100 characters.

@amitqy I think you're right, but generally expected means average. and when a question is NAT, it's difficult to guess that what do they mean?


it's very silly i think, how avg length of 100 character would be 2.25, if the characters are distinct. For one character you need at least 1 character to encode, isn't it. Some character vary by their prefix so it always be > 100. And if they ask avg length required by per character then this is a different context , 

hence it would be (weighted external path length / total frequency of each character )

+34 votes

so ans is 225

by Loyal (6.8k points)
how d bits r calculated or tree is been prepared
in short:
1. arrange all in increasing order of there frequency.
2. extract 2 with min frequency and make them as a leaf node and root node will be obtained by summing both leaf node. rearrange all remaining frequency including newly created root node and go to step 2
4. repeat above steps.
I think the diagram is not correct, However, the calculation is fine.

0.41 should be left child of 1 and 0.59 will be right child of 1. Apart from that, it looks fine.

@Arjun Sir, Can you please suggest?

I am so sorry... T is 100

This one is correct :-) 




I understood your tree solution .

Please help me in understanding what is difference between average length and average lengthy per character and how do we get to know in question we have to find which one avg lenght or avg length per character.
@mayank I also have the same doubt if the expected length is asked then answer should be 225/100 = 2.25, expected means average value.
@Rupendra @akash

Rupendra's comment should be added in the best answer.

As it is the mistake done by 90% people including me.
+3 votes
by Active (1.9k points)
edited by
225 is right ans as they have asked for expected length for endcoded msg and msg conatins 100 char.
+1 vote

To create the Huffman tree, always increase the nodes in ascending order, and merge the first two nodes.

Step 1)

Given: $\begin{bmatrix} P & Q &R &S &T \\ 0.22 &0.34 &0.17 &0.19 &0.08 \end{bmatrix}$ → $\begin{bmatrix} T & R &S &P &Q \\ 0.08 &0.17 &0.19 &0.22 &0.34 \end{bmatrix}$ → $\begin{bmatrix} TR &S &P &Q \\ 0.25 &0.19 &0.22 &0.34 \end{bmatrix}$


Step 2)

$\begin{bmatrix} TR &S &P &Q \\ 0.25 &0.19 &0.22 &0.34 \end{bmatrix}$ → $\begin{bmatrix} S &P&TR &Q \\ 0.19 &0.22 &0.25 &0.34 \end{bmatrix}$ → $\begin{bmatrix} SP&TR &Q \\ 0.41 &0.25 &0.34 \end{bmatrix}$


Step 3)

$\begin{bmatrix} SP&TR &Q \\ 0.41 &0.25 &0.34 \end{bmatrix}$ → $\begin{bmatrix} TR&Q &SP \\ 0.25 &0.34 &0.41 \end{bmatrix}$ → $\begin{bmatrix} TRQ &SP \\ 0.59 &0.41 \end{bmatrix}$


Step 4)

Finally merge these two.

Hence, the Huffman Tree would look like this:-


Length  2.25 per character. Given, there are 100 characters in the message, so,



by Loyal (6.5k points)

Related questions

Quick search syntax
tags tag:apple
author user:martin
title title:apple
content content:apple
exclude -tag:apple
force match +apple
views views:100
score score:10
answers answers:2
is accepted isaccepted:true
is closed isclosed:true
50,737 questions
57,309 answers
105,026 users