retagged by
9,612 views
19 votes
19 votes

​​​​​​Consider the following $\text{ANSI C}$ code segment:

z=x + 3 + y->f1 + y->f2;
for (i = 0; i < 200; i = i + 2) 
{
    if (z > i) 
    {
        p = p + x + 3;
        q = q + y->f1;
    } else 
    {
        p = p + y->f2;
        q = q + x + 3;
    }
}

Assume that the variable $y$ points to a $\textsf{struct}$ (allocated on the heap) containing two fields $\textsf{f1}$ and $\textsf{f2}$, and the local variables $\textsf{x, y, z, p, q,}$ and $\textsf{i}$ are allotted registers. Common sub-expression elimination $\text{(CSE)}$ optimization is applied on the code. The number of addition and the dereference operations (of the form $\textsf{ y ->f1}$ or $\textsf{y ->f2}$) in the optimized code, respectively, are:

  1. $403$ and $102$
  2. $203$ and $2$
  3. $303$ and $102$
  4. $303$ and $2$
retagged by

2 Answers

Best answer
33 votes
33 votes
t1 = x + 3 // 1 addition
t2 = y->f1; // 1 dereference 
t3 = y->f2; // 1 dereference
z = t1 + t2 + t3 // 2 additions
for (i = 0; i < 200; i += 2) {
    if (z > i) {
        p = p + t1; // 1 addition
        q = q + t2; // 1 addition
    } else {
        p = p + t3; // 1 addition
        q = q + t1; // 1 addition
    }
}

Whether we take if or else block we get $2$ additions, the loop runs exactly $\frac{200}{2} = 100$ times, so from loop we get $2\times100 = 200$ additions plus $100$ additions for incrementing the value of $i,$ before loop we had perform $3$ additions, so total additions $303.$

We only do two de-reference outside the for loop, so total de-references $= 2.$

Option D.

edited by
16 votes
16 votes
t1 = x + 3;       // 1 addition
t2 = y -> f1;      // 1 dereference 
t3 = y -> f2;      // 1 dereference
z = t1 + t2 + t3; // 2 additions
for (i = 0; i < 200; i += 2) { // 100 additions
    if (z > i) {
        p = p + t1; // 1 addition
        q = q + t2; // 1 addition
    } else {
        p = p + t3; // 1 addition
        q = q + t1; // 1 addition
    }
}

So, in total we get $1 + 2 + 100 + 100 * 2 = 303$ additions and $ 2$ defrerences. Since all the variables are mentioned to be in registers and any way p and q are not struct objects there is no pointer aliasing issue (say if y was pointing to object p or q, we cannot move the sub expression out of the loop – they are no longer loop invariant.  

Option D


Not asked in this question. But lets do some more optimizations here. 

t1 = x + 3;    
t2 = y -> f1;       
t3 = y -> f2;      
z = t1 + t2 + t3;

for (i = z+1 + (z%2); i < 200; i += 2) { 
        p = p + t1;
        q = q + t2;
}

for (i = 0; i <= z; i += 2) {
        p = p + t3;
        q = q + t1;
    }
}

The above optimization is loop splitting. The advantage here is now we have one less branch inside the loop – less chance of branch miss prediction and more expected instruction level parallelism – remember pipeline stalls due to branch instructions in COA. Also, now we can optimize the code even further as follows:

t1 = x + 3;
t2 = y -> f1;
t3 = y -> f2;
z = t1 + t2 + t3;

p = p + ((200-z-1-(z%2))/2) * t1;
q = q + ((200-z-1-(z%2))/2) * t2;

p = p + ((z+1)/2) * t3;
q = q + ((z+1)/2)* t1;

Again doing sub-expression elimination:

t1 = x + 3;
t2 = y -> f1;
t3 = y -> f2;
z = t1 + t2 + t3;

t4 = ((200-z-1-(z%2))/2); 
p = p + t4 * t1;
q = q + t4 * t2;
//previous t4 usage is dead here
t4 = ((z+1)/2);
p = p + t4 * t3;
q = q + t4 * t1; 

So, finally, $4$ multiplications, $3$ divisions/mod, $11$ additions/subtractions.

That's what compiler does freely for you :)

Answer:

Related questions

8 votes
8 votes
1 answer
2
Arjun asked Feb 18, 2021
6,462 views
Consider the following augmented grammar with $\{ \#, @, <, >, a, b, c \}$ as the set of terminals. $$\begin{array}{l} S’ \rightarrow S \\ S \rightarrow S \# cS \\ S \r...
12 votes
12 votes
4 answers
3
21 votes
21 votes
7 answers
4
Arjun asked Feb 18, 2021
16,942 views
Consider the following $\text{ANSI C}$ program:int main () { Integer x; return 0; }Which one of the following phases in a seven-phase $C$ compiler will throw an error?Lex...