Consider the following code:
register int i, j;
register float x, y, A[64][64], B[64][64];
for (i = 0; i < 64; i++)            // s1 (4 cycles)
{
    for (j = 0; j < 64; j++)        // s2 (4 cycles)
    {
        x = x + A[i][j];            // s3 (10 cycles)
    }
    for (j = 0; j < 64; j++)        // s4 (4 cycles)
    {
        y = y + B[i][2*j];          // s5 (10 cycles)
    }
}
Both int and float are 4 bytes in size.
Accesses to arrays A and B generate loads to the data cache; all remaining variables are allocated in registers.
The data cache is fully associative with LRU replacement and has 32 lines of 16 bytes each. Initially the data cache is empty.
Both arrays A and B are stored in row-major order.
On a data cache miss, an additional 40 cycles are required to wait for the data.
Arrays A and B both start at cache line boundaries.
The number of cycles for each statement invocation is given in the comments in the code above.
How many cycles does the above code fragment take for execution?
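
One way to check a hand calculation is to simulate the cache directly. The following is a minimal sketch, not a definitive answer, and it rests on assumptions the question does not state: each loop header (s1, s2, s4) is charged once per iteration of its loop body (the final failing test is not counted), B is placed immediately after A in memory, and the index 2*j is allowed to run past the end of a row as a plain row-major address. The names `cache_access` and `simulate_cycles` are made up for this illustration.

```c
#define N 64            /* array dimension from the question */
#define CACHE_LINES 32  /* lines in the fully associative cache */
#define LINE_BYTES 16   /* bytes per cache line */
#define ELEM_BYTES 4    /* size of a float, as stated in the question */
#define MISS_PENALTY 40 /* extra cycles per data cache miss */

/* Fully associative cache: one tag and one LRU timestamp per line. */
static long tags[CACHE_LINES];
static long last_used[CACHE_LINES];
static long now;

static void cache_reset(void) {
    for (int k = 0; k < CACHE_LINES; k++) {
        tags[k] = -1;       /* -1 marks an empty line */
        last_used[k] = 0;   /* empty lines look oldest, so they fill first */
    }
    now = 0;
}

/* Returns 1 on a miss and 0 on a hit; updates the LRU state either way. */
static int cache_access(long byte_addr) {
    long tag = byte_addr / LINE_BYTES;
    int victim = 0;
    now++;
    for (int k = 0; k < CACHE_LINES; k++) {
        if (tags[k] == tag) {
            last_used[k] = now;
            return 0;
        }
        if (last_used[k] < last_used[victim]) {
            victim = k;     /* remember the least recently used line */
        }
    }
    tags[victim] = tag;     /* miss: replace the LRU (or an empty) line */
    last_used[victim] = now;
    return 1;
}

/* Replays the loop nest and returns total cycles; miss count via misses_out. */
long simulate_cycles(long *misses_out) {
    const long base_a = 0;                                  /* assumption */
    const long base_b = base_a + (long)N * N * ELEM_BYTES;  /* assumption: B follows A */
    long cycles = 0, misses = 0;

    cache_reset();
    for (int i = 0; i < N; i++) {
        cycles += 4;                                        /* s1 */
        for (int j = 0; j < N; j++) {
            cycles += 4 + 10;                               /* s2 + s3 */
            misses += cache_access(base_a + ((long)i * N + j) * ELEM_BYTES);
        }
        for (int j = 0; j < N; j++) {
            cycles += 4 + 10;                               /* s4 + s5 */
            misses += cache_access(base_b + ((long)i * N + 2L * j) * ELEM_BYTES);
        }
    }
    cycles += misses * MISS_PENALTY;
    if (misses_out) {
        *misses_out = misses;
    }
    return cycles;
}
```

Calling `simulate_cycles` from a `main` function and printing the result gives, under the assumptions above, 2064 misses and 197504 cycles. A different convention for counting loop-header invocations (for example, charging the final failing test as well) would change the total, so the figure is only as reliable as those assumptions.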