Following table indicates the latencies of operations between the instruction producing the result and instruction using the result.

$$\begin{array}{|l|l|c|} \hline \textbf {Instruction producing the result} & \textbf{Instruction using the result }& \textbf{Latency} \\\hline \text{ALU Operation} & \text{ALU Operation} & 2 \\\hline \text{ALU Operation} & \text{Store} & \text{2}\\\hline \text{Load} & \text{ALU Operation} & \text{1}\\\hline \text{Load} & \text{Store} & \text{0} \\\hline \end{array}$$

Consider the following code segment:

Load R1, Loc 1; Load R1 from memory location Loc1
Load R2, Loc 2; Load R2 from memory location Loc 2
Add R1, R2, R1; Add R1 and R2 and save result in R1
Dec R2; Decrement R2
Dec R1; Decrement R1
Mpy R1, R2, R3; Multiply R1 and R2 and save result in R3
Store R3, Loc 3; Store R3 in memory location Loc 3

What is the number of cycles needed to execute the above code segment assuming each instruction takes one cycle to execute?

- $7$
- $10$
- $13$
- $14$