The general diagram for a pipeline looks like this:
1. Now, each stage takes 4 cycles. As shown by the red highlighted part, 1 instruction (though the operations belong to different instructions) is executed in the time of 1 stage (of the non-pipelined processor), because the operations are overlapped.
Hence, CPI = cycles per instruction = 1 stage = 4 cycles.
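As a rough sketch of the point above, the snippet below lists the cycle at which each instruction leaves an ideal pipeline. The function name `completion_cycles` is made up for illustration, and the 5-stage depth is assumed from the pipeline used in the next point; only the 4 cycles per stage comes from the text.

```python
def completion_cycles(num_instructions, num_stages=5, cycles_per_stage=4):
    """Cycle at which each instruction leaves the last stage of an ideal pipeline.

    Assumes no stalls and that every stage takes the same number of cycles.
    """
    # Instruction i (0-based) enters the first stage in slot i and leaves the
    # last stage at the end of slot i + num_stages - 1.
    return [(i + num_stages) * cycles_per_stage for i in range(num_instructions)]


finish = completion_cycles(4)
gaps = [later - earlier for earlier, later in zip(finish, finish[1:])]
print(finish)  # [20, 24, 28, 32]
print(gaps)    # [4, 4, 4] -> one instruction completes every 4 cycles
```

Once the pipeline is full, the gap between consecutive completions is one stage time, which is where the figure of 4 cycles per instruction comes from.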
2. Now, let's assume the above 5-stage pipelined processor has stage execution times of 2, 3, 4, 5, 6 for IF, ID, EX, MA, WB respectively.
When they overlap, as shown by the red highlighted part, the stages that complete early, like IF, have to wait for the other stages to complete. They can't proceed with their work individually and independently; otherwise, we won't get the correct data in the buffers at the right times.
Hence, the stage delay corresponds to the longest stage.
Hence, cycle time = longest stage = 6 (@Rahul: for your question, it would be 5)
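The same reasoning can be checked with a few lines of Python. This is only a sketch: it ignores buffer (latch) delays, the instruction count of 100 is a made-up value, and the helper names are illustrative. The total-time expression (k + n - 1) x cycle time is the usual one for k stages and n instructions with no stalls.

```python
# Stage delays from the example above.
stage_delays = {"IF": 2, "ID": 3, "EX": 4, "MA": 5, "WB": 6}


def cycle_time(delays):
    """Every stage is clocked at the pace of the slowest stage."""
    return max(delays.values())


def pipelined_time(delays, n):
    """Time for n instructions on an ideal pipeline with no stalls."""
    k = len(delays)  # number of stages
    return (k + n - 1) * cycle_time(delays)


def non_pipelined_time(delays, n):
    """Each instruction runs through all stages before the next one starts."""
    return n * sum(delays.values())


print(cycle_time(stage_delays))               # 6
print(pipelined_time(stage_delays, 100))      # 624
print(non_pipelined_time(stage_delays, 100))  # 2000
```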
3. Assume that only the 1st and 2nd instructions are pipelined.
Let's say that the first instruction is a branch instruction. So, until the 3rd stage, we won't know that this is going to be a branch instruction.
Now, the 2nd instruction (in red) is also in the pipeline. So, when the 1st instruction is in the ID stage, the 2nd will be in the IF stage.
Now, when the branch address is available at the end of the 3rd stage, the processor will be instructed to flush the already pipelined instructions, viz. I2, I3, I4 and so on, and to load the new instruction, whose execution will start in the 4th stage.
So, the remaining stages of the 2nd instruction, viz. EX, MA, WB, are flushed (shown by a strike), and the newly loaded instruction is shown in green, starting execution from the 4th stage. So, there was an unexpected delay of 2 stages (in the pipeline), as shown by the yellow part.
This yellow part corresponds to stall cycles.
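A small sketch of the stall count: if the branch target is known at the end of stage s, the next useful instruction is fetched in slot s + 1 instead of slot 2, so s - 1 slots are lost. The function names and the 25% branch frequency below are assumptions for illustration, not values from the question.

```python
def branch_stall_cycles(resolve_stage):
    """Slots lost when the branch target is known at the end of resolve_stage."""
    # Without the branch, the next instruction would be fetched in slot 2;
    # with it, the correct instruction is fetched in slot resolve_stage + 1.
    return resolve_stage - 1


def effective_cpi(branch_fraction, resolve_stage, base_cpi=1.0):
    """Average CPI (in pipeline slots) when every branch pays the full penalty."""
    return base_cpi + branch_fraction * branch_stall_cycles(resolve_stage)


print(branch_stall_cycles(3))  # 2, the yellow part in the diagram
print(effective_cpi(0.25, 3))  # 1.5, assuming 25% of instructions are branches
```

Here the CPI is counted in pipeline slots; multiply by the cycle time of the pipeline to get actual time.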