Consider a single CPU with a 4-stage pipeline.

After some time, Instruction Fetch, Instruction Decode, Execute and Store are all running in parallel, i.e. they overlap. That is why we get one instruction executed per clock cycle (ideally).

My doubt is how overlapping pipeline stages is possible with a single CPU, since every stage may need the CPU in order to perform its task.

The pipeline is not a separate entity from the processor (a core). For example, a 5-stage pipeline datapath is built inside a typical processor, and that is what lets it run overlapping instructions. Each stage has its own hardware inside the core, so the stages do not compete for "the CPU". The control unit synchronizes the operation of all stages: on each common clock pulse it activates control signals, and each stage does its work depending on those signals. Note that here the processor does not fetch more than one instruction at a time (superscalar processors are irrelevant in our case).
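To make the overlap concrete, here is a minimal sketch that prints which instruction sits in which stage on every clock cycle. It assumes the 4 stages from the question, an ideal pipeline with no stalls or hazards, and exactly one instruction fetched per cycle; the names `STAGES` and `simulate` are made up for this illustration.

```python
# Stage names follow the question: fetch, decode, execute, store.
STAGES = ["IF", "ID", "EX", "ST"]

def simulate(instructions):
    """Return one snapshot per clock cycle: {stage: instruction or None}."""
    n, k = len(instructions), len(STAGES)
    timeline = []
    # An ideal k-stage pipeline needs n + k - 1 cycles for n instructions.
    for cycle in range(n + k - 1):
        snapshot = {}
        for s, stage in enumerate(STAGES):
            i = cycle - s  # index of the instruction occupying this stage now
            snapshot[stage] = instructions[i] if 0 <= i < n else None
        timeline.append(snapshot)
    return timeline

timeline = simulate(["I1", "I2", "I3", "I4", "I5"])
for cycle, snap in enumerate(timeline, start=1):
    row = "  ".join(f"{stage}:{snap[stage] or '--'}" for stage in STAGES)
    print(f"cycle {cycle}: {row}")
```

From the fourth cycle onward every stage is busy with a different instruction, and only one instruction enters IF per cycle, so nothing here relies on fetching several instructions at once.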

It does not fetch more than one instruction at a time, but out-of-order execution of different stages is possible.

Out-of-order instructions are fine! Could you say a little about stages going out of order?

Say I1, I2 and I3 are currently in the pipeline. I1 entered first, I2 second and I3 last.

You mean I2 can complete its execution before I1 does, right?

When we talk about out-of-order execution, an instruction can go ahead of another instruction. It is even possible that only some part of an instruction goes ahead of another instruction. This is fine as long as no external interrupt happens. When an exception occurs, the pipeline stops that instruction from executing, and all pending instructions must complete first; then execution returns to that instruction.
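As a rough illustration of I2 finishing before I1, the sketch below assumes one instruction is issued per cycle in program order, that the instructions are independent of each other, and that each has a made-up execute latency. The function name `completion_cycles` and the latency values are hypothetical and do not describe any particular processor.

```python
def completion_cycles(program):
    """program: list of (name, execute_latency), issued one per cycle in order."""
    done = {}
    for issue_cycle, (name, latency) in enumerate(program, start=1):
        # Execution finishes 'latency' cycles after the instruction is issued.
        done[name] = issue_cycle + latency
    return done

# Hypothetical latencies: I1 is slow (e.g. a long operation), I2 and I3 are fast.
done = completion_cycles([("I1", 4), ("I2", 1), ("I3", 1)])
order = sorted(done, key=done.get)  # instructions sorted by finishing cycle
print(done)
print(order)
```

With these assumed latencies I2 and I3 finish executing before I1 even though I1 entered the pipeline first.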

