Pipelining
Wikipedia topics:
Instruction pipeline
Classic RISC pipeline
* Early processor designs often required one read to fetch the instruction
opcode, followed by additional reads to fetch operands once decoding had
started.
* Because memory access is often slower than activity inside the CPU, fetching
a complete instruction in a single read can save clock cycles.
* RISC architectures use a single read to obtain the whole instruction,
i.e. each instruction is the same size as the data bus.
However, to access the full address range, a RISC architecture may resort to
indirect addressing, where a register holds the target memory address.
Setting this register may require multiple instructions.
* Modern CPUs (32-, 64-, or 128-bit data bus) are capable of fetching a whole
instruction, or even multiple instructions, in a single read, even when
instructions vary in length.
* CISC systems often support complex indirect addressing modes and combine
data fetch with more complex activity (for example, multiplying a register by
a value found in memory), which can cause delays in the fetch-execute cycle.
RISC separates moving data in and out of the CPU from more complex tasks
such as addition or multiplication, allowing those operations only between
registers in the CPU.
* Another feature of RISC is to provide a set of simpler, faster instructions
and let the programmer build more complex operations in software, only as
needed.
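The multi-instruction address setup and the load/store split described in the bullets above can be sketched with a toy register machine. Everything in it is hypothetical: the instruction names (`lui`, `ori`, `load`, `mul`), the 16-bit immediate field, and the 32-bit address are illustration choices, not any real instruction set.

```python
# Toy load/store machine. All instruction names and field widths are
# hypothetical, chosen only to illustrate the bullets above.

def run(program, memory):
    """Execute a list of (op, dest, a, b) tuples and return the registers."""
    regs = {}
    for op, dest, a, b in program:
        if op == "lui":        # load a 16-bit immediate into the upper half
            regs[dest] = (a & 0xFFFF) << 16
        elif op == "ori":      # OR a 16-bit immediate into the lower half
            regs[dest] = regs[a] | (b & 0xFFFF)
        elif op == "load":     # the only instruction that touches memory
            regs[dest] = memory[regs[a]]
        elif op == "mul":      # arithmetic works on registers only
            regs[dest] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs

ADDRESS = 0x12345678            # too wide for one 16-bit immediate field
memory = {ADDRESS: 6}

program = [
    ("lui",  "r1", ADDRESS >> 16, None),       # r1 = 0x12340000
    ("ori",  "r1", "r1", ADDRESS & 0xFFFF),    # r1 = 0x12345678
    ("load", "r2", "r1", None),                # r2 = memory[r1]
    ("lui",  "r3", 0, None),                   # r3 = 0
    ("ori",  "r3", "r3", 7),                   # r3 = 7
    ("mul",  "r4", "r2", "r3"),                # r4 = r2 * r3, no memory access
]

regs = run(program, memory)
print(hex(regs["r1"]), regs["r4"])   # prints: 0x12345678 42
```

A CISC-style machine might fold the final load and multiply into one instruction with a memory operand; in this sketch they stay separate, so every instruction is a single simple step.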
Non-pipelined - 1 MHz clock
[Diagram: Instruction type 1, Instruction type 2, Instruction type 3]
1 clock per step in the FE (fetch-execute) cycle * 4 steps = 4 clocks per instruction.
1,000,000 cycles per sec. / 4 cycles per instruction = 250,000 instructions per sec.
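The non-pipelined arithmetic can be written as a short Python sketch. The function name and parameters are made up for illustration; it assumes every step takes the same number of clocks and that no two instructions overlap.

```python
def non_pipelined_rate(clock_hz, steps, clocks_per_step=1):
    """Instructions per second when each instruction finishes all of its
    steps before the next one starts."""
    clocks_per_instruction = steps * clocks_per_step
    return clock_hz / clocks_per_instruction

rate = non_pipelined_rate(clock_hz=1_000_000, steps=4)
print(f"{rate:,.0f} instructions/sec")   # prints: 250,000 instructions/sec
```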
Pipelined - 1 MHz clock - example assumes a single instruction type for simplicity.
[Diagram: Instruction type 1, Instruction type 2, Instruction type 3]
Since all steps take the same time, the longest step is 1 clock cycle, so once the pipeline is full it completes 1 instruction per clock.
1,000,000 cycles per sec. / 1 cycle per instruction = 1,000,000 instructions per sec.
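A small sketch of why the pipelined rate approaches one instruction per clock. It assumes a 4-step pipeline (the step count used in the non-pipelined example), one clock per step, and no stalls; the function names are hypothetical.

```python
def non_pipelined_clocks(n_instructions, steps):
    # Each instruction runs all of its steps before the next one starts.
    return n_instructions * steps

def pipelined_clocks(n_instructions, steps):
    # The first instruction needs `steps` clocks to fill the pipeline;
    # after that, one instruction completes on every clock.
    if n_instructions == 0:
        return 0
    return steps + (n_instructions - 1)

CLOCK_HZ = 1_000_000
STEPS = 4

for n in (1, 10, 1_000_000):
    clocks = pipelined_clocks(n, STEPS)
    rate = n / clocks * CLOCK_HZ
    print(f"{n:>9,} instructions: {clocks:>9,} clocks, {rate:,.0f} ins/sec")
```

A single instruction still takes all 4 clocks, so pipelining does not shorten any one instruction; the gain shows up only as the run gets long enough that the fill time becomes negligible.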
Super-scalar pipeline (execution step) - 1 MHz clock - a single instruction type for simplicity.
[Diagram: Instruction type 1, Instruction type 2, Instruction type 3]
Since all steps take the same time, making the execution step super-scalar has no overall effect: still 1 instruction per clock.
1,000,000 cycles per sec. / 1 cycle per instruction = 1,000,000 instructions per sec.
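One way to see why extra execution units do not help in this example: in a simple model, steady-state throughput is set by the slowest step, where a step's effective time is its clock count divided by the number of parallel units it has. The model, the stage names, and the function below are assumptions for illustration, not a description of any particular CPU.

```python
def instructions_per_clock(stages):
    """stages: list of (name, clocks_per_instruction, parallel_units).
    Returns steady-state instructions per clock, limited by the slowest stage."""
    slowest = max(clocks / units for _name, clocks, units in stages)
    return 1 / slowest

scalar = [("fetch", 1, 1), ("decode", 1, 1), ("execute", 1, 1), ("store", 1, 1)]
# Super-scalar in the execute step only: two execution units.
superscalar = [("fetch", 1, 1), ("decode", 1, 1), ("execute", 1, 2), ("store", 1, 1)]
# For contrast: if execute took 2 clocks, a second unit would pay off.
slow_execute = [("fetch", 1, 1), ("decode", 1, 1), ("execute", 2, 1), ("store", 1, 1)]
slow_execute_x2 = [("fetch", 1, 1), ("decode", 1, 1), ("execute", 2, 2), ("store", 1, 1)]

CLOCK_HZ = 1_000_000
for label, stages in [("scalar", scalar), ("super-scalar", superscalar),
                      ("slow execute", slow_execute),
                      ("slow execute, 2 units", slow_execute_x2)]:
    print(f"{label}: {instructions_per_clock(stages) * CLOCK_HZ:,.0f} ins/sec")
```

Under these assumptions the scalar and super-scalar pipelines both reach 1,000,000 ins/sec, because fetch, decode, and store still pass only one instruction per clock. The second execution unit matters only in the contrast case, where execute is the slowest step.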
Check out:
https://arstechnica.com/old/content/2004/09/pipelining-1.ars
and
https://arstechnica.com/old/content/2004/09/pipelining-2.ars