Techniques to improve on simple fetch/execute cycle.
Simple pipelining
Multiple instructions in CPU
Different steps of the F/E cycle being processed on different instructions
at same time.
Pipelining can create conflicts.
Different steps wanting same resources
* fetch operand while writing results.
* attempting a conditional branch before the result it tests is finalized.
May offer 20-60% speed improvement.
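A rough way to see where the gain comes from is to count cycles. The sketch below is only an idealized model, not a description of any real CPU: it assumes every stage takes one cycle, and the names n_stages and stall_cycles are made up for illustration. Conflicts like the two listed above show up as extra stall cycles, which is why real gains are well below the ideal figure.

```python
def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int, stall_cycles: int = 0) -> int:
    """Fill the pipeline once, then finish one instruction per cycle.

    stall_cycles (an assumed input) is the total number of cycles lost
    to conflicts between stages.
    """
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


def speedup(n_instructions: int, n_stages: int, stall_cycles: int = 0) -> float:
    """Ratio of unpipelined time to pipelined time in this toy model."""
    return (unpipelined_cycles(n_instructions, n_stages)
            / pipelined_cycles(n_instructions, n_stages, stall_cycles))
```

With 100 instructions and 4 stages the model gives 400 cycles unpipelined and 103 pipelined; every stall cycle added to the second figure eats into the advantage.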
Datapath description
Improved Datapath
Super-scalar
Duplication or variation of complex task circuitry, most commonly the ALU.
Arithmetic Logic Units.
Copies often not symmetrical. One ALU performs addition better and the
other favors multiplication/division.
Floating point units.
Less competition for same resources.
Vector processing
A vector processor will apply a single instruction to multiple data units.
Units are a small set of identical processing circuits.
Customized version of super-scalar.
Useful in numeric tasks, not so much in word processing.
Intel supports SIMD instructions that operate on 4 integers at once.
small number of useful instructions.
fetch will read 4 32-bit values from a starting place in memory (array).
single arithmetic instruction applied to all 4 values.
- instructions may not be supported on older CPUs.
- requires compiler to recognize target CPU's ability.
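The fetch-4-values, one-instruction-on-all-4 idea can be modeled in a few lines. This is only a sketch of the semantics: Python still runs the lanes one after another, whereas the hardware does them together. The helper names load4 and simd_add are hypothetical, and wrapping results at 32 bits is an assumption about fixed-width lanes.

```python
LANES = 4            # values handled by one instruction
MASK32 = 0xFFFFFFFF  # keep each lane to 32 bits


def load4(memory, start):
    """Model of the fetch: read 4 consecutive 32-bit values from an array."""
    values = memory[start:start + LANES]
    if len(values) != LANES:
        raise ValueError("need 4 values starting at index %d" % start)
    return [v & MASK32 for v in values]


def simd_add(a, b):
    """Model of one arithmetic instruction applied to all 4 lanes.

    Each lane is added independently; results wrap at 32 bits (assumed).
    """
    return [(x + y) & MASK32 for x, y in zip(a, b)]
```

A scalar loop would need four separate add instructions for the same work; here the four additions count as one operation.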
Hyper-threading
CPU core simulates 2 CPUs.
Not quite true parallel processing.
Takes advantage of super-scalar features.
Duplicates architectural state
Control registers : status, interrupt mask, memory management.
Uses duplicate general purpose registers.
Allows two separate processes or threads to co-exist in CPU core.
If one thread is stalled waiting for something like I/O,
the other thread is allowed full access to resources.
Requires an OS that is multi-CPU enabled.
Invisible to program.
Up to 30% performance improvement but very application dependent.
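Why the gain depends so much on the application can be seen in a toy model. Everything here is an assumption made for illustration: a core with one execution unit, and each thread written as a string where "X" is a cycle that needs the unit and "W" is a cycle spent stalled. Real cores have several units and far more complicated scheduling.

```python
def run_back_to_back(threads):
    """One thread at a time: stalled cycles leave the core idle."""
    return sum(len(t) for t in threads)


def run_shared_core(threads):
    """Two (or more) hardware threads sharing one execution unit.

    Per cycle: every thread that is stalled ("W") just waits one step,
    and at most one thread that wants to execute ("X") gets the unit.
    First claim on the unit rotates between threads each cycle.
    """
    pos = [0] * len(threads)   # next step for each thread
    n = len(threads)
    cycles = 0
    turn = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        unit_busy = False
        for k in range(n):
            i = (turn + k) % n
            if pos[i] >= len(threads[i]):
                continue                 # this thread has finished
            if threads[i][pos[i]] == "W":
                pos[i] += 1              # waiting overlaps with other work
            elif not unit_busy:
                unit_busy = True         # this thread executes this cycle
                pos[i] += 1
        turn = (turn + 1) % n
        cycles += 1
    return cycles
```

For the streams "XWWX" and "XXXX" the model takes 8 cycles back to back and 6 when shared, because the second thread executes while the first is stalled. Two threads that never stall ("XX" and "XX") take 4 cycles either way, so there is no gain.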
Multi-core
https://www.howtogeek.com/194756/cpu-basics-multiple-cpus-cores-and-hyper-threading-explained/
Most of CPU's core circuits duplicated on same silicon chip.
2, 4, or 6 cores (8 now available)
Each core has its own level 1 caches.
May share single level 2 cache circuits.
Share single set of address, data,
and control lines connecting CPU to system buses.
Data lines now 64 bits (8 bytes) wide.
Multi-core requires different coding for a single application to take
advantage of multiple cores.
Super-scalar attempts to execute different parts of a single program
in parallel, whereas multi-core can run different programs.
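The "different coding" point can be sketched as follows: the program itself has to cut the job into pieces and hand them out. This is a minimal sketch using Python's standard concurrent.futures module; the task (a sum of squares) and the function names are made up for illustration, and 4 workers is an arbitrary choice.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def sum_squares(bounds):
    """Work for one core: sum i*i over the half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


def split_range(n, parts):
    """Cut [0, n) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(n, parts)
    chunks, lo = [], 0
    for k in range(parts):
        hi = lo + step + (1 if k < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def parallel_sum_squares(n, workers=4, executor_cls=ProcessPoolExecutor):
    """Divide the job, give one chunk to each worker, combine the results.

    executor_cls defaults to separate processes, which the OS can place
    on different cores. ThreadPoolExecutor has the same interface and is
    handy for checking the logic without starting processes.
    """
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(sum_squares, split_range(n, workers)))
```

On systems that start worker processes by launching a fresh interpreter, the call to parallel_sum_squares should sit under an `if __name__ == "__main__":` guard. A plain `sum(i * i for i in range(n))` gives the same answer but can only ever occupy one core.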
Multiple CPUs or Multiprocessing
Separate CPUs that share bus and memory.
Symmetrical
- many identical CPUs.
Often simple.
SIMD, single instruction, multiple data (vector).
MIMD, multiple instructions, multiple data (mesh).
# parallel processing - different CPUs all working on same task
(more common).
# multiprocessing - different CPUs acting on different tasks.
Asymmetrical
- different CPUs handle different system tasks.
Memory management unit, Math Co-processor
Video processor, Sound processor