Data Buffers
  Temporary storage for data being moved or used.
  Usually implemented as FIFO (first in, first out) queues.
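
  A minimal sketch of a FIFO buffer, assuming Python's collections.deque
  as the queue. The byte values and the names buffer and drained are
  illustrative only.

```python
from collections import deque

# Hypothetical example: a small FIFO buffer holding bytes in transit.
buffer = deque()

# Producer side: arriving data is appended at the tail.
for byte in (0x10, 0x20, 0x30):
    buffer.append(byte)

# Consumer side: data leaves from the head, in arrival order.
drained = [buffer.popleft() for _ in range(len(buffer))]
print(drained)  # [16, 32, 48]
```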

Stack
  Temporary storage. 
  Usually implemented as FILO (first in, last out), also called LIFO.
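
  For contrast with a FIFO buffer, a minimal sketch of FILO behaviour
  using a plain Python list as the stack. The values pushed are
  illustrative only.

```python
# Hypothetical example: a list used as a stack.
stack = []

# Push three values; each goes on top of the previous one.
for value in (1, 2, 3):
    stack.append(value)

# Pop them all; the last value pushed comes off first.
popped = [stack.pop() for _ in range(3)]
print(popped)  # [3, 2, 1]
```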

CPU Cache
  Temporary storage.
  Predictable but non-linear storage of small blocks (lines) of data copied from memory.
    Usually keyed on the address of the data's memory location.
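
  One common way a cache is keyed on an address is to split the address
  into a tag, a set index and a byte offset. The sketch below assumes
  64-byte lines and 64 sets; both sizes and the function name
  split_address are assumptions for illustration, not values from
  these notes.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 64    # number of sets in the cache (assumed)

OFFSET_BITS = LINE_SIZE.bit_length() - 1   # 6 bits select a byte in the line
INDEX_BITS = NUM_SETS.bit_length() - 1     # 6 bits select the set

def split_address(addr):
    """Split an address into (tag, set index, byte offset)."""
    offset = addr & (LINE_SIZE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

print(split_address(0x12345))         # (18, 13, 5)
# An address 4096 bytes further on lands in the same set with a new tag,
# which is why the mapping is predictable but not linear.
print(split_address(0x12345 + 4096))  # (19, 13, 5)
```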

  
  source : en.wikipedia.org/wiki/CPU_cache

Cache - A compromise between registers and main memory.
  May be external or internal to CPU
  Wikipedia topics :
  CPU cache - general description of the nature of CPU caches.
       Sections 1-4 and 6.2
  Pentium - lists CPU designs with cache levels and sizes.

    Looks like main memory (from the point of view of the program and the CPU's data path).

    Incorporated into each core of a CPU on modern systems.

    Cost per byte falls between that of main memory and registers.

    Doesn't affect the 'architecture' of the CPU (programs see the same instruction set).

  Several levels - Trade-off between speed and quantity.   
    * Register files (multiple copies of CPU's work registers).

    * Level 0 - see  https://forum.beyond3d.com/showthread.php?t=54666
      Caches set up for specific functional units in CPU
        such as the ALU or floating point co-processor.
        # buffer may be a better term for this technique.

      Branch speculation support. Do the calculation on both branches
        but don't commit results until the branch is resolved.
 
      Instruction prefetch queue - Intel 8088/8086

    * Level 1 - Small quantity, 16-256 KiB (per core), on the CPU at CPU speed.
      Harvard architecture - separate caches for code and data.
      N-way set associative caches
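
      An N-way set associative cache can be sketched as a small
      simulation. The class below is a hypothetical model, not any real
      CPU's design: it assumes 64-byte lines and least-recently-used
      (LRU) replacement, and tracks only hits and misses.

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Toy N-way set associative cache with LRU replacement (assumed policy)."""

    def __init__(self, num_sets, ways, line_size=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # One OrderedDict per set: keys are tags, order tracks recency.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then filled)."""
        line = addr // self.line_size
        index = line % self.num_sets
        tag = line // self.num_sets
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)      # mark as most recently used
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)   # evict the least recently used tag
        entries[tag] = None
        return False

# Addresses 0, 256 and 512 all map to set 0 of a 4-set cache.
cache = SetAssociativeCache(num_sets=4, ways=2)
results = [cache.access(a) for a in (0, 256, 0, 512, 256)]
print(results)  # [False, False, True, False, False]
```

      With two ways, the set holds addresses 0 and 256 together, so the
      second access to 0 hits. Address 512 then evicts 256 (the least
      recently used line), so the final access to 256 misses again.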

    * Level 2 - 1-8 MiB - High speed, unified (mixed code/data).
      Each core most likely has its own dedicated portion of a single L2 cache.

      Set associative - cache split into sets; an address maps to one set
        and may occupy any way within that set.

      Static RAM external to the CPU on a custom bus (earlier designs),
        or
      integrated on the CPU chip (most current CPUs).

      On multi-core CPUs with allocated blocks :
        * A 4-core CPU with a 4 MiB, 4-way set associative cache :
          each core gets 1 MiB, organized as 4 ways of 256 KiB each.
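
      The arithmetic behind this example can be checked with a short
      calculation. The 64-byte line size used for the set count is an
      assumption; the notes do not give a line size.

```python
KIB = 1024
MIB = 1024 * KIB

total_cache = 4 * MIB   # whole L2 cache
cores = 4
ways = 4
line_size = 64          # bytes per line (assumed, not from the notes)

per_core = total_cache // cores   # each core's dedicated portion
per_way = per_core // ways        # size of one way within that portion
sets_per_core = per_way // line_size

print(per_core // MIB, "MiB per core")   # 1 MiB per core
print(per_way // KIB, "KiB per way")     # 256 KiB per way
print(sets_per_core, "sets per core")    # 4096 sets per core
```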
     
    * Level 3 - System level, but built from fast-response static RAM.
      Set associative, or direct mapped if very large.

      More commonly implemented as a cache shared between the cores of a
        multi-core CPU.
  
      see https://www.extremetech.com/computing/55662-top-tip-difference-between-l2-and-l3-cache

    * Level 4 - eDRAM (embedded DRAM) or external DRAM.
      Used to transfer data between the CPU and the graphics processor.
      Used to transfer data between the graphics processor or another high
        speed controller (Hypertransport?) and system SDRAM.
      Used as a victim cache (fully associative) :
        holds lines evicted from level 3 that may still be
        needed in the near future.

      Seems to come into and out of use over time.
         Possible size : 128 MB.
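
      The victim cache idea can be sketched as a small fully associative
      store that receives lines evicted from the level above it. The
      class and method names below are hypothetical, and oldest-first
      replacement is an assumed policy for illustration.

```python
from collections import OrderedDict

class VictimCache:
    """Toy fully associative victim cache (oldest line replaced first)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # line address -> data, oldest first

    def insert(self, line_addr, data):
        """Store a line that was just evicted from the level above."""
        self.lines[line_addr] = data
        self.lines.move_to_end(line_addr)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)   # drop the oldest victim

    def take(self, line_addr):
        """Return and remove the line if present, else None."""
        return self.lines.pop(line_addr, None)

victim = VictimCache(capacity=2)
victim.insert(0x100, "A")
victim.insert(0x200, "B")
victim.insert(0x300, "C")     # capacity exceeded: line 0x100 is dropped

print(victim.take(0x200))     # B  (recovered without going to DRAM)
print(victim.take(0x100))     # None  (already pushed out)
```

      Because any line can sit in any slot, a lookup needs only the line
      address, which is what fully associative means here.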

  See : https://arstechnica.com/gadgets/reviews/2002/07/caching.ars