Main memory
The fundamental memory facility of a computer is main memory, which is directly addressable by the cpu. While most computers now use cache memory for speed, a cache is not required. Main (primary) memory, on the other hand, is by definition part of a computer system and is needed to store the data being processed.
The cpu and the system busses are designed specifically for accessing and transferring data between memory and the cpu. The width of the address bus determines the number of addressable cells (usually bytes), and the width of the data bus determines the size of the cell being accessed.
There are exceptions to the cpu's ability to directly address all of main memory: either there is more primary memory than is physically addressable, or there is less physical memory than is addressable.
The first condition was addressed by EMS (the Expanded Memory Specification, also known as LIM), and the second by virtual memory. In both cases, a combination of system software and additional control circuitry, usually not on the cpu, takes care of resolving the physical address of memory.
EMS
The first situation seldom occurs in current systems. But in the early days of the personal computer, software existed, such as spreadsheets, that could use as much memory as it could be given.
However, the original Intel 8086 processor architecture provided only 1 Meg of address space, and parts of that were reserved for hardware interfaces.
The Expanded Memory Specification (LIM) solution was to create a hardware device that mapped a contiguous 64K block of memory, configured somewhere in the 784-960K range, in 16K page units. The hardware itself was capable of handling 8 Megs of memory and would swap 16K pages between the 64K block seen by the cpu and the 8 Megs on the hardware device. However, the memory could only be used for the data segment, not for code.
Each 16K page could be swapped out independently. The primary condition for using this mechanism was that the application had to be aware of it and have the correct procedures coded.
Memory used this way was slower than regular primary memory, both because of the swapping itself and the need for software procedure calls in the applications. However, it was still faster than physically swapping memory out to a secondary device to create space for a new block of data.
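The mapping scheme described above can be sketched as a small simulation. The class and method names are hypothetical, and the pool/window sizes are taken from the specification figures above (8 Megs of expanded memory, a 64K window of four 16K slots):

```python
# Sketch of the EMS idea: an 8 Meg pool of 16K pages on the card, and a
# 64K window of four 16K slots that the cpu can actually address.

PAGE_SIZE = 16 * 1024                         # 16K EMS pages
WINDOW_SLOTS = 4                              # 64K window = four 16K slots
POOL_PAGES = (8 * 1024 * 1024) // PAGE_SIZE   # 512 pages on the card

class EmsBoard:
    def __init__(self):
        # Expanded memory lives on the card, outside the cpu's address space.
        self.pool = [bytearray(PAGE_SIZE) for _ in range(POOL_PAGES)]
        # slot -> pool page currently mapped into the 64K window
        self.mapping = [None] * WINDOW_SLOTS

    def map_page(self, slot, pool_page):
        # The application asks the driver to map a 16K page into a slot.
        self.mapping[slot] = pool_page

    def read(self, window_addr):
        # Resolve an address inside the 64K window to the mapped pool page.
        slot, offset = divmod(window_addr, PAGE_SIZE)
        return self.pool[self.mapping[slot]][offset]

    def write(self, window_addr, value):
        slot, offset = divmod(window_addr, PAGE_SIZE)
        self.pool[self.mapping[slot]][offset] = value

board = EmsBoard()
board.map_page(0, 100)        # bring pool page 100 into slot 0
board.write(10, 42)           # write through the window
board.map_page(0, 7)          # remap the slot to a different page
board.map_page(1, 100)        # the old data is still in the pool
print(board.read(PAGE_SIZE + 10))   # -> 42
```

Note that remapping a slot loses nothing: the data stays in the pool on the card, which is why this is faster than swapping to a secondary device.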
Because the modern cpu is capable of directly addressing many megabytes of primary memory, EMS is no longer used in new applications. For older applications, Microsoft and other software providers created programs that could use the memory above the 1 Meg boundary as if it were memory on an EMS card. Examples of these programs were emm386 and qemm. In general, NT and the Microsoft operating systems released after NT no longer support EMM.
Overlays
Because needing more address space than there is physical primary memory is the more likely situation, several schemes have been created to address it.
A strictly software way is the use of overlays. In a program using overlays, each overlay is a portion of the program, much like a function or procedure. What makes it different is that it is stored as a separate file and loaded into memory only when needed. It also has to be able to swap itself out to secondary storage when the next overlay is needed. DLLs are an advanced form of overlays.
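The overlay idea can be sketched loosely as follows. All names are hypothetical, and source strings stand in for the separate files an overlay would really live in; the point is that only one overlay is resident at a time and loading a new one evicts the old:

```python
# Toy sketch of overlays: each "overlay" lives outside the program (here,
# source text standing in for a file on secondary storage) and is loaded
# into the single resident slot only when called.

ON_DISK = {
    "report": "def run(x):\n    return 'report for ' + x",
    "export": "def run(x):\n    return 'export of ' + x",
}

class OverlayManager:
    def __init__(self):
        self.resident_name = None
        self.resident = None          # only one overlay in memory at a time

    def call(self, name, arg):
        if self.resident_name != name:
            # "Swap": discard the current overlay, load the next from disk.
            namespace = {}
            exec(ON_DISK[name], namespace)
            self.resident_name, self.resident = name, namespace["run"]
        return self.resident(arg)

mgr = OverlayManager()
print(mgr.call("report", "Q3"))   # loads the report overlay
print(mgr.call("export", "Q3"))   # evicts it and loads the export overlay
```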
Virtual memory
On most modern cpus, the address bus is capable of addressing a very large block of physical primary memory. However, because of cost, most systems are not populated with that much actual memory.
Another scheme to make a limited amount of physical memory appear as if it were a larger memory space is virtual memory addressing.
In virtual memory addressing, physical memory is uniformly broken up into a set of blocks called page frames. The virtual address space (the range of addresses recognized by the cpu and its ISA) is broken up into blocks of the same size, called pages. A table is used to map the virtual pages to the available physical frames.
The data/programs stored in the virtual address space are usually stored on a secondary device and brought in as needed.
The page table is used to indicate if and where in physical memory a copy of the virtual page's actual data is stored. The page table has one entry for every virtual page. Each entry holds the address of a block in real memory and a presence flag indicating whether the virtual page is in real memory or in secondary storage. The virtual page's page number is the index into the table.
If the presence flag indicates that the virtual page is absent from physical memory, then one of the page frames in physical memory must be emptied and loaded from the secondary device. If it is present, then the real memory address stored in the page table at that virtual page index points to the page frame in physical memory where the program/data is stored.
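The lookup just described can be sketched in a few lines. The table contents here are hypothetical, but each entry holds exactly what the text names: a presence flag and, when present, a physical frame number:

```python
# Minimal sketch of a page table lookup: one entry per virtual page,
# each holding (present, frame).

PAGE_SIZE = 4096

# virtual page -> (present, frame); absent pages live on secondary storage
page_table = {0: (True, 3), 1: (False, None), 2: (True, 0)}

def translate(virtual_addr):
    vpage, offset = divmod(virtual_addr, PAGE_SIZE)
    present, frame = page_table[vpage]
    if not present:
        raise LookupError("page fault on virtual page %d" % vpage)
    return frame * PAGE_SIZE + offset   # physical address

print(translate(2 * PAGE_SIZE + 100))   # page 2 is in frame 0 -> 100
try:
    translate(1 * PAGE_SIZE)
except LookupError as e:
    print(e)                            # page fault on virtual page 1
```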
Translation between the address stated by the instruction (and placed on the address bus) and the actual physical address is performed by a hardware circuit called a memory management unit (MMU). This unit is capable of quickly accessing the page table and also has direct access to secondary storage.
In a virtual memory system, this access is transparent to a user's program and the ISA instructions. However, the speed of execution will be affected and the operating system may have to perform additional procedures.
When an address is accessed by the program, its virtual page number is found and the page table entry for that page is examined. If the page is not present, a page fault occurs; the required page is read in and the instruction is reissued.
A separate but related problem occurs if all of physical memory is full. One of the page frames, containing a page from another location in virtual memory, will have to be written out; the required page is then read in and the instruction reissued.
An instruction can cause several page faults before all the required data is available, and each page fault slows down the system.
On a small system, this page fault procedure is allowed to run its course and within a short time the physical memory will be loaded with the most used blocks of virtual memory and the page faults will taper off to a reasonable level (because of the principle of locality). This technique is called demand paging.
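The tapering the text describes can be shown with a tiny simulation. The function name and reference string are hypothetical; the point is that under demand paging, pages fault only on first touch, and a looping program (locality) stops faulting once its working set is resident:

```python
# Demand paging sketch: pages are loaded only when first referenced.

def demand_faults(references):
    resident = set()            # pages currently in physical frames
    faults = 0
    for page in references:
        if page not in resident:
            faults += 1         # page fault: load the page on demand
            resident.add(page)
    return faults

refs = [0, 1, 2] * 10           # a loop reusing the same three pages
print(demand_faults(refs))      # -> 3 faults total, all on the first pass
```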
However on a multitasking system, by the time all of the right data has been moved to physical memory to run a specific program, it may be time to swap it out for a different program.
An alternative to demand paging is anticipatory paging. In anticipatory paging either the program indicates the pages it will need or the operating system attempts to predict this by observing the demands made by a program. This is a more expensive and complex practice, but may be necessary on large multitasking machines.
Because data/code access is not evenly distributed across virtual memory but rather is grouped in clusters of pages referenced over and over during a given time period, it is possible to avoid swapping out those pages. A set of such pages is called a working set.
Deciding which pages stay in real memory and which go can be determined by one of several conceptual algorithms. These are essentially the same algorithms used with caching.
FIFO - First In First Out
LRU - Least Recently Used
LFU (LU) - Least Frequently Used
However, each of these can cause more problems than it cures if the wrong conditions occur. Suppose a set of n+1 virtual pages is fetched and accessed in order, but there are only n page frames to put them in. When page n+1 is accessed, there is a page fault, and under either of the first two algorithms page 1 will probably be paged out and page n+1 paged in. But if the next step is to branch back to virtual page 1 (which was just paged out), then another page will have to be paged out; it will be page 2. If this sequence of events continues, the results are far worse than picking a random page to page out.
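The worst case just described can be demonstrated directly. This is a sketch (hypothetical function name) of FIFO replacement, which behaves the same as LRU on this particular cycling access pattern: with n frames and n+1 pages referenced in a loop, the page just evicted is always the next one needed, so every reference faults:

```python
# FIFO page replacement applied to the pathological cycling pattern.

def fifo_faults(references, n_frames):
    frames = []                 # oldest resident page at the front
    faults = 0
    for page in references:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                frames.pop(0)   # FIFO: evict the oldest resident page
            frames.append(page)
    return faults

n = 4
refs = list(range(n + 1)) * 5   # cycle through n+1 pages: 25 references
print(fifo_faults(refs, n))     # -> 25: a fault on every single access
```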
On the other hand, the third technique requires more complex logic to track usage, and if there is a tie it still has to fall back to one of the other techniques or make a best guess.
When a program creates a situation with constant page faults, it is referred to as thrashing.
A virtual memory system is often designed to leave a few of its physical page frames empty, perhaps by swapping out the least recently used frames if they are older than a certain time, to protect against situations such as thrashing.
Another technique to limit the time spent swapping two pages is to note whether the page in physical memory has actually been changed. A changed page is called dirty, whereas an unmodified page is clean. Pages that contain only instruction code are usually not modified and can simply be discarded rather than written back to secondary storage.
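The clean/dirty distinction can be sketched as follows (class and variable names are hypothetical): on eviction, a clean page is simply dropped because the copy on secondary storage is still valid, while a dirty page must be written back first:

```python
# Sketch of eviction with a dirty bit.

class Frame:
    def __init__(self, page):
        self.page = page
        self.dirty = False      # set when the program writes into the page

writebacks = []                 # pages that cost a write to secondary storage

def evict(frame):
    if frame.dirty:
        writebacks.append(frame.page)   # dirty: must be written back
    # clean: just dropped; the copy on secondary storage is still valid

code = Frame("code_page")       # instruction code: read-only, stays clean
data = Frame("data_page")
data.dirty = True               # the program wrote into this page

evict(code)
evict(data)
print(writebacks)               # -> ['data_page']: only the dirty page
```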
An example:
If a system has 512K of physical memory and a 4 Gig (32-bit) address bus, the memory could be broken up into 64K pages. This would yield 8 physical page frames of 64K each and 64K virtual pages of 64K each. Each page table entry needs 1 bit for the present/absent flag and 3 bits to identify which of the 8 physical frames holds the virtual page; 64K entries at 4 bits each gives a page table of 32K bytes.
When a reference to memory in a virtual page is made, the upper 16 bits of the virtual address are used to select the virtual page's table entry. If the page is absent, then a physical page frame is allocated and its frame number stored in the page table entry. If the page is present, then the page table entry is used to select the specific physical page frame to access.
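The address split in the example can be traced in code. The table contents below are hypothetical; the bit positions come directly from the example (64K pages, so a 16-bit offset and a 16-bit virtual page number):

```python
# Splitting a 32-bit virtual address for the 64K-page example above.

PAGE_BITS = 16                       # 64K pages -> 16-bit offset

def split(vaddr):
    vpage = vaddr >> PAGE_BITS       # upper 16 bits: virtual page number
    offset = vaddr & 0xFFFF          # lower 16 bits: offset within page
    return vpage, offset

# Suppose virtual page 0x0005 is present in physical frame 3 (of the 8).
table = {0x0005: 3}

vpage, offset = split(0x00051234)
frame = table[vpage]
physical = (frame << PAGE_BITS) | offset
print(hex(physical))                 # -> 0x31234
```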
In the real world, unless the ratio between the number of virtual pages and physical page frames is reasonably small, a single level table is impractical because of its size.
A practical solution is to create a two-level table. The first level functions as above, but each of its entries represents a set of pages rather than a single one.
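A two-level lookup can be sketched as follows. The names and the 10+10+12 bit split are hypothetical (one common choice for a 32-bit address); the saving comes from allocating a second-level table only for regions of the virtual space actually in use:

```python
# Two-level page table sketch: a small top-level table whose entries
# point to lazily allocated second-level tables.

SECOND_BITS, OFFSET_BITS = 10, 12    # 10 + 10 + 12 split of 32 bits

top = {}   # top-level index -> second-level table (created on demand)

def map_page(vpage, frame):
    hi, lo = vpage >> SECOND_BITS, vpage & ((1 << SECOND_BITS) - 1)
    top.setdefault(hi, {})[lo] = frame   # allocate second level lazily

def lookup(vaddr):
    vpage = vaddr >> OFFSET_BITS
    hi, lo = vpage >> SECOND_BITS, vpage & ((1 << SECOND_BITS) - 1)
    second = top.get(hi)
    if second is None or lo not in second:
        raise LookupError("page fault")
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    return (second[lo] << OFFSET_BITS) | offset

map_page(0x00300, 7)                        # map one virtual page to frame 7
print(hex(lookup((0x00300 << 12) + 0xAB)))  # -> 0x70ab
print(len(top))                             # only 1 second-level table exists
```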