Secondary Storage - Disk Drives.

Theory

There are two ways to access data. You can go directly to the data, or you can go to the start of a block of data and then examine each piece until you find what you are looking for.

Direct access is quicker, but to go directly to the data you must have a map of where all the data is. At some point, this map itself becomes data that needs a map. So there must be a compromise.

Think of a dictionary with section tabs. You can pick it up and flip to the letter C or even some subset of it. But once you have opened to that section, you must scan down the column until you find the word of interest.
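The dictionary analogy can be sketched in a few lines of Python. This is only an illustration: the word list, the tabs table, and the lookup function are made-up names, not part of any real system. The tab gives direct access to the start of a section, and the rest is a sequential scan.

```python
# Hypothetical sorted word list standing in for the dictionary pages.
words = ["apple", "banana", "cab", "cat", "cylinder", "disk", "drive", "sector"]

# Build the "tabs": first letter -> position of the first word with that letter.
tabs = {}
for position, word in enumerate(words):
    tabs.setdefault(word[0], position)


def lookup(target):
    """Jump directly to the tab, then scan sequentially within that section.

    Returns (position, words_examined), or None if the word is not present.
    """
    start = tabs.get(target[0])
    if start is None:
        return None  # there is no section for this letter
    examined = 0
    for position in range(start, len(words)):
        word = words[position]
        if word[0] != target[0]:
            break  # ran past the end of the section
        examined += 1
        if word == target:
            return position, examined
    return None
```

Note that the tabs table is itself data that has to be built and stored, which is the compromise described above: more tabs mean shorter scans but a bigger map.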

Or picture a file cabinet. It has drawers that are stacked in rows and columns, and any drawer can be directly accessed. Once in the drawer, there is a simple sequence that can be skip searched. But once you find the folder you are interested in, you usually have to look at each page sequentially. You can add more drawers and folders to speed up the access time, but that costs room and expense for property that is not the papers you are filing. Additionally, you must spend more time creating and maintaining a categorizing system.

Practice

One of the primary secondary memory devices is the disk drive, with the Winchester (hard drive) design being the most popular.

A hard drive is an arrangement of magnetic media on the surfaces of a flat circular disk that allows the system to quickly and directly access any part of that disk surface. Early hard drives were physically large, up to 8". With the advent of the PC, this first dropped down to 5 1/4" and then to around 3 1/2". With the advent of laptops, drives have become even smaller.

A hard drive uses one or more magnetic read/write heads, similar to those of a tape recorder, to access or change the data on the disk. However, unlike a tape recorder, the head is designed not to touch the platter but rather to literally fly over it. The platter is spun at a high speed, which causes a layer of air to lift the head off the surface. This buffer of air allows the platter to continue spinning at high speed without wear to either the platter or the head. Typical rotational speeds are 3600, 7200, or 10,000 revolutions per minute (RPM).

This high speed allows very fast data transfers.

The data on the disk is stored in a set of concentric tracks on the platter surfaces. The set of tracks that fall at the same location on all of the platters is called a cylinder.

The tracks are divided into blocks called sectors.
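To make the cylinder, track, and sector layout concrete, here is a small Python sketch of the conventional way to turn a (cylinder, head, sector) position into a single running sector number. The function name and the geometry used in the example (4 heads, 17 sectors per track) are assumptions for illustration only; real drives vary.

```python
def chs_to_linear(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Map a (cylinder, head, sector) position to one running sector number.

    Cylinders and heads are counted from 0; sectors are counted from 1,
    which is the usual convention for this kind of addressing.
    """
    if cylinder < 0:
        raise ValueError("cylinder out of range")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    tracks_before = cylinder * heads_per_cylinder + head
    return tracks_before * sectors_per_track + (sector - 1)


# Assumed geometry: 4 heads (platter surfaces), 17 sectors per track.
first_sector_of_cylinder_1 = chs_to_linear(1, 0, 1, 4, 17)
```

All the tracks in one cylinder are numbered before moving on to the next cylinder, because switching heads does not require moving the head assembly.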

The use of sectors is twofold. First, although the advantage of the disk design is the ability to quickly position the r/w head anywhere, there has to be some way to determine where you are, to make sure you are where you want to be. The track design can get you part of the way there, but once you are over the track, how do you know where the data starts and ends?

Secondly, a sector overcomes the problem of having to address every single byte or bit of data on the device. If every byte had its own address, the address data would take more room than the data itself. So the compromise of addressing blocks of data by sector still offers most of the advantages of random accessing ability while only requiring sequential access of a small sector of data at a time. The trick is to decide what is a good sector size that allows very quick access to data on a block that must be scanned sequentially without wasting resources on the overhead.
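As a rough sketch of this compromise, the Python below splits a byte position into a sector number (the part that is addressed directly) and an offset inside that sector (the part that is reached by reading the sector sequentially). The 512 byte sector size and the function name are assumptions for illustration.

```python
SECTOR_SIZE = 512  # assumed number of data bytes per sector


def locate(byte_offset, sector_size=SECTOR_SIZE):
    """Return (sector number, offset inside that sector) for a byte offset."""
    if byte_offset < 0:
        raise ValueError("byte_offset must be non-negative")
    # The quotient is the directly addressed sector; the remainder is how
    # far into that sector the byte sits.
    return divmod(byte_offset, sector_size)


sector_number, offset_in_sector = locate(1000)
```

Only the sector number needs an address on the device. The byte itself is found by reading the whole sector and counting in from its start.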

Many systems have settled on a sector size of 512 bytes. Note that the actual size of a sector on the disk is larger, because it contains synchronization and addressing information at the beginning and error correcting data at the end.

There is also an inter-sector gap. Sectors must be spaced apart so that the system can recognize when each one is passing under the read head.

The 512 byte size has come about for several reasons.

First, although it is historically true that electronics are faster than mechanical devices, the design of the hard drive allows data to be transferred at rates equal to those of the interface devices and bus architecture. As a result, the hard drive can read or write a sector of data very quickly. This data must then be transferred to or from primary memory. Because of this, the amount of data transferred between primary memory and the hard drive at one time must be small enough to prevent loss without slowing the transfer down.

Second, there is a waste of disk space because of the sequential nature of data access within a sector. Even if a sector contains only two valid bytes of data, the whole sector is used to store it. There is no partial sector addressing. So, the smaller the sector, the more efficient the use of disk space.
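The wasted space is easy to put numbers on. The sketch below is illustrative only (the function name is made up): it counts how many whole sectors a piece of data occupies and how many bytes in the last sector go unused.

```python
def sectors_and_slack(data_bytes, sector_bytes=512):
    """Return (sectors used, unused bytes in the last sector)."""
    if data_bytes < 1 or sector_bytes < 1:
        raise ValueError("sizes must be at least 1 byte")
    # Round up: a partly filled sector still occupies a whole sector.
    sectors = -(-data_bytes // sector_bytes)
    slack = sectors * sector_bytes - data_bytes
    return sectors, slack


# Two valid bytes still occupy a whole sector, whatever the sector size.
two_bytes_small_sector = sectors_and_slack(2, 512)
two_bytes_large_sector = sectors_and_slack(2, 4096)
```

With a 512 byte sector the two byte case wastes 510 bytes. With a hypothetical 4096 byte sector the same two bytes would waste 4094.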

And third, there is a problem that counterbalances the second. If smaller sectors are used, a record of what they contain must still be kept. The more sectors there are, the larger the table of location information. Also, remember that each sector has addressing and error checking data, so storage is lost to information that is not data. The inter-sector gaps also add up with each additional sector.
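The effect of per-sector overhead can be sketched the same way. The figures here are assumptions picked for illustration, not measurements from any real drive: a track that holds 20000 raw bytes, and 60 bytes of overhead per sector (addressing, error checking, and gap combined).

```python
def usable_bytes_per_track(raw_track_bytes, sector_data_bytes, overhead_bytes):
    """Data bytes that fit on one track once per-sector overhead is paid."""
    per_sector = sector_data_bytes + overhead_bytes
    sector_count = raw_track_bytes // per_sector  # only whole sectors fit
    return sector_count * sector_data_bytes


RAW_TRACK_BYTES = 20000  # assumed raw capacity of one track
OVERHEAD_BYTES = 60      # assumed header + error checking + gap per sector

for size in (128, 256, 512, 1024):
    usable = usable_bytes_per_track(RAW_TRACK_BYTES, size, OVERHEAD_BYTES)
    print(size, usable, round(usable / RAW_TRACK_BYTES, 3))
```

Under these assumed numbers, smaller sectors leave noticeably less of the track for data, which is the pressure that pushes sector sizes up while the wasted partial sectors push them down.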

Response time

The response time is determined by several features: the size and speed of the transfer bus between the drive and the system, the speed at which data can be transferred to or from the platter itself, and how fast the r/w head can be placed over the correct data.

The third feature has two aspects, seek time and rotational latency.

Seek time is how long it takes to move the head over the correct cylinder or track. The worst case scenario is that the drive head is at the outermost track and the next access is on the innermost track (or vice versa). When a seek time is quoted, it is an average and usually represents a move halfway across the disk. With the newer, smaller drives, this time has gotten much smaller.

Rotational latency is how long it takes for the correct sector to appear under the head.
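Rotational latency follows directly from the spindle speed. The sketch below assumes the wanted sector is equally likely to be anywhere on the track, so on average the drive waits half a revolution. The function names and the 9 ms seek time in the example are illustrative assumptions.

```python
def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm):
    """Average wait for a sector: half a revolution."""
    return rotation_time_ms(rpm) / 2.0


def estimated_positioning_ms(average_seek_ms, rpm):
    """Average time to place the head over the data: seek plus rotation."""
    return average_seek_ms + average_rotational_latency_ms(rpm)


for rpm in (3600, 7200):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))

# Assumed 9 ms average seek on a 7200 RPM drive.
example_ms = estimated_positioning_ms(9.0, 7200)
```

This covers only the positioning part of the response time. The transfer of the data itself over the bus comes on top of it.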

Interleaving

Interleaving is a process by which the sectors that make up a file are spaced around the track so that the time needed to read one sector and transfer its data to the CPU/memory matches the time (rotational latency) needed to spin the next sector under the read head, which minimizes the wait. So, if there were 6 sectors on a track and all six were being read, they might be laid out in the sequence 1-4-2-5-3-6.

On newer drives, the issue of interleaving is bypassed by including an internal buffer in the drive's control circuits that allows the drive to read the whole track in a single read access. The appropriate sectors are then accessed from this cache.
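The 1-4-2-5-3-6 sequence is what a 2:1 interleave produces on a six sector track. The Python below is a minimal sketch of how such a layout can be generated. The function name and the way the interleave is expressed (as a step between consecutive logical sectors) are assumptions for illustration, not how any particular controller does it.

```python
def interleaved_layout(sector_count, interleave):
    """Return the logical sector numbers in physical order around the track.

    interleave is the number of physical slots to advance between
    consecutive logical sectors (1 means no interleaving).
    """
    if sector_count < 1 or interleave < 1:
        raise ValueError("sector_count and interleave must be at least 1")
    layout = [None] * sector_count
    position = 0
    for logical in range(1, sector_count + 1):
        # If the slot is already taken, slide forward to the next free one.
        while layout[position] is not None:
            position = (position + 1) % sector_count
        layout[position] = logical
        position = (position + interleave) % sector_count
    return layout


six_sector_track = interleaved_layout(6, 2)
```

With an interleave of 2, the head has one full sector time to hand off the data it just read before the next logical sector arrives.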

Control

Disks are controlled by a combination of firmware in the computer system's ROM BIOS or kernel and a controller (interface) card. The system side tends to be very generic and communicates with the controller card using a published protocol, which would include things such as what various interrupts mean and how the