Issues of optimal data access
In order to access data, the read/write head must be correctly
positioned. This requires two steps.
Finding the track.
Finding the sector.
Seek time - time required to move the head to the correct track.
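Average seek time is roughly max/2 (a rule of thumb; for uniformly random requests the average seek distance is closer to one third of the full stroke). Physically smaller drives offer quicker seeks.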
Rotational latency - time required for platter to spin around to the
correct location (sector on track).
Faster rotational speed means faster access.
Higher density means more bits under read head for same RPM.
Newer drives also buffer whole track once found.
Zoning improves density on outer tracks.
Access time - combination of seek time and rotational latency.
Transfer time - access time and time to actually transfer data.
Newer high-density drives actually deliver more data in the same time because
the bits are closer together.
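The relationship between the four timing terms above can be sketched in Python. This is a minimal sketch, not a drive model: the function names and parameters (avg_seek_ms, bytes_per_track) are illustrative assumptions, the average rotational latency is taken as half a revolution, and zoning, caching, and bus delays are ignored.

```python
def rotational_latency_ms(rpm, fraction_of_turn=0.5):
    """Average wait for the target sector to rotate under the head.

    Assumes the sector is, on average, half a revolution away.
    """
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution * fraction_of_turn


def access_time_ms(avg_seek_ms, rpm):
    """Access time = seek time + rotational latency."""
    return avg_seek_ms + rotational_latency_ms(rpm)


def transfer_time_ms(avg_seek_ms, rpm, bytes_requested, bytes_per_track):
    """Access time plus the time the data itself takes to pass under the head.

    Assumes the request fits on one track with a fixed number of bytes
    per track (no zoning).
    """
    ms_per_revolution = 60_000.0 / rpm
    read_ms = ms_per_revolution * (bytes_requested / bytes_per_track)
    return access_time_ms(avg_seek_ms, rpm) + read_ms


print(round(rotational_latency_ms(7200), 2))   # 4.17
print(round(access_time_ms(9.0, 7200), 2))     # 13.17
```

Doubling bytes_per_track (higher density) halves read_ms for the same request at the same RPM, which is the effect described for newer high-density drives.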
Some advanced drives have/had multiple read heads on a single actuator arm
to reduce seek delays. These are sometimes found on high-end servers.
Experiments with drives with read heads on independent actuators.
* Conner made a drive with 2 arms that could do 2 r/w at a time. Expense
made it impractical, especially with caching on modern drives.
* Other advances (see www.tomshardware.com)
* Heat assisted magnetic recording
see : https://blog.seagate.com/craftsman-ship/hamr-next-leap-forward-now/
RAID architecture can accomplish the same more cheaply.
Other issues that contribute to data transfer latency.
The transferring of data from drive to CPU/memory incurs additional
delays that will be covered under the discussions of buses.
Interleaving
Problem
Block of data often occupies several sectors or clusters.
Locating a sector is primarily a mechanical action.
Data is processed and moved to/from the drive one sector at a time,
and further processed before being sent to the CPU.
Result
Start of next physical sector/cluster may have moved past read/write head
by time device ready for processing next sector.
Also, when writing, available sectors may not be conveniently located.
Both issues cause the accessing of data to be delayed.
Solution
Interleaving - allows for optimal placement of sectors. Sectors spaced
so that next sector in access sequence is available just after
processing of previous sector finished.
This requires that the data on the drive be rearranged for best
positioning of available and used sectors.
Most drives perform this as long as space is available.
On-drive Buffer (cache) - read and buffer all sectors found on current
track even if not immediate target.
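The sector spacing described under Solution can be sketched as a layout function. This is a minimal sketch assuming the classic fixed interleave factor, where an N:1 interleave places consecutive logical sectors N physical slots apart; the function name and parameters are illustrative only.

```python
def interleave_layout(sectors_per_track, interleave):
    """Return a list where index = physical slot, value = logical sector.

    With interleave 1 the sectors are consecutive; with interleave 2 one
    physical slot is skipped between consecutive logical sectors, giving
    the controller time to process the previous sector.
    """
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        # If the computed slot is already used, take the next free one.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return layout


print(interleave_layout(8, 1))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(interleave_layout(8, 2))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

With the 2:1 layout, logical sector 1 arrives under the head one sector-time after sector 0 has passed, so a full track can be read in two revolutions instead of one revolution per sector.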
Problem
Fragmentation - when trying to write a block to an optimal location,
location already in use. Causes fragmentation of file on scale
larger than track.
Defragmentation - occasionally relocating sectors of individual files so
they are optimally arranged for access. Also available space is
consolidated to provide best configuration for storing new data.
Scheduled defragmentation available on current versions of Windows.
Linux stores files in a way that seldom causes fragmentation
unless the drive is > 80% full.
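One way to picture fragmentation is to count how many separate contiguous runs a file occupies. The sketch below assumes a file is described by the ordered list of block numbers it uses (a hypothetical representation; real file systems expose this differently); defragmentation aims to bring the count down to 1.

```python
def count_fragments(blocks):
    """Count contiguous runs in a file's ordered list of block numbers."""
    if not blocks:
        return 0
    fragments = 1
    for previous, current in zip(blocks, blocks[1:]):
        # A new fragment starts whenever the next block is not adjacent.
        if current != previous + 1:
            fragments += 1
    return fragments


print(count_fragments([10, 11, 12, 40, 41, 7]))  # 3
print(count_fragments([5, 6, 7]))                # 1
```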
Interfaces
Hard drives are mechanical, magnetic, and analog.
CPUs and memory are electronic and digital.
To connect these two dissimilar environments requires an interface.
An interface (on PC bus and on device itself)
Translates (transparently) signals (levels and meanings).
Encodes/decodes data between storage friendly and CPU friendly.
RLL or other timing embedded data.
Also, error correcting info (Reed-Solomon)
* 'low-density parity check' starting to replace Reed-Solomon
* fewer bits but needed more processing (2009)
Handles timing issues and handshaking.
Buffers data.
Translates addresses (on newer systems).
Lies about real number of CHS.
Converts LBA to CHS (and zone configurations).
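The LBA/CHS address translation can be sketched with the conventional formulas. This is a minimal sketch assuming a fixed logical geometry (the same number of sectors on every track, with sectors numbered from 1); it ignores the zone configurations a real drive handles internally, and the function names are illustrative.

```python
def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Convert a logical block address to (cylinder, head, sector)."""
    cylinder = lba // (heads_per_cylinder * sectors_per_track)
    head = (lba // sectors_per_track) % heads_per_cylinder
    sector = (lba % sectors_per_track) + 1  # CHS sectors start at 1
    return cylinder, head, sector


def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Inverse of lba_to_chs for the same logical geometry."""
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


# Example with the common 16-head, 63-sector logical geometry.
print(lba_to_chs(0, 16, 63))     # (0, 0, 1)
print(lba_to_chs(1008, 16, 63))  # (1, 0, 1)
```

The geometry passed in is the one the interface reports, which need not match the physical platters and heads inside the drive.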
Original Encoding types
Worth reading : https://www.pcguide.com/ref/hdd/geom/data.htm
General
Uses small magnetic dots (domains) to record data.
Flux reversals (changes in polarity), rather than the polarity itself,
represent data; easier to detect and non-arbitrary.
Allows for embedded timing, even if long strings of just 1's or 0's
Early drives used iron oxide, easy to magnetize but required larger
domains to detect.
Newer drives use a cobalt, chromium, platinum based alloy. Harder to
magnetize but allows creation of a much smaller domain (higher
density, closer tracks).
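The idea that reversals rather than polarity carry the data can be sketched with NRZI-style encoding, where a 1 is written as a flux reversal and a 0 as no reversal. This is a minimal sketch: the polarity values 0 and 1 stand in for the two magnetic directions, and the function names are illustrative.

```python
def nrzi_encode(bits, start_polarity=0):
    """Return the polarity of each bit cell: a 1 flips it, a 0 keeps it."""
    polarity = start_polarity
    cells = []
    for bit in bits:
        if bit:
            polarity ^= 1  # flux reversal
        cells.append(polarity)
    return cells


def nrzi_decode(cells, start_polarity=0):
    """Recover the bits by detecting where the polarity changed."""
    previous = start_polarity
    bits = []
    for polarity in cells:
        bits.append(1 if polarity != previous else 0)
        previous = polarity
    return bits


data = [1, 0, 1, 1, 0]
print(nrzi_encode(data))                    # [1, 1, 0, 1, 1]
print(nrzi_encode(data, start_polarity=1))  # [0, 0, 1, 0, 0]
```

Both encodings decode to the same data even though every polarity is inverted, which is why the absolute direction of a domain does not matter. Note that a long string of 0's produces no reversals at all here, so this scheme alone does not provide the embedded timing that MFM and RLL add.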
MFM
Earliest PC interface.
Modified FM (modified frequency modulation) - technique for storing data
with timing information embedded in the data; low storage density.
Extensive software/firmware support to complete interface
Interface card responsible for most of the drive control.
Early systems broadcast the tones read off of disk to a decoder
card on the bus.
The data encoding aspect of MFM still used with floppies.
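The MFM encoding rule can be sketched as follows. Each data bit becomes two cells, a clock cell followed by the data cell, and a 1 in a cell means a flux reversal. The clock cell is set only between two consecutive 0 data bits. This is a minimal sketch; the function names and the prev_data_bit parameter (the bit assumed to precede the stream) are illustrative.

```python
def mfm_encode(data_bits, prev_data_bit=0):
    """Encode data bits as MFM cells (clock cell, data cell per bit)."""
    cells = []
    previous = prev_data_bit
    for bit in data_bits:
        # A clock reversal is inserted only between two 0 data bits.
        clock = 1 if (previous == 0 and bit == 0) else 0
        cells.extend([clock, bit])
        previous = bit
    return cells


def mfm_decode(cells):
    """Data bits are the second cell of every pair."""
    return cells[1::2]


print(mfm_encode([1, 0, 0, 1, 1, 0]))  # [0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0]
print(mfm_encode([0, 0, 0, 0]))        # [1, 0, 1, 0, 1, 0, 1, 0]
```

The second example shows the embedded timing: a run of 0 data bits still produces regular reversals for the decoder to lock onto.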
RLL - run length limited combined with NRZI.
Next stage in PC/hard-drive interface
Run length limited - modified data storage format (how long strings
of zeros handled)
50% improvement in storage density on same physical drive.
Extensive software/firmware support to complete interface
Interface card still responsible for most control.
ATA (PATA,SATA) drives put most of the decoding circuits close to
the read/write head to limit errors.
The data encoding aspect of RLL has continued to be used extensively,
even with the newer drives.
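RLL codes are usually described by a pair (d, k): between any two reversals (1 cells) there must be at least d and at most k cells without a reversal (0 cells). The sketch below does not implement an RLL encoder; it only checks whether a cell stream satisfies a given (d, k) constraint. The function names are illustrative, and the cell streams in the example are made up for demonstration.

```python
def zero_runs_between_ones(cells):
    """Lengths of the runs of 0 cells that lie between consecutive 1 cells."""
    runs = []
    run = None  # stays None until the first 1 is seen
    for cell in cells:
        if cell == 1:
            if run is not None:
                runs.append(run)
            run = 0
        elif run is not None:
            run += 1
    return runs


def satisfies_rll(cells, d, k):
    """True if every run of 0s between 1s has length in the range d..k."""
    return all(d <= run <= k for run in zero_runs_between_ones(cells))


print(satisfies_rll([0, 1, 0, 0, 1, 0, 0, 0, 1], 2, 7))     # True
print(satisfies_rll([1, 1, 0, 0, 1], 2, 7))                 # False
print(satisfies_rll([1, 0, 0, 0, 0, 0, 0, 0, 0, 1], 2, 7))  # False
```

A larger d lets reversals be packed more tightly on the disk for the same minimum spacing, while k bounds how long the decoder must go without a timing reference. MFM fits the same description as a (1, 3) code.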
PRML (EPRML) - partial response, maximum likelihood
Magnetic domains are actually an analog effect.
When a domain is read, it's read at its maximum value.
Better quality heads allow for smaller domains and denser data.
As the domains get smaller, harder to distinguish individual domains.
Domain is made weaker to allow closer packing without cross interference.
Switch to sampling to determine what flux (change) is occurring rather than
identifying a specific peak spot.
Several small samples of the waveform are taken, and analysis suggests the
most likely value for each sample.
Because sector also contains ECC data, PRML works reliably.
30%-40% improvement in data density.
PRML used on drives > 2GB.
EPRML (improved version) can offer another 20%-70% over PRML
Note that the data is still stored using RLL encoding.
E/PRML is independent of the technology such as IDE or SCSI.
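The "sample, then pick the most likely sequence" idea can be illustrated with a toy model. The sketch below makes several assumptions that are not from the notes above: the read channel is modeled as the class-4 partial response target (each ideal sample is the current bit minus the bit two positions earlier), the two bits before the stream are taken to be 0, and the most likely sequence is found by brute force over every candidate. Real detectors typically use the Viterbi algorithm for this search; brute force is used here only because it is short and obviously correct for a handful of bits.

```python
from itertools import product


def pr4_ideal_samples(bits):
    """Ideal noise-free samples: sample[k] = bit[k] - bit[k-2]."""
    padded = [0, 0] + list(bits)  # assumes two 0 bits precede the stream
    return [padded[i + 2] - padded[i] for i in range(len(bits))]


def ml_detect(samples):
    """Return the bit sequence whose ideal samples are closest to the input.

    Closest in squared error corresponds to most likely when the noise
    is Gaussian. Brute force: only practical for short sequences.
    """
    best_bits, best_error = None, float("inf")
    for candidate in product((0, 1), repeat=len(samples)):
        ideal = pr4_ideal_samples(candidate)
        error = sum((s - t) ** 2 for s, t in zip(samples, ideal))
        if error < best_error:
            best_bits, best_error = list(candidate), error
    return best_bits


bits = [1, 0, 1, 1, 0, 0, 1, 0]
print(pr4_ideal_samples(bits))  # [1, 0, 0, 1, -1, -1, 1, 0]

# Made-up noisy readings of the same waveform.
noisy = [1.2, -0.1, 0.3, 0.8, -0.9, -0.75, 0.7, 0.15]
print(ml_detect(noisy) == bits)  # True
```

The detector never looks for an individual peak; it judges the whole run of samples against what each possible bit sequence should have produced.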
Horizontal vs. Perpendicular magnetic domains.
Older drives use horizontal domain.
Think of a bar magnet lying flat on the disk.
* Older drives used a weaker magnetic material, easier to set but
creates larger domains.
Newer drives use perpendicular domain.
Think of bar magnets stacked vertically.
Uses a strong (highly coercive) magnetic material on the surface
and a magnetically softer material underneath.
Second layer acts as a receiver for the read/write head.
But disks are thicker than older designs.
Used with EPRML to get good reading. Disk has two magnetic layers on
each side of the platter.
*******************
Magnetic bits will fade over time if not refreshed ( ~ 5yrs. )
Hard drives are designed to refresh any sectors accessed.
Defragmenting a drive regularly should refresh active drives.
To refresh a stored or seldom used drive,
(Mount the drive if not connected to a system)
Windows : use scan-disk tools on a regular basis.
Copy files to nul
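The "copy files to nul" approach can be sketched as a small script that reads every file under a mount point and discards the data. This is a minimal sketch that takes the notes' premise at face value (that accessing a sector is enough to refresh it); the function name and chunk size are illustrative. Note that the operating system may serve recently used files from its own cache without touching the disk, so this works best on a freshly mounted drive.

```python
import os


def read_all_files(root, chunk_size=1024 * 1024):
    """Read every file under root and throw the data away.

    Returns (files_read, bytes_read). Unreadable files are skipped.
    """
    files_read = 0
    bytes_read = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
                files_read += 1
            except OSError:
                continue  # permission problems, vanished files, etc.
    return files_read, bytes_read
```

Usage would look like read_all_files("D:\\") on Windows or read_all_files("/mnt/archive") on Linux, where the path is wherever the drive happens to be mounted.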