Lectures

Data transfer and storage problem.

Problem:
  Data in primary storage exists as individual bits or switches.
    Each bit is discrete with a unique address and access.

  Data in secondary storage or being transmitted
    tends to have much less distinct boundaries.

  Transmission:  
    * Think of water faucet and hose. 

    Electrical:    different voltage/current levels.

    Optical:       light pulse 

    Electro-magnetic: specific frequency or phase shift over radio waves.
         
  Storage: 
    Magnetic:     magnetized and demagnetized zones. 
 
    Optical:      reflective and non-reflective. 

    Electronic:   large banks of Flash ROM.

Data vs. Carrier
   Analog and Digital  (4 combinations).

   Data 
     Analog :  movies, audio, wind speed, etc. 
     Digital : programs, text files, etc.

   Carrier
     Analog : Radio/TV, Wi-Fi, tape recorders, hard drives*.
     Digital : bus, fiber optics, CD, DVD. 

Technique :
  Ones at one level and zeros at another.
    Level (0) in between considered no signal.
  NRZ - non-return to zero
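
  A minimal sketch of NRZ as described above. The +1/-1 levels and the
  function names are assumptions made for illustration only.

```python
# Hypothetical levels: +1 for a one, -1 for a zero, 0 means no signal.
ONE_LEVEL, ZERO_LEVEL, NO_SIGNAL = 1, -1, 0

def nrz_encode(bits):
    """Map each bit to its signal level, one level per bit time."""
    return [ONE_LEVEL if b else ZERO_LEVEL for b in bits]

def nrz_decode(levels):
    """Recover bits from levels; NO_SIGNAL samples are skipped."""
    return [1 if v == ONE_LEVEL else 0 for v in levels if v != NO_SIGNAL]

signal = nrz_encode([1, 0, 1, 1, 0])
print(signal)
print(nrz_decode(signal))
```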

Problem :
  Really long strings of all 1s or 0s lose distinction.

  Level based signals can be inverted (zeros confused with ones)

  Attenuation - (entropy) the tendency of a signal to fade and average out
    as it travels through a medium.

  Accumulated charge - if signal is electric and there is a long string 
    of a particular bit value, unbalanced electrical charges accumulate 
    at one end of the communication. This is sometimes called the DC
    component.
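
  The run-length and DC problems can be made concrete with a short sketch.
  The helper names and the +1/-1 NRZ levels are assumptions; a nonzero
  average level stands in for the accumulated charge.

```python
from itertools import groupby

def longest_run(bits):
    """Length of the longest run of identical bits."""
    return max((len(list(g)) for _, g in groupby(bits)), default=0)

def dc_component(levels):
    """Average signal level; nonzero means a net charge builds up."""
    return sum(levels) / len(levels)

bits = [1] * 12 + [0, 1, 0, 1]
levels = [1 if b else -1 for b in bits]   # NRZ with assumed +1/-1 levels
print(longest_run(bits))      # receiver sees 12 bit times with no change
print(dc_component(levels))   # average level is well above zero
```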

Possible Solutions:
  Assume sender and receiver synchronized.
    Use very precise clocks and equipment.

  Provide second track or signal giving a timing measure. 
    Computer bus with clock line.

    ZIP drives use an optical timing track. 

  Use occasional re-sync signal (reserved bit sequence)
    Used by modems (NON-DSL)
      Transfer 7 or 8 bits of data at a time enclosed in start and stop bits.
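
      A sketch of start/stop framing. The conventions used here (start
      bit 0, stop bit 1, least significant bit first) are the usual ones
      for asynchronous serial links but are assumptions as far as these
      notes go; the function names are hypothetical.

```python
def frame_byte(value, data_bits=8):
    """Wrap one character in a start bit (0) and a stop bit (1), LSB first."""
    data = [(value >> i) & 1 for i in range(data_bits)]
    return [0] + data + [1]

def unframe(frame, data_bits=8):
    """Check the start/stop bits and rebuild the character."""
    if len(frame) != data_bits + 2 or frame[0] != 0 or frame[-1] != 1:
        raise ValueError("framing error: receiver is out of sync")
    return sum(bit << i for i, bit in enumerate(frame[1:-1]))

frame = frame_byte(ord("A"))
print(frame)
print(chr(unframe(frame)))
```

      Each 8-bit character costs 10 bit times on the line, so the
      re-sync signal uses 20% of the bandwidth.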
   
  Attenuation can be handled by re-sampling the data before a threshold is
    reached.  This is why DSL modem has to be within 2 miles of the switching
    station.

  Accumulated charges balanced by requiring return ground line.

  These solutions require more resources or expense.


Technique :
  Use change in signal to represent 1.

  NRZI - non-return to zero, inverted
   
  Transition in either direction represents a 1. 
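
  A minimal NRZI sketch (levels 0/1 and function names are assumptions).
  It also shows that an inverted line still decodes correctly, because
  only the changes carry information.

```python
def nrzi_encode(bits, start_level=0):
    """Toggle the line level for each 1; hold it for each 0."""
    level, out = start_level, []
    for b in bits:
        if b:
            level ^= 1
        out.append(level)
    return out

def nrzi_decode(levels, start_level=0):
    """A change from the previous level is a 1; no change is a 0."""
    prev, out = start_level, []
    for v in levels:
        out.append(1 if v != prev else 0)
        prev = v
    return out

bits = [1, 0, 1, 1, 0, 0, 1]
line = nrzi_encode(bits)
inverted = [v ^ 1 for v in line]          # every level swapped
print(line)
print(nrzi_decode(inverted, start_level=1) == bits)
```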

Problems :
  Long strings of zeros still problematic. 

Possible Solution:
 
  Limit the run of zero bits to keep in error range.

  NRZI, combined with other techniques is commonly used.
    Hard drives with RLL (run length limited).
    * RLL limits the number of sequential zeros or sequential ones.
     See below.

    USB with bit stuffing.

    100Base-FX - 100 Mbit/s Ethernet over optical fiber with 4B5B RLL.
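
    A sketch of bit stuffing. USB's NRZI convention is the reverse of the
    one above (a 0 causes the transition), so USB limits runs of 1s: after
    six consecutive 1s the sender inserts a 0. The function names are
    hypothetical.

```python
def stuff(bits, max_run=6):
    """Insert a 0 after every max_run consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == max_run:
            out.append(0)     # stuffed bit forces a transition
            run = 0
    return out

def unstuff(bits, max_run=6):
    """Drop the bit that follows every max_run consecutive 1s."""
    out, run, i = [], 0, 0
    while i < len(bits):
        b = bits[i]
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == max_run:
            i += 1            # skip the stuffed 0
            run = 0
        i += 1
    return out

data = [1] * 8 + [0]
sent = stuff(data)
print(sent)
print(unstuff(sent) == data)
```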

     
Technique :
  Embed the data in a timing signal (or timing in data).

  FM - Frequency modulation
    Two frequencies (1 double the other) to represent 1 and 0.

    Transition at beginning of each bit cell provides timing. 

    Additional transition in middle of bit cell for 1

    Requires a 2x frequency.

    In practice, using signals that are whole multiples of each other can 
      induce harmonic distortion, so more commonly the high frequency may be
      2.3x or some other odd partial multiplier.
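
    A sketch of the FM bit-cell rule: a transition at the start of every
    cell, plus a second one mid-cell for a 1. Each bit is represented by
    two half-cell levels; the representation and names are assumptions.

```python
def fm_encode(bits, level=0):
    """Return two half-cell levels per bit."""
    out = []
    for b in bits:
        level ^= 1            # clock transition at the start of every cell
        out.append(level)
        if b:
            level ^= 1        # extra mid-cell transition marks a 1
        out.append(level)
    return out

def fm_decode(halves):
    """A bit is 1 when the two halves of its cell differ."""
    return [1 if halves[i] != halves[i + 1] else 0
            for i in range(0, len(halves), 2)]

cells = fm_encode([1, 0, 0, 1])
print(cells)
print(fm_decode(cells))
```

    A run of 1s toggles every half cell and a run of 0s toggles every
    full cell, which is where the two frequencies come from.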
   
    Data transmission over phone lines, modems, wireless broadcast. 
      Even DSL over copper phone lines uses an analog carrier.
        DSL carrier signals > 25 kHz.
        Voice carrier signals 300 Hz - 3.4 kHz.  
        * Human hearing 12 Hz - 20 kHz. Ideal hearing.

    FM was used in earliest modems and allowed 300 bits/s. A set of 
      frequencies were picked that had least distortion and were such that
      sending and receiving data didn't interfere with each other.
      
      Because most phone circuits were designed to handle a maximum of 3 kHz,
        this also limited the baud rate.

   From the movie "Almost Famous"
   Ben Fong-Torres: "It's called a Mo-Jo, it's a very high-tech machine 
     that transmits pages over the telephone! It only takes eighteen minutes 
     a page!"

   * Modified Frequency Modulation - MFM - form of FM combined with simple
     RLL encoding.  Produced higher bit density than FM on early magnetic
     storage, now obsolete. 5 1/4 and 3 1/2 floppies and earliest hard drives.  

  PM - Phase Modulation (phase shift keying, PSK)
    This example actually can send 2 bits at a time.

    Standard frequency used as a base.

    Shift in the phase represents a change from one bit type to the other.

    Requires a frequency at 2x the data rate for simple phase shift.

    OK to good for transmission but not for storage.

    Modems with simple phase shift key modulation delivered 1200 bits/s  
      or 2400 bits/s (2 bits per baud symbol).

    Often combined with frequency modulation to send more bits at same time.

    A variation of PSK called QAM transmitted 4 bits per baud symbol and
      increased throughput to 4800 or 9600 bits/s while using the same 
      carrier frequencies.

    Phase modulation combined with other encoding is used by wireless 
      networking, cable modems, digital cable and satellite TV.
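
    A sketch of sending 2 bits per symbol with phase shifts, and of the
    bits/s arithmetic. The phase table (Gray-coded, in degrees) and the
    baud values are illustrative assumptions, not figures from a
    particular modem standard.

```python
# Assumed phase table: each 2-bit pair selects one of four carrier phases.
PHASES = {(0, 0): 45, (0, 1): 135, (1, 1): 225, (1, 0): 315}

def psk_symbols(bits):
    """Group bits in pairs and map each pair to a phase in degrees."""
    return [PHASES[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

def bit_rate(baud, bits_per_symbol):
    """Throughput is symbols per second times bits carried per symbol."""
    return baud * bits_per_symbol

print(psk_symbols([0, 0, 1, 1, 1, 0, 0, 1]))
print(bit_rate(600, 2))     # 2 bits per symbol
print(bit_rate(2400, 4))    # 4 bits per symbol, as with QAM
```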

  Manchester encoding
    
    Change in middle of bit 
      if zero, one direction, 
      if one, then the opposite.

    Improper wiring can invert bits.

    Change guarantees timing. Frequency 2x data transmission.

    Manchester encoding is used in wired 10Base-T Ethernet (802.3).

    No DC component - average transfer of current is zero.
  
    (Like NRZ but based on change rather than level)
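
    A Manchester sketch using two half-cell levels per bit. The direction
    chosen here (0 = high then low, 1 = low then high) is an assumption;
    the opposite convention also exists.

```python
def manchester_encode(bits):
    """Each bit becomes two half-cell levels with a mid-cell change."""
    out = []
    for b in bits:
        out += [-1, +1] if b else [+1, -1]
    return out

def manchester_decode(halves):
    """Rising mid-cell change is a 1, falling is a 0."""
    return [1 if halves[i] < halves[i + 1] else 0
            for i in range(0, len(halves), 2)]

bits = [1, 1, 1, 1, 0, 0]
line = manchester_encode(bits)
print(sum(line))                                  # no DC component
print(manchester_decode([-v for v in line]))      # swapped wires flip bits
```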

  Differential Manchester encoding
  
    Change at the start of the bit cell if the bit is 0.
 
    Change in middle of bit every time. (clock) 
    
    This is similar in behavior to NRZI combined with a clock signal. 

    Zero average DC voltage - because switching guaranteed at least every
      2 clock cycles, average voltage zero - more efficient, less susceptible
      to noise.

    Bits still correct with improper wiring.
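
    A sketch of the two rules above (names and +1/-1 levels are
    assumptions). Decoding compares the start of each cell with the end
    of the previous one, so an inverted line gives the same bits.

```python
def diff_manchester_encode(bits, level=1):
    """Two half-cell levels per bit; level is the idle line level."""
    out = []
    for b in bits:
        if b == 0:
            level = -level    # change at the cell boundary signals a 0
        out.append(level)
        level = -level        # mid-cell change, always present (clock)
        out.append(level)
    return out

def diff_manchester_decode(halves, level=1):
    """A change at the cell boundary is a 0; no change is a 1."""
    bits, prev = [], level
    for i in range(0, len(halves), 2):
        bits.append(0 if halves[i] != prev else 1)
        prev = halves[i + 1]
    return bits

bits = [0, 1, 1, 0, 0, 1]
line = diff_manchester_encode(bits)
swapped = [-v for v in line]                      # improper wiring
print(diff_manchester_decode(swapped, level=-1) == bits)
print(sum(line))                                  # zero average level
```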

    Used on IBM Token ring.
    Some types of magnetic and optical storage.
    A variation used on magnetic stripe cards (credit cards, etc.)
    Used on the 1st floppy drives.

Data transfer with encoding
  RLL - Run Length Limited

  Encodes each block of data bits as a larger block of bits that is easier
    to work with.

  Data is encoded so that the number or "run" of consecutive zeros is limited.

  Also limits the number of consecutive ones to limit high frequency attenuation.

  Cost - Increases the number of bits needed.

   Generic example 
     
   Data   3-bit codes   Code used
          000 
          001 
   00     010           010
   01     011           011
          100 
   10     101           101
   11     110           110
          111 
   
   When combined as a stream of bits,
     At most 2 zeros or 4 ones next to each other.
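
   The run limits of the generic example can be checked by brute force.
   The table is the one above; the helper names are assumptions.

```python
from itertools import groupby, product

RLL_TABLE = {"00": "010", "01": "011", "10": "101", "11": "110"}

def rll_encode(data):
    """Replace each 2-bit data group with its 3-bit code."""
    return "".join(RLL_TABLE[data[i:i + 2]] for i in range(0, len(data), 2))

def max_run(stream, symbol):
    """Longest run of the given symbol in the stream."""
    return max((len(list(g)) for s, g in groupby(stream) if s == symbol),
               default=0)

# Try every sequence of three 2-bit groups to find the worst-case runs.
streams = [rll_encode("".join(p)) for p in product(RLL_TABLE, repeat=3)]
worst_zeros = max(max_run(s, "0") for s in streams)
worst_ones = max(max_run(s, "1") for s in streams)
print(worst_zeros, worst_ones)
```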

   Implementations of real RLL encoding are more complex.

   Once data is encoded as RLL, NRZI can often be used effectively.

   RLL very effective when using magnetic storage.

   MFM - Data is encoded so that there is a zero after any one and no more than
     3 zeros in a row.  Sometimes called (1,3)RLL

     Used on floppies and early hard drives.  
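
     A sketch of MFM as a (1,3) code: each data bit becomes a clock bit
     plus the data bit, and the clock bit is 1 only between two 0 data
     bits. Function names are hypothetical.

```python
from itertools import product

def mfm_encode(data, prev=0):
    """Two encoded bits per data bit; prev is the preceding data bit."""
    out = []
    for b in data:
        clock = 1 if (prev == 0 and b == 0) else 0
        out += [clock, b]
        prev = b
    return out

def mfm_decode(cells):
    """Data bits sit in the second half of every cell."""
    return cells[1::2]

def zero_runs_between_ones(cells):
    """Lengths of the zero runs that lie between two 1s."""
    s = "".join(map(str, cells)).strip("0")
    return [len(r) for r in s.split("1")[1:-1]]

print(mfm_encode([1, 0, 0, 1]))

# Check the (1,3) limits over every 8-bit data word.
runs = [r for word in product([0, 1], repeat=8)
        for r in zero_runs_between_ones(mfm_encode(list(word)))]
print(min(runs), max(runs))
```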
 
   RLL(2,7) - encodes n bits of data on 2n bits. Most common on current storage.
     But done in data bit groupings of 2, 3, or 4.

     Guarantees at least 2 consecutive 0s between 1 bits.

     Guarantees at most 7 consecutive 0s.
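
     A sketch of a (2,7) encoder. The code table below is the widely
     published one and is an assumption here, since these notes do not
     list it; the brute-force check only confirms that this table keeps
     between 2 and 7 zeros between 1 bits and doubles the bit count.

```python
from itertools import product

# Assumed table: data words of 2, 3, or 4 bits map to 4, 6, or 8 bits.
RLL27 = {
    "10": "0100", "11": "1000",
    "000": "000100", "010": "100100", "011": "001000",
    "0010": "00100100", "0011": "00001000",
}

def rll27_encode(data):
    """Greedy match; works because no data word is a prefix of another."""
    out, i = [], 0
    while i < len(data):
        for n in (2, 3, 4):
            word = data[i:i + n]
            if word in RLL27:
                out.append(RLL27[word])
                i += n
                break
        else:
            raise ValueError("data ends in the middle of a code word")
    return "".join(out)

def zero_runs_between_ones(stream):
    """Lengths of the zero runs that lie between two 1s."""
    return [len(r) for r in stream.strip("0").split("1")[1:-1]]

print(rll27_encode("1011"))

# Check every sequence of four data words.
inputs = ["".join(p) for p in product(RLL27, repeat=4)]
runs = [r for d in inputs for r in zero_runs_between_ones(rll27_encode(d))]
print(min(runs), max(runs))
```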

     RLL vs. MFM.
       The number of actual transitions in the RLL signal allows for a
       1/3 smaller footprint when storing bit sequences.
   
       Early hard drives 20 MB (MFM), 30 MB (RLL) - same physical drive.

       Standard coding for hard drives for a long time.


   8b/10b - used with NRZ on wired Gigabit Ethernet (1000Base-X).