The purpose of a computer is to manipulate data.
Information can be classified as labels or numbers.
In the real world, labels are formed from arbitrary characters or symbols and represent things, while numbers are formed from a limited set of characters and represent quantities.
Character sets
In a computer system, characters are created by assigning the symbol for the character to an arbitrary numeric value. The value assigned has no inherent meaning to the computer.
Although the symbol-value pair is arbitrary, a standard must be followed in order to allow systems to share data with the same interpretation and to allow users to interpret that data consistently.
Currently there are three common standards.
EBCDIC (Extended Binary Coded Decimal Interchange Code) - used by IBM as a standard for its mainframe computers. It represents English characters, numbers, and various control characters for printing and uses an 8 bit unit.
ASCII (American Standard Code for Information Interchange) - an alternative originally for small computer systems. It uses seven bits to represent 128 possible English characters, numbers, and various control characters for printing. The eighth bit was originally used for error checking (parity). On current systems, the eighth bit is used to provide another 128 possible characters, often used to provide alternative symbols for printing.
Both EBCDIC and ASCII are based on 8 bit units and are used to represent English characters. However, computers have become international.
A newer character set, called Unicode, uses 16 bit values and can represent the character sets of many languages. Although Unicode is supposed to be a single standard, 8 bit and 32 bit encodings also exist alongside the 16 bit form (UTF-8, UTF-32, and UTF-16) and can cause incompatibility issues.
Character data has no inherent meaning to a computer system, and any human-assigned meaning must be enforced by program design. In general, the only manipulation a computer system performs on character data is to move it or compare it. If the standard is designed so that the collating sequence of the represented characters follows ascending numeric values, it is possible to use the mathematical facilities of a computer system to perform the comparisons.
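As a sketch of this idea in Python (any language with character codes behaves the same way): because ASCII and Unicode assign letters to ascending code points, character comparison reduces to numeric comparison.

```python
# Characters are stored as numeric code points; comparing characters
# is just comparing those numbers (Python's ord() exposes the code point).
print(ord('A'), ord('B'))    # 65 66 - the ASCII/Unicode values
print('A' < 'B')             # True, because 65 < 66

# String comparison extends this character by character,
# giving the familiar dictionary (collating) order.
print('apple' < 'banana')    # True
```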
The other type of information manipulated by a computer system is numeric data. Numeric data is classified as integer or floating point. Because of the way data is stored and manipulated, each is handled differently.
Numbers
The number system currently used in most of the world is the decimal system.
The decimal system uses a base of 10 and zero as a place holder. This probably evolved from the use of all of our fingers to count. 753 represents 3 ones, 5 tens (5 groups of 10 ones), and 7 hundreds (7 groups of 10 tens).
Alternatives have existed. The Aztec used a base of 20, called vigesimal.
Computers, however, use bits, 0 and 1, to count. One form of numbering, called binary or base 2, uses the computer's native representation directly.
So 100101 binary represents 1 one, 0 twos, 1 four, 0 eights, 0 sixteens, and 1 thirty-two, or 37 decimal.
But base 2 numbers present a problem: large values require very long sequences of zeros and ones. Ideally, an alternative base that synchronizes with the base 2 system could simplify the representation of values.
Two of these systems are octal and hexadecimal.
Octal uses a base of 8 or values of 0-7. The advantage of this number system is that three binary digits can be represented by one octal digit.
So 100101 can be represented by 45 octal.
Another useful base is base 16, or hexadecimal. However, base 16 presents a problem: we do not have 16 number symbols. The solution is to borrow alpha characters to fill out the list: 0-9, A-F.
If we use hexadecimal, we can represent 4 binary digits with one hexadecimal digit. However, to represent 100101 we need to pad the sequence of digits: 0010 0101 can be represented by 25h.
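These conversions can be checked with Python's built-in base formatting, as a quick sketch:

```python
n = 0b100101             # binary literal for the example value
print(n)                 # 37 in decimal
print(format(n, 'o'))    # '45' in octal  (groups of 3 bits: 100 101)
print(format(n, 'x'))    # '25' in hex    (padded groups of 4: 0010 0101)
print(format(n, '08b'))  # '00100101', the padded 8 bit form
```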
Note that both of these systems behave just like decimal numbers when using math.
   7    0111
+  3    0011
----------------
  10    1010
It is possible to use binary digit sequences to represent decimal digits. However, binary digits do not divide evenly into units that match decimal digits.
0000 0
1000 8
1001 9
1010 10
1011 11
Notice that the relationship between the number of binary digits and the number of decimal digits does not match naturally.
An alternative is to force a break in the binary sequence every four bits, one group per decimal digit.
0000 1001   9
0001 0000   10
0001 0001   11
This is known as binary coded decimal. To use it mathematically requires special code.
7 0000 0111
3 0000 0011
________________
10 0001 0000
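The packing step can be sketched in Python as follows (the function name is illustrative, not part of any standard library):

```python
def to_bcd(n):
    """Pack a non-negative integer into binary coded decimal:
    each decimal digit occupies its own 4 bit group."""
    bcd, shift = 0, 0
    while True:
        bcd |= (n % 10) << shift   # low decimal digit -> next 4 bit group
        n //= 10
        shift += 4
        if n == 0:
            return bcd

print(format(to_bcd(9), '08b'))    # 00001001
print(format(to_bcd(10), '08b'))   # 00010000
print(format(to_bcd(11), '08b'))   # 00010001
```

Note that the 4 bit groups 1010 through 1111 are never produced, which is exactly the wasted range that makes BCD arithmetic require special code.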
In the real world, data can be of any size. However, to provide for automation of data manipulation, it is common for a system to work with data in predefined size units. The smallest unit for representing data is a bit (0 or 1) and can represent two values. By grouping bits together, a larger range of values can be represented.
Standard sizes have varied by manufacturer over the evolution of computer systems. Currently 8 bits (1 byte) or multiples of 8 are common. An 8 bit unit can represent 256 possible values.
Integer data can be represented as an exact value both in the real world and in a computer system provided the user is willing to use sufficient resources.
Binary shortcuts.
2^0 = 1
2^1 = 2
2^8 = 256
2^10 = 1024
2^16 = 65536
2^20 = 2^10 *2^10 = 1024 (1K) * 1024 = 1 Meg
2^30 = 1 Gig = 1Meg * 1K
2^32 = 2^2 * 1Gig = 4 * 1 Gig
4 Gig / 64 K = 2^2 * 2^30 / 2^6 * 2^10 = 2^(32-16) = 2^16 = 64K
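The shortcut arithmetic above can be verified with bit shifts, since shifting left by n multiplies by 2^n (a sketch):

```python
K = 1 << 10           # 2^10 = 1024
MEG = K * K           # 2^20
GIG = MEG * K         # 2^30

print(1 << 16)                  # 65536
print(4 * GIG == 1 << 32)       # True: 2^2 * 2^30 = 2^32
print((4 * GIG) // (64 * K))    # 2^(32-16) = 2^16 = 65536 = 64K
```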
Converting decimal to binary. Repeatedly divide by 2 and record each remainder; the remainders read from bottom to top form the binary digits.
273
136 r 1
068 r 0
034 r 0
017 r 0
008 r 1
004 r 0
002 r 0
001 r 0
000 r 1
100010001
1 + 16 + 256 = 273
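The repeated-division procedure above can be sketched in Python (the helper name is illustrative):

```python
def to_binary(n):
    """Convert a positive integer to its binary digit string by
    repeatedly dividing by 2 and collecting the remainders."""
    bits = ''
    while n > 0:
        bits = str(n % 2) + bits   # remainders read bottom to top
        n //= 2
    return bits

print(to_binary(273))   # '100010001'
print(to_binary(912))   # '1110010000'
```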
Negative values
Another common feature of numbers is negative values. The question is how to represent them.
We can use one of the bits of a number as a flag. Starting with an 8 bit memory unit, a simple technique would be to use the highest bit as a sign flag. So, 0000 0001 = 1 and 1000 0001 = -1. However, adding these two values together requires the system to test the high bit to determine the correct handling of the add. Notice that an unsigned 8 bit unit can represent 256 values, 0-255. If we use the high bit for a sign, that leaves 7 bits for a value of +/- 0-127, with both a positive and a negative zero.
Another way to create a negative value is to subtract the value from all ones, so -1 in binary is 1111 1111 - 0000 0001 = 1111 1110, and -127 is 1111 1111 - 0111 1111 = 1000 0000. This is called 1's complement. The number is simple to generate (it amounts to inverting every bit), and it uses the high bit as a sign flag. But we still have the problem of both a positive and a negative zero.
Additionally, using these values mathematically can be tricky.
1 0000 0001
+ -1 1111 1110
-----------------
0 1111 1111 -0
   2    0000 0010
+ -1    1111 1110
----------------
   1  1 0000 0000   carry out; the 8 bit result reads 0
Note that for both of these, adding 1 (in the second case, the end-around carry) corrects the result.
A third method, called 2's complement requires a little more work to generate the number but then simplifies the mathematical implementations. For 2's complement, the 1 is added when the number is created.
The simplest way to generate the negative version of a number is to write the binary representation, invert the bits, and add 1.
1 0000 0001
inv 1111 1110
add 1
--------------
1111 1111
Now note when the math is applied
1 0000 0001
+ -1 1111 1111
--------------
0000 0000
2 0000 0010
+ -1 1111 1111
--------------
0000 0001
Additionally, there is no +/- zero, so a complete 256 value range is available. However, it is slightly skewed: 128 negative values, a zero, and 127 positive values.
If a 2 byte value is used, only the range of representable values changes: 32768 negative values, a zero, and 32767 positive values.
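The invert-and-add-1 rule and the resulting simple arithmetic can be sketched in Python (the function name is illustrative; the masking keeps results to 8 bits):

```python
def twos_complement(value, bits=8):
    """Negate a value in 2's complement: invert the bits and add 1,
    keeping only `bits` bits of the result."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask

neg1 = twos_complement(1)
print(format(neg1, '08b'))                # 11111111, i.e. -1
print(format((1 + neg1) & 0xFF, '08b'))   # 1 + -1 = 00000000
print(format((2 + neg1) & 0xFF, '08b'))   # 2 + -1 = 00000001
```

Ordinary unsigned addition now produces the correct signed results, which is why hardware adders need no special negative-number handling.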
Integer values can be represented as exact values by computers. The only restriction occurs because of limits based on the size of the number.
Ranges
8 bit = 256
16 bit = 65536
24 bit = 16777216
32 bit = 4294967296
64 bit = 18446744073709551616, or about 1.8447*10^19
Decimal values
On the other hand, even in the real world, some numbers are not representable as an exact decimal value, e.g. Pi or 1/3 (a ratio).
However, it is common that an exact value for a real number is not needed, only a representation to a reasonably precise value: 1/3 is approximately .33333.
Decimals. Decimals can be represented with binary values. The trick is to work to the right.
So a 2 byte number with the decimal point between the 2 bytes gives
00000000.00000000, and the bits starting from the decimal point and going right are 2^-1, 2^-2, 2^-3, etc., with the values:
2^-1 = 1/2 = .5
2^-2 = 1/4 = .25
2^-3 = 1/8 = .125
2^-4 = 1/16 = .0625
2^-5 = .03125
2^-6 = .015625
To calculate the binary fraction from a decimal fraction, repeatedly multiply by two, then record and discard the digit shifted to the integer side.
0.173
0.346   0
0.692   0
1.384   1
0.768   0
1.536   1
1.072   1
0.144   0
0.288   0
0.576   0
1.152   1
0.304   0   = .0010 1100 010...b
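The multiply-by-two procedure above can be sketched in Python; the bits argument decides when to stop, since a value like .173 never terminates (the helper name is illustrative):

```python
def frac_to_binary(frac, bits):
    """Convert a fraction in [0, 1) to `bits` binary digits by
    repeatedly doubling and peeling off the integer part."""
    digits = ''
    for _ in range(bits):
        frac *= 2
        digit = int(frac)     # 1 if the double crossed 1.0, else 0
        digits += str(digit)
        frac -= digit         # record and discard the integer part
    return digits

print(frac_to_binary(0.25, 2))      # '01'
print(frac_to_binary(0.173, 11))    # '00101100010'
```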
Notice that this number does not appear to have an exact representation, so the first problem is to decide the precision. However, another problem becomes obvious: a memory bit records 0 or 1; it does not record a decimal point. So even if we decide to store the decimal or fractional portion of a number, we have to determine ahead of time which bits are to hold the fraction and which hold the integer value.
Some systems support this; it is referred to as fixed point numbers. All memory locations are defined as having a specific number of bits for the integer portion and a specific number for the fractional part. The software or hardware is then designed to honor this division.
In a system such as finance, it is possible to describe all numbers as values to three decimal places of precision. However, in many other systems it is common to have wide ranges of number values.
How many atoms of hydrogen are in the Sun if a hydrogen atom weighs about 1*10^-23 grams and the Sun weighs about 10^46 tons? (The values are illustrative; the point is the enormous range of magnitudes.)
IEEE 754 Floating point representation.
To convert a real number to a floating point representation based on the IEEE 754 standard, we will work two examples: -7.25 and 912.730.
Convert the base 10 real number to its binary representation.
Start by converting the integer portion of the number 1st because it will have the most significant digits.
7
3 1
1 1
0 1 = 111b
Next convert the fractional portion of the number. To determine the number of significant digits to carry out, take the number of significant digits that will be stored by the floating point representation and add 2. Remember that we do not need to store the most significant digit, and we also want to round the number we come up with.
Assume we have a storage of 15 bits of significance and an 8 bit exponent.
1st example, we need 3 bits for the integer, so 15 + 2 - 3 = 14 digits for fraction.
.25
0.50
1.00
0.00 = .01b Since we have no more significant digits, we pad with 0s
Representation
111.010 = 1.1101 0000 0000 000 * 2^2
Significand 1101 0000 0000 000
Bias 2^(8-1)-1 = 127
Biased exponent 127+2 = 129 = 1000 0001b
Floating point representation 1 1000 0001 1101 0000 0000 000
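The worked example uses a teaching format with a 15 bit significand; the standard 32 bit IEEE 754 format works the same way but stores 23 significand bits. As a sketch, Python's struct module can expose the actual bit pattern of -7.25:

```python
import struct

# Reinterpret the big-endian 32 bit IEEE 754 bytes of -7.25 as an integer.
raw, = struct.unpack('>I', struct.pack('>f', -7.25))
bits = format(raw, '032b')

print(bits[0])     # sign: 1 (negative)
print(bits[1:9])   # biased exponent: 10000001 = 129 = 127 + 2
print(bits[9:])    # significand: 1101 followed by zeros
```

The sign, biased exponent, and leading significand bits match the hand conversion; only the significand width differs.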
2nd Example - Integer conversion 912.730
912
456 0
228 0
114 0
057 0
028 1
014 0
007 0
003 1
001 1
000 1 = 11 1001 0000b = 256*3 + 16*9 + 0 = 768 + 144 = 912
2nd example, we need 10 bits for the integer, so 15 + 2 - 10 = 7 digits for fraction.
.730
1.460
0.920
1.840
1.680
1.360
0.720
1.440 = .1011 101b = .5 + .125 + .0625 + .03125 + .0078125
.5 +
.125 +
.0625 +
.03125 +
.0078125
-------------
.7265625 Note this is not equal to .730
Representation
1110 0100 00.1011 101b = 1.1100 1000 0101 1101b * 2^9
Significand 1100 1000 0101 111 - Most significant digit dropped and number rounded.
Bias 2^(8-1)-1 = 127
Biased exponent 127+9 = 136 = 1000 1000b
Floating point representation 0 1000 1000 1100 1000 0101 111
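Both worked examples can be checked with a small converter that mirrors the teaching format used here (sign bit, 8 bit exponent with bias 127, 15 bit significand with the leading 1 dropped and rounding applied). This is a sketch of that format only, not the standard 32 or 64 bit layouts, and the function name and return form are illustrative:

```python
def to_float_bits(x, sig_bits=15, exp_bits=8):
    """Return (sign, biased exponent, significand string) for x.
    Zero and special values are not handled in this sketch."""
    sign = 1 if x < 0 else 0
    x = abs(x)
    exp = 0
    while x >= 2:              # normalize to 1.xxxx * 2^exp
        x /= 2
        exp += 1
    while x < 1:
        x *= 2
        exp -= 1
    frac = round((x - 1) * (1 << sig_bits))   # drop leading 1, round
    if frac == 1 << sig_bits:                 # rounding overflowed
        frac, exp = 0, exp + 1
    bias = (1 << (exp_bits - 1)) - 1          # 127 for an 8 bit exponent
    return sign, exp + bias, format(frac, '0%db' % sig_bits)

print(to_float_bits(-7.25))     # (1, 129, '110100000000000')
print(to_float_bits(912.730))   # (0, 136, '110010000101111')
```

The outputs reproduce the two hand conversions, including the rounding of the second significand.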