Representation of Binary Numbers
Binary numbers are represented in digital systems in one of the following two formats:
· Fixed-point representation
· Floating-point representation
The Fixed-Point Representation
Consider the decimal number
X = 10587.349
This type of representation of a number as a string of digits with the decimal point between two smaller strings (or groups) of digits is called the fixed-point representation. The group of digits to the left of the decimal point is called the integer part, and the group to the right of the decimal point is called the fractional (decimal) part.
We can represent binary numbers also in fixed-point representation using a similar format. For example, the binary number
Y = 101001.01011
is written in the fixed-point format. Here, the point symbol is called the binary point (analogous to the decimal point). As in the case of the decimal-number representation, to the left of the binary point we have the integer part, and to its right, the fractional part.
The idea of using a binary point to mark off the fractional part exists only on paper; practical systems do not store the point itself. Instead, we designate some locations in the memory of the computer to hold the integer part, and some other locations to hold the fractional part.
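As an illustration of how such a fixed-point pattern maps to a value, the following Python sketch (the function name is our own) evaluates the integer and fractional parts of a binary string such as Y above:

```python
def fixed_point_value(bits: str) -> float:
    """Evaluate a binary fixed-point string such as '101001.01011'."""
    integer_part, _, fractional_part = bits.partition(".")
    value = int(integer_part, 2) if integer_part else 0
    # Each fractional bit i (counting from 1) carries the weight 2^(-i).
    for i, b in enumerate(fractional_part, start=1):
        value += int(b) * 2 ** -i
    return value

print(fixed_point_value("101001.01011"))  # 41.34375
```

The integer part 101001 evaluates to 41, and the fractional part .01011 to 0.25 + 0.0625 + 0.03125 = 0.34375.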
We may generalize the representation of numbers in the fixed-point arithmetic by expressing a positive number X in the format
X = 0.b_{1}b_{2}b_{3} … b_{m} (1.13)
The most significant bit 0 indicates that the number represented in Eq. (1.13) is a positive number. Similarly, the general expression for representing negative numbers in the fixed-point arithmetic is:
X = 1.b_{1}b_{2}b_{3} … b_{m} (1.14)
The general expression given in Eq. (1.14) may be modified in one of the three formats given below to represent negative numbers in the fixed-point arithmetic.
Representation of Negative Numbers and Fractions
Negative numbers or fractions are represented in binary systems in the following formats:
· Sign-magnitude format
· One’s-complement format
· Two’s-complement format
The Sign-Magnitude Format
In digital computers, we have to represent both positive and negative numbers. For this, we use an additional bit, called the sign bit, in the most significant bit (MSB) position. We use 0 to represent positive and 1 to represent negative numbers in the MSB position. This scheme of representing numbers is known as the sign-magnitude format. For example, consider the decimal number 4. Its binary equivalent is 0100. In the sign-magnitude format, we can express this number in the form
+4 = 00100
-4 = 10100
In the above example, we have represented a whole number in the sign-magnitude format. We may also require the sign-magnitude representation of fractions. Consider the binary fraction
X = -0.11010001
This can be represented in the sign-magnitude format as
X = 1.11010001
where the 1 to the left of the binary point (i.e., the MSB) indicates that the binary fraction following this bit is negative. The general expression with which negative numbers are represented in the sign-magnitude format can now be written as
X = 1.b_{1}b_{2} … b_{m} (1.15)
In the sign-magnitude scheme, multiplication of two numbers is relatively straightforward and does not require any special algorithm; this is the advantage of the scheme. Addition, however, requires a more complex procedure involving operations such as sign checks, complementing, and carry generation.
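The sign-magnitude encoding of whole numbers described above can be sketched in Python as follows (the function name and the 5-bit default width are our own choices, matching the +4/−4 example):

```python
def sign_magnitude(n: int, bits: int = 5) -> str:
    """Encode an integer in sign-magnitude form: MSB is the sign bit."""
    sign = "1" if n < 0 else "0"          # 0 = positive, 1 = negative
    magnitude = format(abs(n), f"0{bits - 1}b")  # magnitude in (bits - 1) bits
    return sign + magnitude

print(sign_magnitude(+4))  # 00100
print(sign_magnitude(-4))  # 10100
```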
The One’s-Complement Format
In the 1’s-complement format, a negative number is represented as the 1’s complement of the corresponding positive number. For example, consider the binary number
X = -0.11010001
By changing 1s to 0s and 0s to 1s, we get its 1’s-complement representation as
Y = 1.00101110
The general expression with which negative numbers are represented in the one’s-complement format can now be written as
X = 1.b̄_{1}b̄_{2} … b̄_{m} (1.16)
where b̄_{i} is the 1’s complement (i.e., the inverse) of the bit b_{i}.
Multiplication of two 1’s-complement numbers is a straightforward procedure, but addition requires special algorithms (an end-around carry must be added whenever a carry is generated out of the sign bit). Even though the 1’s-complement scheme can be used for mathematical processing in digital systems, the 2’s-complement scheme, with its obvious advantage in carry handling, is used in the majority of signal-processing computers that employ fixed-point arithmetic.
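The bit-flipping rule above can be sketched in Python (the function name is our own); it reproduces the 1’s-complement representation of −0.11010001:

```python
def ones_complement(bits: str) -> str:
    """Flip every bit of a binary string, leaving the binary point as-is."""
    return "".join("." if b == "." else ("1" if b == "0" else "0") for b in bits)

print(ones_complement("0.11010001"))  # 1.00101110
```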
The Two’s-Complement Format
We have seen that 2’s-complement of a given number is obtained by adding a 1 to the LSB (least-significant bit) of 1’s complement of that number. For example, consider the binary number
X = 0.1101 0001
Its 1’s complement is
X′ = 1.0010 1110
Now, adding a 1 to the LSB of X′ yields the 2’s-complement representation of -0.1101 0001:
Z = 1.0010 1111
The general expression with which negative numbers are represented in the 2’s-complement format can now be written as
X = 1.b̄_{1}b̄_{2} … b̄_{m} ⊕ 0.00 … 01 (1.17)
where b̄_{i} is the 1’s complement of the bit b_{i}. The symbol ⊕ represents the modulo-2 addition of a 1, at the LSB position, to the 1’s complement of the given number. Modulo-2 addition is used so that any carry generated out of the sign bit can be ignored; it can be verified from practical examples that only when this carry is ignored do mathematical operations employing the 2’s complement produce correct results.
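The two-step rule (flip the bits, then add 1 at the LSB, discarding any carry out of the sign bit) can be sketched as follows; the function name is our own, and the sketch assumes a single bit before the binary point, as in the examples above:

```python
def twos_complement(bits: str) -> str:
    """2's complement of a bit string: flip all bits, then add 1 at the LSB."""
    flipped = "".join("." if b == "." else ("1" if b == "0" else "0") for b in bits)
    digits = flipped.replace(".", "")
    # Add 1 modulo 2**len(digits): a carry out of the sign bit is discarded.
    total = (int(digits, 2) + 1) % (2 ** len(digits))
    out = format(total, f"0{len(digits)}b")
    return out[0] + "." + out[1:]  # restore the binary point after the sign bit

print(twos_complement("0.11010001"))  # 1.00101111
```

This reproduces Z = 1.0010 1111 from the worked example.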
Range of Numbers in the Fixed-Point Arithmetic
Consider the 4-bit representation of binary numbers in the fixed-point arithmetic. Since one bit has to be reserved for the sign, we can represent a maximum of 2^{3} (= 8) non-negative numbers, 0.000 to 0.111. Similarly, we may represent a maximum of 2^{3} − 1 (= 7) negative numbers, 1.001 to 1.111 (in the sign-magnitude format, the pattern 1.000 represents negative zero). Thus the total number of bit patterns in this case is sixteen.
We may now obtain a general expression for the range of numbers in the fixed-point format. Let the number of bits that the given computer can accommodate be m. Of these m bits, one has to be reserved for representing the sign. The remaining (m − 1) bits can represent 2^{m-1} magnitudes, from 0 to 2^{m-1} − 1. Thus, in this scheme, the range of the numbers is
R = -(2^{m-1} - 1) to +(2^{m-1} - 1) (1.18)
For example, consider the case of a 64-bit computer. With fixed-point format, the numbers that can be represented by it lie between
-(2^{63} - 1) and +(2^{63} - 1)
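Eq. (1.18) can be checked directly in Python; the function name is our own:

```python
def fixed_point_range(m: int) -> tuple:
    """Range of integers representable with m bits, one reserved for the sign."""
    largest = 2 ** (m - 1) - 1
    return (-largest, largest)

print(fixed_point_range(4))   # (-7, 7)
print(fixed_point_range(64))  # (-9223372036854775807, 9223372036854775807)
```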
Even though this appears to be a large range of numbers, in reality, it is not so. It can be seen that the floating-point scheme can accommodate much larger numbers than the fixed-point scheme.
In the fixed-point arithmetic, to restrict numbers so that they can be stored and processed in registers of finite length, we require two operations known as truncation and round-off. Truncation is the operation of abruptly terminating a given number at some desired significant-bit (or digit) position. Round-off (or simply, rounding) is the operation of approximating the truncated number by the same or the next higher value. These are the same operations that we use in our day-to-day arithmetic. For example, consider the product
3.256789 × 7 = 22.797523
Now, let us truncate this number to three significant digits after the decimal point. This means that we have to terminate the number abruptly at the third digit after the decimal point. In the truncation operation, we stop at the desired digit and ignore the digits that follow. Following this rule, we find that the truncated number in our example is 22.797.
However, abrupt truncation is not always desirable. In practice, for a better approximation of the given number, we look at the digit that follows the last retained digit. If this digit is greater than or equal to 5, we add a 1 to the last retained digit; if it is less than 5, we simply discard it. This operation is called rounding off. In our example, we truncated the given number at the third digit after the decimal point (i.e., at 22.797). The fourth digit of the number is 5; therefore, as per the round-off rule, we add a 1 to the last retained digit, and the combined operation of truncation and rounding gives 22.798.
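The two operations can be sketched in Python (the function names are our own; the rounding rule is exactly the one stated above, looking only at the first discarded digit):

```python
def truncate(x: float, digits: int) -> float:
    """Drop everything after the given number of decimal digits."""
    factor = 10 ** digits
    return int(x * factor) / factor

def round_off(x: float, digits: int) -> float:
    """Truncate, then add one unit in the last place if the next digit >= 5."""
    factor = 10 ** digits
    scaled = x * factor
    truncated = int(scaled)
    next_digit = int(scaled * 10) - truncated * 10  # first discarded digit
    if next_digit >= 5:
        truncated += 1
    return truncated / factor

product = 3.256789 * 7
print(truncate(product, 3))   # 22.797
print(round_off(product, 3))  # 22.798
```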
The principle of truncation and round-off can be extended to the binary number system also. In the binary system, for rounding off, we add a 1 to the last retained bit if the first discarded bit is a 1; if it is a 0, we leave the last retained bit as it is. Consider, for example, the binary number 1110.1101001, which we want to truncate at the third bit after the binary point. Performing this operation, we get the truncated number as 1110.110.
To round off the number at this point, we find that the next bit (i.e., the fourth bit after the binary point) is a 1. So, we add a 1 to the third bit, and the truncated and rounded-off number becomes 1110.111. However, if the fourth bit were a 0, the third bit would be left unchanged, and the truncated number 1110.110 itself would be the rounded-off number.
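The binary truncation-and-rounding procedure can be sketched as follows (the function name is our own); it reproduces the 1110.1101001 example:

```python
def round_binary(bits: str, places: int) -> str:
    """Truncate a binary fixed-point string after `places` fractional bits,
    then add 1 at the last retained bit if the first discarded bit is 1."""
    integer_part, _, frac = bits.partition(".")
    kept, dropped = frac[:places], frac[places:]
    digits = integer_part + kept
    value = int(digits, 2)
    if dropped and dropped[0] == "1":
        value += 1  # round up: the first discarded bit is a 1
    out = format(value, f"0{len(digits)}b")
    return out[:len(integer_part)] + "." + out[len(integer_part):]

print(round_binary("1110.1101001", 3))  # 1110.111
```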