• The floating-point representation covers a much larger range of numbers than is possible with the fixed-point representation.

• In the floating-point representation, the resolution becomes coarser as the magnitude of the numbers grows; that is, the distance between two successive floating-point numbers increases.

• In the floating-point scheme, resolution is variable within the range. However, for the fixed-point format, resolution is fixed and uniform.

• This variability in resolution provides a large dynamic range of the numbers.
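The growing distance between successive floating-point numbers can be observed directly. The following is a minimal sketch that assumes Python 3.9 or later (for `math.ulp`) and uses Python's 64-bit floats rather than the 16-bit and 32-bit formats discussed in this section; the helper name `spacing_table` is made up for illustration.

```python
import math

def spacing_table(values):
    """Pair each x with the gap between x and the next larger float."""
    return [(x, math.ulp(x)) for x in values]

# Assumption: 64-bit doubles, so the absolute gaps differ from those of a
# 16-bit or 32-bit format, but the trend (coarser for larger x) is the same.
for x, gap in spacing_table([1.0, 1024.0, 1.0e6, 1.0e12]):
    print(f"x = {x:>16.1f}   gap to next float = {gap:.3e}")
```

In a fixed-point format the same table would show one constant gap for every value of x.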



The ideas given above can be illustrated by considering the case of a 16-bit computer.

**Example 35:** Consider a 16-bit computer. Obtain the dynamic range and resolution when the computer is operated (a) in the fixed-point format and (b) in the floating-point format.

**Solution:**

**(a)** For the 16-bit computer, with one bit reserved for representing the sign, the highest positive and negative numbers that can be represented in the fixed-point format are $+(2^{m-1} - 1)$ and $-(2^{m-1} - 1)$, respectively, where $m = 16$. Substituting $m = 16$ yields:

The highest positive number = $2^{16-1} - 1 = 2^{15} - 1 = 32767$

The highest negative number = $-(2^{16-1} - 1) = -(2^{15} - 1) = -32767$

This means that we can represent all the whole numbers from -32,767 to +32,767. Since the numbers represented are of the form -32,767, -32,766, -32,765, …, 32,766, and 32,767 (i.e., successive numbers differ by 1), we find that in this scheme:

Resolution = 1

In this scheme, we can represent only whole numbers; we cannot represent fractions.
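The whole-number case of part (a) can be checked numerically. This is a small sketch under the text's assumption of one sign bit plus $m - 1$ magnitude bits; the function name `sign_magnitude_range` is hypothetical.

```python
def sign_magnitude_range(m):
    """Return (lowest, highest) integer for an m-bit word with one sign bit."""
    largest = 2 ** (m - 1) - 1  # m - 1 magnitude bits, all set to 1
    return -largest, largest

low, high = sign_magnitude_range(16)
print(low, high)  # -32767 32767

# Successive whole numbers differ by exactly 1 everywhere in the range,
# which is the uniform resolution stated in the text.
resolution = (low + 1) - low
print(resolution)  # 1
```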

Now, suppose we want to express fractions also through this scheme. For this, let us reserve 5 bits to represent the fractional part, 10 bits to represent the integer part, and 1 bit to represent the sign of the mantissa. In the fixed-point representation, the given number can now be written as

$X = \pm (2^{10} - 1) \times 2^{-5} = \pm 31.96875$

Thus the range in this case will be between -31.96875 and +31.96875. We have:

Resolution = $2^{-5} = 0.03125$ (that is, 0.00001 in binary)

We find that in this case, the range (sometimes called the *dynamic* range) has been considerably decreased, but resolution has been greatly increased.

**(b) Floating-Point Format**

In the floating-point format, we reserve 5 bits to represent the exponent, 1 bit to represent its sign, 9 bits to represent the mantissa part, and 1 bit to represent its sign. Table 1.38 shows the floating-point representation of the given number.

**Table 1.38** Floating-point representation in a 16-bit computer

| $S_M$ | $M$ (9 bits) | $S_E$ | $E$ (5 bits) |
|---|---|---|---|
| 0 | 0.100000000 | 1 | 11111 |
| 0 | 0.100000000 | 0 | 11111 |

In Table 1.38, $S_M$ represents the sign bit of the mantissa, $M$ represents the mantissa, $S_E$ represents the sign bit of the exponent, and $E$ represents the exponent. We find that row 2 shows the smallest number that can be represented in this format. This is obtained as follows:

We have, for the smallest possible number in this scheme, mantissa = 0.5, which is the minimum possible value of mantissas in the floating-point scheme, as stated earlier. This is then represented under the column $M$ as **.1** followed by eight **0**s (i.e., **.100000000**). The sign of the mantissa is taken as positive and is represented by a **0** under the column $S_M$. Since the exponent has 5 bits to represent it, the highest possible value of the exponent in this case is $2^5 - 1 = 31$ (binary **11111**). Using this as the exponent, we get the smallest number in our present case as

We find that this is achieved by sacrificing uniformity in resolution. Notice that in the floating-point format, small numbers have finer resolution than larger numbers, whose resolution is coarse.
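The two rows of Table 1.38 can be decoded with a short sketch. It assumes, since the text does not spell the rule out at this point, that the fields combine as value $= \pm M \times 2^{\pm E}$, with a sign bit of 1 meaning negative. The function names `decode` and `step` are hypothetical.

```python
def decode(s_m, mantissa_bits, s_e, exponent_bits):
    """Decode one row: mantissa sign, fraction bits, exponent sign, exponent bits."""
    # Assumption: the mantissa bits are a pure binary fraction (.1 -> 0.5).
    mantissa = int(mantissa_bits, 2) / 2 ** len(mantissa_bits)
    exponent = int(exponent_bits, 2)
    if s_e == 1:  # assumption: 1 marks a negative exponent
        exponent = -exponent
    value = mantissa * 2.0 ** exponent
    return -value if s_m == 1 else value

def step(s_e, exponent_bits, mantissa_width=9):
    """Gap between neighbouring values that share the given exponent."""
    exponent = int(exponent_bits, 2)
    if s_e == 1:
        exponent = -exponent
    return 2.0 ** (exponent - mantissa_width)

print(decode(0, "100000000", 1, "11111"))  # row with negative exponent
print(decode(0, "100000000", 0, "11111"))  # row with positive exponent
print(step(1, "11111"), step(0, "11111"))  # fine gap versus coarse gap
```

Under these assumptions the gap between neighbouring values scales with $2^{E}$, so it is far smaller at the negative-exponent end of the range than at the positive-exponent end.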

**The IEEE 754 Standard (Floating-point format) for 32-bit Machines**

The IEEE 754 standard for floating-point arithmetic in 32-bit computers is shown in Table 1.39.

**Table 1.39** IEEE 754 standard for 32-bit machines

| Sign ($S$) | Exponent ($E$) | Mantissa ($M$) |
|---|---|---|
| Bit 0 | Bits 1 to 8 | Bits 9 to 31 |
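The field layout of Table 1.39 can be inspected with Python's standard `struct` module. This is an illustrative sketch: the function name `ieee754_fields` is hypothetical, and the bit numbering follows the table, which counts from the sign bit. Note that IEEE 754 single precision stores the 8 exponent bits as a biased value (the true exponent plus 127).

```python
import struct

def ieee754_fields(x):
    """Split x, rounded to 32-bit IEEE 754, into (sign, exponent, mantissa)."""
    # Pack as a big-endian 32-bit float, then reread the bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # bit 0 in the table's numbering
    exponent = (bits >> 23) & 0xFF  # bits 1 to 8, stored with a bias of 127
    mantissa = bits & 0x7FFFFF      # bits 9 to 31 (23 bits)
    return sign, exponent, mantissa

for x in (1.0, -2.0, 0.15625):
    s, e, m = ieee754_fields(x)
    print(f"{x:>8}: S={s}  E={e:08b}  M={m:023b}")
```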

In this scheme, we have 23 bits reserved for representing the mantissa, one bit for the sign of the mantissa ($S$), and 8 bits for the exponent, which the standard stores in biased form (an offset of 127 is added) instead of using a separate exponent sign bit. The maximum number in this scheme is