32 bit IEEE 754: Decimal to Single Precision Floating Point Binary: 0.699 999 999 999 999 955

Convert the Number to 32 Bit Single Precision IEEE 754 Binary Floating Point Representation Standard, From a Base 10 Decimal System Number

The number 0.699 999 999 999 999 955(10), converted and written in 32 bit single precision IEEE 754 binary floating point representation (1 bit for sign, 8 bits for exponent, 23 bits for mantissa):

1. First, convert to binary (in base 2) the integer part: 0.
Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.


  • division = quotient + remainder;
  • 0 ÷ 2 = 0 + 0;

2. Construct the base 2 representation of the integer part of the number.

Take all the remainders starting from the bottom of the list constructed above.


0(10) =


0(2)
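The repeated division in steps 1 and 2 can be sketched in Python as follows. The function name `int_to_binary` is a hypothetical helper chosen for illustration; it is not part of the original procedure.

```python
def int_to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n == 0:
        return "0"  # a single division 0 / 2 = 0 remainder 0
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)       # quotient and remainder of the division
        remainders.append(str(r))
    # The remainders are read from the bottom of the list to the top.
    return "".join(reversed(remainders))


print(int_to_binary(0))  # 0
```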


3. Convert to binary (base 2) the fractional part: 0.699 999 999 999 999 955.

Multiply it repeatedly by 2.


Keep track of each integer part of the results.


Stop when we get a fractional part that is equal to zero.


  • #) multiplying = integer + fractional part;
  • 1) 0.699 999 999 999 999 955 × 2 = 1 + 0.399 999 999 999 999 91;
  • 2) 0.399 999 999 999 999 91 × 2 = 0 + 0.799 999 999 999 999 82;
  • 3) 0.799 999 999 999 999 82 × 2 = 1 + 0.599 999 999 999 999 64;
  • 4) 0.599 999 999 999 999 64 × 2 = 1 + 0.199 999 999 999 999 28;
  • 5) 0.199 999 999 999 999 28 × 2 = 0 + 0.399 999 999 999 998 56;
  • 6) 0.399 999 999 999 998 56 × 2 = 0 + 0.799 999 999 999 997 12;
  • 7) 0.799 999 999 999 997 12 × 2 = 1 + 0.599 999 999 999 994 24;
  • 8) 0.599 999 999 999 994 24 × 2 = 1 + 0.199 999 999 999 988 48;
  • 9) 0.199 999 999 999 988 48 × 2 = 0 + 0.399 999 999 999 976 96;
  • 10) 0.399 999 999 999 976 96 × 2 = 0 + 0.799 999 999 999 953 92;
  • 11) 0.799 999 999 999 953 92 × 2 = 1 + 0.599 999 999 999 907 84;
  • 12) 0.599 999 999 999 907 84 × 2 = 1 + 0.199 999 999 999 815 68;
  • 13) 0.199 999 999 999 815 68 × 2 = 0 + 0.399 999 999 999 631 36;
  • 14) 0.399 999 999 999 631 36 × 2 = 0 + 0.799 999 999 999 262 72;
  • 15) 0.799 999 999 999 262 72 × 2 = 1 + 0.599 999 999 998 525 44;
  • 16) 0.599 999 999 998 525 44 × 2 = 1 + 0.199 999 999 997 050 88;
  • 17) 0.199 999 999 997 050 88 × 2 = 0 + 0.399 999 999 994 101 76;
  • 18) 0.399 999 999 994 101 76 × 2 = 0 + 0.799 999 999 988 203 52;
  • 19) 0.799 999 999 988 203 52 × 2 = 1 + 0.599 999 999 976 407 04;
  • 20) 0.599 999 999 976 407 04 × 2 = 1 + 0.199 999 999 952 814 08;
  • 21) 0.199 999 999 952 814 08 × 2 = 0 + 0.399 999 999 905 628 16;
  • 22) 0.399 999 999 905 628 16 × 2 = 0 + 0.799 999 999 811 256 32;
  • 23) 0.799 999 999 811 256 32 × 2 = 1 + 0.599 999 999 622 512 64;
  • 24) 0.599 999 999 622 512 64 × 2 = 1 + 0.199 999 999 245 025 28;

No fractional part reached zero. However, we already have 24 bits, which is enough to fill the 23 bit mantissa after normalization, and at least one of them is 1, so we stop here. The remaining bits are discarded, which loses precision.


4. Construct the base 2 representation of the fractional part of the number.

Take all the integer parts of the multiplying operations, starting from the top of the constructed list above:


0.699 999 999 999 999 955(10) =


0.1011 0011 0011 0011 0011 0011(2)
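The repeated multiplication in steps 3 and 4 can be sketched as below. This is a minimal illustration that assumes exact rational arithmetic via Python's `fractions.Fraction`, so that the decimal input is not rounded before conversion; the helper name `fraction_to_bits` and the `max_bits` cutoff are assumptions made for this example.

```python
from fractions import Fraction


def fraction_to_bits(frac, max_bits):
    """Return up to max_bits binary digits of a fraction in [0, 1)."""
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)           # integer part of the product: 0 or 1
        bits.append(str(bit))
        frac -= bit               # keep only the fractional part
    return "".join(bits)


x = Fraction("0.699999999999999955")
print(fraction_to_bits(x, 24))  # 101100110011001100110011
```

Stopping after 24 digits mirrors the 24 multiplications listed above; a value whose fractional part reaches zero earlier simply yields fewer digits.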


5. Positive number before normalization:

0.699 999 999 999 999 955(10) =


0.1011 0011 0011 0011 0011 0011(2)

6. Normalize the binary representation of the number.

Shift the binary point 1 position to the right, so that exactly one nonzero digit remains to the left of it:


0.699 999 999 999 999 955(10) =


0.1011 0011 0011 0011 0011 0011(2) =


0.1011 0011 0011 0011 0011 0011(2) × 2^0 =


1.0110 0110 0110 0110 0110 011(2) × 2^-1
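The normalization step can be sketched as follows for a value whose integer part is zero, as is the case here. The helper name `normalize` is hypothetical, and its input is assumed to be the string of binary digits that follow the binary point.

```python
def normalize(fraction_bits):
    """Normalize 0.<fraction_bits> to the form 1.<rest> x 2^exponent.

    Assumes the integer part is 0 and at least one digit is 1.
    """
    first_one = fraction_bits.index("1")   # raises ValueError if the value is zero
    exponent = -(first_one + 1)            # number of positions the point moves right
    significand = "1." + fraction_bits[first_one + 1:]
    return significand, exponent


print(normalize("101100110011001100110011"))
# ('1.01100110011001100110011', -1)
```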


7. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point representation:

Sign 0 (a positive number)


Exponent (unadjusted): -1


Mantissa (not normalized):
1.0110 0110 0110 0110 0110 011


8. Adjust the exponent.

Use the 8 bit excess/bias notation:


Exponent (adjusted) =


Exponent (unadjusted) + 2^(8-1) - 1 =


-1 + 2^(8-1) - 1 =


(-1 + 127)(10) =


126(10)


9. Convert the adjusted exponent from the decimal (base 10) to 8 bit binary.

Use the same technique of repeatedly dividing by 2:


  • division = quotient + remainder;
  • 126 ÷ 2 = 63 + 0;
  • 63 ÷ 2 = 31 + 1;
  • 31 ÷ 2 = 15 + 1;
  • 15 ÷ 2 = 7 + 1;
  • 7 ÷ 2 = 3 + 1;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

10. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above.


Exponent (adjusted) =


126(10) =


0111 1110(2)
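Steps 8 to 10 can be condensed into a short sketch. The helper name `biased_exponent_bits` is an assumption for illustration, and the range check covers only normal numbers (it assumes no subnormals, infinities or NaNs are being encoded).

```python
def biased_exponent_bits(exponent, exponent_bits=8):
    """Add the bias 2^(exponent_bits - 1) - 1 and format as a fixed-width binary string."""
    bias = 2 ** (exponent_bits - 1) - 1        # 127 for single precision
    biased = exponent + bias
    # All-zeros and all-ones exponent fields are reserved, so reject them here.
    if not 0 < biased < 2 ** exponent_bits - 1:
        raise ValueError("exponent out of range for a normal number")
    return format(biased, "0{}b".format(exponent_bits))


print(biased_exponent_bits(-1))  # 01111110
```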


11. Normalize the mantissa.

a) Remove the leading (leftmost) bit, since it is always 1, and the binary point, if present.


b) Adjust its length to 23 bits, only if necessary (not the case here).


Mantissa (normalized) =


1.011 0011 0011 0011 0011 0011 =


011 0011 0011 0011 0011 0011
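A minimal sketch of step 11 follows. The helper name `stored_mantissa` is hypothetical. Note that this sketch truncates extra digits, as the walkthrough above does, rather than applying the round-to-nearest behavior that hardware normally uses; for this particular number both approaches give the same 23 bits.

```python
def stored_mantissa(significand, mantissa_bits=23):
    """Drop the implicit leading 1 and the binary point, then fit to mantissa_bits."""
    leading, _, fraction = significand.partition(".")
    if leading != "1":
        raise ValueError("significand must be normalized (1.xxx)")
    # Truncate if too long, pad with zeros on the right if too short.
    return fraction[:mantissa_bits].ljust(mantissa_bits, "0")


print(stored_mantissa("1.01100110011001100110011"))
# 01100110011001100110011
```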


12. The three elements that make up the number's 32 bit single precision IEEE 754 binary floating point representation:

Sign (1 bit) =
0 (a positive number)


Exponent (8 bits) =
0111 1110


Mantissa (23 bits) =
011 0011 0011 0011 0011 0011


The base ten decimal number 0.699 999 999 999 999 955 converted and written in 32 bit single precision IEEE 754 binary floating point representation:
0 - 0111 1110 - 011 0011 0011 0011 0011 0011
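The result can be cross-checked against the machine's own conversion. The sketch below uses Python's standard `struct` module to pack a value as a big-endian 32 bit float and split the bit pattern into its three fields; the function name `float32_fields` is an assumption for this example. The literal is first parsed as a 64 bit double and then rounded to 32 bits, which for this value gives the same bits as the manual procedure.

```python
import struct


def float32_fields(x):
    """Return (sign, exponent, mantissa) bit strings of x stored as IEEE 754 single precision."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret the 4 bytes as an integer
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


print(float32_fields(0.699999999999999955))
# ('0', '01111110', '01100110011001100110011')
```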
