-0.000 000 000 742 147 601 Converted to 32 Bit Single Precision IEEE 754 Binary Floating Point Representation Standard

Convert decimal -0.000 000 000 742 147 601(10) to 32 bit single precision IEEE 754 binary floating point representation standard (1 bit for sign, 8 bits for exponent, 23 bits for mantissa)

What are the steps to convert the decimal number -0.000 000 000 742 147 601(10) to 32 bit single precision IEEE 754 binary floating point representation (1 bit for sign, 8 bits for exponent, 23 bits for mantissa)?

1. Start with the positive version of the number:

|-0.000 000 000 742 147 601| = 0.000 000 000 742 147 601


2. First, convert to binary (in base 2) the integer part: 0.
Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.


  • division = quotient + remainder;
  • 0 ÷ 2 = 0 + 0;

3. Construct the base 2 representation of the integer part of the number.

Take all the remainders starting from the bottom of the list constructed above.

0(10) =


0(2)
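
The repeated division of steps 2 and 3 can be sketched in Python. This is only an illustration: the function name `int_to_binary` is made up for this example and does not come from the procedure itself.

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative integer to a base 2 string by repeated division by 2."""
    if n == 0:
        return "0"  # dividing 0 by 2 gives quotient 0, remainder 0: the single digit 0
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient and remainder in one step
        remainders.append(str(remainder))
    # The last remainder is the leftmost binary digit, so read the list bottom-up.
    return "".join(reversed(remainders))


print(int_to_binary(0))   # 0
print(int_to_binary(25))  # 11001
```

The integer part here is 0, so the loop never runs; the second call shows the same routine on a non-trivial input.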


4. Convert to binary (base 2) the fractional part: 0.000 000 000 742 147 601.

Multiply it repeatedly by 2.


Keep track of each integer part of the results.


Stop when we get a fractional part that is equal to zero.


  • #) multiplying = integer + fractional part;
  • 1) 0.000 000 000 742 147 601 × 2 = 0 + 0.000 000 001 484 295 202;
  • 2) 0.000 000 001 484 295 202 × 2 = 0 + 0.000 000 002 968 590 404;
  • 3) 0.000 000 002 968 590 404 × 2 = 0 + 0.000 000 005 937 180 808;
  • 4) 0.000 000 005 937 180 808 × 2 = 0 + 0.000 000 011 874 361 616;
  • 5) 0.000 000 011 874 361 616 × 2 = 0 + 0.000 000 023 748 723 232;
  • 6) 0.000 000 023 748 723 232 × 2 = 0 + 0.000 000 047 497 446 464;
  • 7) 0.000 000 047 497 446 464 × 2 = 0 + 0.000 000 094 994 892 928;
  • 8) 0.000 000 094 994 892 928 × 2 = 0 + 0.000 000 189 989 785 856;
  • 9) 0.000 000 189 989 785 856 × 2 = 0 + 0.000 000 379 979 571 712;
  • 10) 0.000 000 379 979 571 712 × 2 = 0 + 0.000 000 759 959 143 424;
  • 11) 0.000 000 759 959 143 424 × 2 = 0 + 0.000 001 519 918 286 848;
  • 12) 0.000 001 519 918 286 848 × 2 = 0 + 0.000 003 039 836 573 696;
  • 13) 0.000 003 039 836 573 696 × 2 = 0 + 0.000 006 079 673 147 392;
  • 14) 0.000 006 079 673 147 392 × 2 = 0 + 0.000 012 159 346 294 784;
  • 15) 0.000 012 159 346 294 784 × 2 = 0 + 0.000 024 318 692 589 568;
  • 16) 0.000 024 318 692 589 568 × 2 = 0 + 0.000 048 637 385 179 136;
  • 17) 0.000 048 637 385 179 136 × 2 = 0 + 0.000 097 274 770 358 272;
  • 18) 0.000 097 274 770 358 272 × 2 = 0 + 0.000 194 549 540 716 544;
  • 19) 0.000 194 549 540 716 544 × 2 = 0 + 0.000 389 099 081 433 088;
  • 20) 0.000 389 099 081 433 088 × 2 = 0 + 0.000 778 198 162 866 176;
  • 21) 0.000 778 198 162 866 176 × 2 = 0 + 0.001 556 396 325 732 352;
  • 22) 0.001 556 396 325 732 352 × 2 = 0 + 0.003 112 792 651 464 704;
  • 23) 0.003 112 792 651 464 704 × 2 = 0 + 0.006 225 585 302 929 408;
  • 24) 0.006 225 585 302 929 408 × 2 = 0 + 0.012 451 170 605 858 816;
  • 25) 0.012 451 170 605 858 816 × 2 = 0 + 0.024 902 341 211 717 632;
  • 26) 0.024 902 341 211 717 632 × 2 = 0 + 0.049 804 682 423 435 264;
  • 27) 0.049 804 682 423 435 264 × 2 = 0 + 0.099 609 364 846 870 528;
  • 28) 0.099 609 364 846 870 528 × 2 = 0 + 0.199 218 729 693 741 056;
  • 29) 0.199 218 729 693 741 056 × 2 = 0 + 0.398 437 459 387 482 112;
  • 30) 0.398 437 459 387 482 112 × 2 = 0 + 0.796 874 918 774 964 224;
  • 31) 0.796 874 918 774 964 224 × 2 = 1 + 0.593 749 837 549 928 448;
  • 32) 0.593 749 837 549 928 448 × 2 = 1 + 0.187 499 675 099 856 896;
  • 33) 0.187 499 675 099 856 896 × 2 = 0 + 0.374 999 350 199 713 792;
  • 34) 0.374 999 350 199 713 792 × 2 = 0 + 0.749 998 700 399 427 584;
  • 35) 0.749 998 700 399 427 584 × 2 = 1 + 0.499 997 400 798 855 168;
  • 36) 0.499 997 400 798 855 168 × 2 = 0 + 0.999 994 801 597 710 336;
  • 37) 0.999 994 801 597 710 336 × 2 = 1 + 0.999 989 603 195 420 672;
  • 38) 0.999 989 603 195 420 672 × 2 = 1 + 0.999 979 206 390 841 344;
  • 39) 0.999 979 206 390 841 344 × 2 = 1 + 0.999 958 412 781 682 688;
  • 40) 0.999 958 412 781 682 688 × 2 = 1 + 0.999 916 825 563 365 376;
  • 41) 0.999 916 825 563 365 376 × 2 = 1 + 0.999 833 651 126 730 752;
  • 42) 0.999 833 651 126 730 752 × 2 = 1 + 0.999 667 302 253 461 504;
  • 43) 0.999 667 302 253 461 504 × 2 = 1 + 0.999 334 604 506 923 008;
  • 44) 0.999 334 604 506 923 008 × 2 = 1 + 0.998 669 209 013 846 016;
  • 45) 0.998 669 209 013 846 016 × 2 = 1 + 0.997 338 418 027 692 032;
  • 46) 0.997 338 418 027 692 032 × 2 = 1 + 0.994 676 836 055 384 064;
  • 47) 0.994 676 836 055 384 064 × 2 = 1 + 0.989 353 672 110 768 128;
  • 48) 0.989 353 672 110 768 128 × 2 = 1 + 0.978 707 344 221 536 256;
  • 49) 0.978 707 344 221 536 256 × 2 = 1 + 0.957 414 688 443 072 512;
  • 50) 0.957 414 688 443 072 512 × 2 = 1 + 0.914 829 376 886 145 024;
  • 51) 0.914 829 376 886 145 024 × 2 = 1 + 0.829 658 753 772 290 048;
  • 52) 0.829 658 753 772 290 048 × 2 = 1 + 0.659 317 507 544 580 096;
  • 53) 0.659 317 507 544 580 096 × 2 = 1 + 0.318 635 015 089 160 192;
  • 54) 0.318 635 015 089 160 192 × 2 = 0 + 0.637 270 030 178 320 384;

We didn't get any fractional part that was equal to zero. But we had enough iterations to fill the 23 bit mantissa after the first non-zero bit, and at least one integer part that was different from zero => FULL STOP (losing precision: the converted number we get in the end will only be a very close approximation of the initial one).


5. Construct the base 2 representation of the fractional part of the number.

Take all the integer parts of the multiplying operations, starting from the top of the constructed list above:


0.000 000 000 742 147 601(10) =


0.0000 0000 0000 0000 0000 0000 0000 0011 0010 1111 1111 1111 1111 10(2)
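
The repeated multiplication of steps 4 and 5 can be reproduced with exact rational arithmetic, which avoids the rounding that ordinary Python floats would introduce. A minimal sketch, assuming a hypothetical helper named `fraction_to_bits` and a cap of 54 iterations as in the list above:

```python
from fractions import Fraction


def fraction_to_bits(fraction: Fraction, max_steps: int) -> str:
    """Return the leading binary digits of a fraction in [0, 1) by repeated doubling."""
    bits = []
    for _ in range(max_steps):
        if fraction == 0:
            break  # the expansion terminated exactly
        fraction *= 2
        integer_part = int(fraction)  # always 0 or 1
        bits.append(str(integer_part))
        fraction -= integer_part      # keep only the fractional part
    return "".join(bits)


bits = fraction_to_bits(Fraction("0.000000000742147601"), 54)
print(bits)
print(bits.index("1") + 1)  # 31: position of the first non-zero bit
```

`Fraction` is built from the decimal string, so every doubling is exact, just like the hand calculation.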

6. Positive number before normalization:

0.000 000 000 742 147 601(10) =


0.0000 0000 0000 0000 0000 0000 0000 0011 0010 1111 1111 1111 1111 10(2)

7. Normalize the binary representation of the number.

Shift the decimal mark 31 positions to the right, so that only one non zero digit remains to the left of it:


0.000 000 000 742 147 601(10) =


0.0000 0000 0000 0000 0000 0000 0000 0011 0010 1111 1111 1111 1111 10(2) =


0.0000 0000 0000 0000 0000 0000 0000 0011 0010 1111 1111 1111 1111 10(2) × 2^0 =


1.1001 0111 1111 1111 1111 110(2) × 2^-31
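
Normalization amounts to locating the first 1 bit and counting how far the point has to move. A small sketch, assuming the integer and fractional parts are held as bit strings (the name `normalize` is chosen for this example only):

```python
def normalize(integer_bits: str, fraction_bits: str) -> tuple:
    """Return (significand, exponent) so that the value is significand × 2^exponent.

    The significand is a string with exactly one non-zero digit before the point.
    A value of zero has no 1 bit, so str.index raises ValueError in that case.
    """
    all_bits = integer_bits + fraction_bits
    first_one = all_bits.index("1")
    # A positive exponent means the point moved left, a negative one means right.
    exponent = len(integer_bits) - 1 - first_one
    digits = all_bits[first_one:]
    return digits[0] + "." + digits[1:], exponent


fraction_bits = "0" * 30 + "110010" + "1" * 17 + "0"  # the 54 bits from step 5
print(normalize("0", fraction_bits))
# ('1.10010111111111111111110', -31)
```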


8. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point representation:

Sign 1 (a negative number)


Exponent (unadjusted): -31


Mantissa (not normalized):
1.1001 0111 1111 1111 1111 110


9. Adjust the exponent.

Use the 8 bit excess/bias notation:


Exponent (adjusted) =


Exponent (unadjusted) + 2^(8-1) - 1 =


-31 + 2^(8-1) - 1 =


(-31 + 127)(10) =


96(10)


10. Convert the adjusted exponent from the decimal (base 10) to 8 bit binary.

Use the same technique of repeatedly dividing by 2:


  • division = quotient + remainder;
  • 96 ÷ 2 = 48 + 0;
  • 48 ÷ 2 = 24 + 0;
  • 24 ÷ 2 = 12 + 0;
  • 12 ÷ 2 = 6 + 0;
  • 6 ÷ 2 = 3 + 0;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

11. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above.


Exponent (adjusted) =


96(10) =


0110 0000(2)
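
Steps 9 to 11 reduce to adding the bias and padding the result to 8 bits. The sketch below uses hypothetical names (`BIAS`, `biased_exponent_bits`) and assumes a normal number, for which the adjusted exponent lies between 1 and 254 (0 and 255 are reserved for zero, subnormals, infinities and NaN):

```python
EXPONENT_BITS = 8
BIAS = 2 ** (EXPONENT_BITS - 1) - 1  # 127 for single precision


def biased_exponent_bits(unadjusted: int) -> str:
    """Return the 8 bit excess-127 encoding of an unadjusted exponent."""
    adjusted = unadjusted + BIAS
    if not 1 <= adjusted <= 254:
        raise ValueError("exponent outside the normal single precision range")
    return format(adjusted, "08b")  # zero-padded to 8 binary digits


print(biased_exponent_bits(-31))  # 01100000
```

The leading zero matters: the division list yields the seven digits 110 0000, and the field must be padded on the left to 8 bits.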


12. Normalize the mantissa.

a) Remove the leading (leftmost) bit, since it's always 1, together with the decimal point, if present.


b) Adjust its length to 23 bits, only if necessary (not the case here).


Mantissa (normalized) =


1.100 1011 1111 1111 1111 1110 =


100 1011 1111 1111 1111 1110
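
The two sub-steps, dropping the implicit leading 1 and fitting the rest into 23 bits, can be sketched as follows. The function name `stored_mantissa` is an assumption of this example, and excess bits are simply cut off, as in the procedure described here:

```python
MANTISSA_BITS = 23


def stored_mantissa(normalized: str) -> str:
    """Turn a normalized significand such as '1.1001...' into the 23 stored bits."""
    leading, _, fraction = normalized.partition(".")
    if leading != "1":
        raise ValueError("significand must be normalized to a single leading 1")
    # Cut excess bits on the right, or pad with zeros if there are too few.
    return fraction[:MANTISSA_BITS].ljust(MANTISSA_BITS, "0")


print(stored_mantissa("1.10010111111111111111110"))
# 10010111111111111111110
```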


13. The three elements that make up the number's 32 bit single precision IEEE 754 binary floating point representation:

Sign (1 bit) =
1 (a negative number)


Exponent (8 bits) =
0110 0000


Mantissa (23 bits) =
100 1011 1111 1111 1111 1110


Decimal number -0.000 000 000 742 147 601 converted to 32 bit single precision IEEE 754 binary floating point representation:

1 - 0110 0000 - 100 1011 1111 1111 1111 1110
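
To see how close the result is to the original number, the three fields can be decoded again with the usual formula for normal numbers, value = (-1)^sign × (1 + mantissa / 2^23) × 2^(exponent - 127). The sketch below is a check under that assumption; `decode_single` is a name invented for this example:

```python
from fractions import Fraction


def decode_single(sign: str, exponent: str, mantissa: str) -> Fraction:
    """Decode the three bit fields of a normal single precision number exactly."""
    s = int(sign, 2)
    e = int(exponent, 2)
    m = int(mantissa, 2)
    significand = 1 + Fraction(m, 2 ** 23)  # restore the implicit leading 1
    return (-1) ** s * significand * Fraction(2) ** (e - 127)


value = decode_single("1", "01100000", "10010111111111111111110")
original = Fraction("-0.000000000742147601")
print(float(value))             # the value actually stored
print(float(value - original))  # the (tiny) conversion error
```

The error is smaller than 2^-54, the weight of the last mantissa bit at this exponent, which is what "a very good approximation" means here.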


How to convert decimal numbers from base ten to 32 bit single precision IEEE 754 binary floating point standard

Follow the steps below to convert a base 10 decimal number to 32 bit single precision IEEE 754 binary floating point:

  • 1. If the number to be converted is negative, start with its positive version.
  • 2. First convert the integer part. Repeatedly divide the base ten positive integer part by 2, keeping track of each remainder, until we get a quotient that is equal to zero.
  • 3. Construct the base 2 representation of the positive integer part of the number, by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above. Thus, the last remainder of the divisions becomes the first symbol (the leftmost) of the base two number, while the first remainder becomes the last symbol (the rightmost).
  • 4. Then convert the fractional part. Multiply the number repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero or, if that never happens, until we have enough bits to fill the mantissa.
  • 5. Construct the base 2 representation of the fractional part of the number by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above (they should appear in the binary representation, from left to right, in the order they have been calculated).
  • 6. Normalize the binary representation of the number, by shifting the decimal point (or if you prefer, the decimal mark) "n" positions either to the left or to the right, so that only one non zero digit remains to the left of the decimal point.
  • 7. Adjust the exponent in 8 bit excess/bias notation and then convert it from decimal (base 10) to 8 bit binary, by using the same technique of repeatedly dividing by 2, as shown above:
    Exponent (adjusted) = Exponent (unadjusted) + 2^(8-1) - 1
  • 8. Normalize the mantissa: remove the leading (leftmost) bit, since it's always '1' (and the decimal point, if present), and adjust its length to 23 bits, either by removing the excess bits from the right (losing precision...) or by adding extra '0' bits to the right.
  • 9. Sign (it takes 1 bit) is either 1 for a negative or 0 for a positive number.
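
The nine steps above can be condensed into one short routine. This is a sketch under stated assumptions, not a complete IEEE 754 encoder: the name `to_single_truncated` is hypothetical, excess mantissa bits are truncated as in step 8, and only non-zero values in the normal range are handled (no zero, subnormals, infinities or NaN).

```python
from fractions import Fraction


def to_single_truncated(decimal_text: str) -> str:
    """Return 'sign - exponent - mantissa' for a non-zero decimal string."""
    value = Fraction(decimal_text)  # exact, no binary rounding on input
    if value == 0:
        raise ValueError("zero is not handled by this sketch")
    sign = "1" if value < 0 else "0"
    value = abs(value)

    # Normalize: find the exponent for which 1 <= value < 2.
    exponent = 0
    while value >= 2:
        value /= 2
        exponent += 1
    while value < 1:
        value *= 2
        exponent -= 1

    adjusted = exponent + 127  # 2^(8-1) - 1 = 127
    if not 1 <= adjusted <= 254:
        raise ValueError("outside the normal single precision range")

    # Drop the leading 1 and keep 23 fraction bits, truncating the rest.
    mantissa = int((value - 1) * 2 ** 23)
    return f"{sign} - {adjusted:08b} - {mantissa:023b}"


print(to_single_truncated("-0.000000000742147601"))
# 1 - 01100000 - 10010111111111111111110
```

Working on the exact fraction instead of bit strings gives the same digits as the hand method, because both discard everything beyond the 23rd mantissa bit.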

Example: convert the negative number -25.347 from decimal system (base ten) to 32 bit single precision IEEE 754 binary floating point:

  • 1. Start with the positive version of the number:

    |-25.347| = 25.347

  • 2. First convert the integer part, 25. Divide it repeatedly by 2, keeping track of each remainder, until we get a quotient that is equal to zero:
    • division = quotient + remainder;
    • 25 ÷ 2 = 12 + 1;
    • 12 ÷ 2 = 6 + 0;
    • 6 ÷ 2 = 3 + 0;
    • 3 ÷ 2 = 1 + 1;
    • 1 ÷ 2 = 0 + 1;
    • We have encountered a quotient that is ZERO => FULL STOP
  • 3. Construct the base 2 representation of the integer part of the number by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above:

    25(10) = 1 1001(2)

  • 4. Then convert the fractional part, 0.347. Multiply repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero:
    • #) multiplying = integer + fractional part;
    • 1) 0.347 × 2 = 0 + 0.694;
    • 2) 0.694 × 2 = 1 + 0.388;
    • 3) 0.388 × 2 = 0 + 0.776;
    • 4) 0.776 × 2 = 1 + 0.552;
    • 5) 0.552 × 2 = 1 + 0.104;
    • 6) 0.104 × 2 = 0 + 0.208;
    • 7) 0.208 × 2 = 0 + 0.416;
    • 8) 0.416 × 2 = 0 + 0.832;
    • 9) 0.832 × 2 = 1 + 0.664;
    • 10) 0.664 × 2 = 1 + 0.328;
    • 11) 0.328 × 2 = 0 + 0.656;
    • 12) 0.656 × 2 = 1 + 0.312;
    • 13) 0.312 × 2 = 0 + 0.624;
    • 14) 0.624 × 2 = 1 + 0.248;
    • 15) 0.248 × 2 = 0 + 0.496;
    • 16) 0.496 × 2 = 0 + 0.992;
    • 17) 0.992 × 2 = 1 + 0.984;
    • 18) 0.984 × 2 = 1 + 0.968;
    • 19) 0.968 × 2 = 1 + 0.936;
    • 20) 0.936 × 2 = 1 + 0.872;
    • 21) 0.872 × 2 = 1 + 0.744;
    • 22) 0.744 × 2 = 1 + 0.488;
    • 23) 0.488 × 2 = 0 + 0.976;
    • 24) 0.976 × 2 = 1 + 0.952;
    • We didn't get any fractional part that was equal to zero. But we had enough iterations (over Mantissa limit = 23) and at least one integer part that was different from zero => FULL STOP (losing precision...).
  • 5. Construct the base 2 representation of the fractional part of the number, by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above:

    0.347(10) = 0.0101 1000 1101 0100 1111 1101(2)

  • 6. Summarizing - the positive number before normalization:

    25.347(10) = 1 1001.0101 1000 1101 0100 1111 1101(2)

  • 7. Normalize the binary representation of the number, shifting the decimal point 4 positions to the left so that only one non-zero digit stays to the left of the decimal point:

    25.347(10) =
    1 1001.0101 1000 1101 0100 1111 1101(2) =
    1 1001.0101 1000 1101 0100 1111 1101(2) × 2^0 =
    1.1001 0101 1000 1101 0100 1111 1101(2) × 2^4

  • 8. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point:

    Sign: 1 (a negative number)

    Exponent (unadjusted): 4

    Mantissa (not-normalized): 1.1001 0101 1000 1101 0100 1111 1101

  • 9. Adjust the exponent in 8 bit excess/bias notation and then convert it from decimal (base 10) to 8 bit binary (base 2), by using the same technique of repeatedly dividing it by 2, as already demonstrated above:

    Exponent (adjusted) = Exponent (unadjusted) + 2^(8-1) - 1 = (4 + 127)(10) = 131(10) =
    1000 0011(2)

  • 10. Normalize the mantissa: remove the leading (leftmost) bit, since it's always '1' (and the decimal point), and adjust its length to 23 bits by removing the excess bits from the right (losing precision...):

    Mantissa (not-normalized): 1.1001 0101 1000 1101 0100 1111 1101

    Mantissa (normalized): 100 1010 1100 0110 1010 0111

  • Conclusion:

    Sign (1 bit) = 1 (a negative number)

    Exponent (8 bits) = 1000 0011

    Mantissa (23 bits) = 100 1010 1100 0110 1010 0111

  • Number -25.347, converted from the decimal system (base 10) to 32 bit single precision IEEE 754 binary floating point =
    1 - 1000 0011 - 100 1010 1100 0110 1010 0111
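
The result can be compared with what Python's standard `struct` module produces. One caveat applies: the steps above cut off the excess mantissa bits, while a hardware conversion normally rounds to the nearest representable value, so the two can differ in the last bit. The sketch below assumes a typical platform with the default round-to-nearest mode; the variable names are illustrative.

```python
import struct

# Bit pattern from the conclusion above: sign, exponent, mantissa.
truncated_bits = "1" + "10000011" + "10010101100011010100111"
truncated_word = int(truncated_bits, 2)
print(hex(truncated_word))  # 0xc1cac6a7

# Decode the truncated pattern back into a number.
(decoded,) = struct.unpack(">f", truncated_word.to_bytes(4, "big"))
print(decoded)  # close to -25.347, slightly smaller in magnitude

# Let the platform convert -25.347 itself and read back the raw 32 bits.
(rounded_word,) = struct.unpack(">I", struct.pack(">f", -25.347))
print(hex(rounded_word))
print(rounded_word - truncated_word)  # difference in units of the last bit
```

Here the discarded bits (1 1101...) amount to more than half of the last mantissa bit, which is why rounding to nearest moves the last bit up by one while truncation leaves it unchanged.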