100 001 101 010 099 999 999 999 709 Converted to 32 Bit Single Precision IEEE 754 Binary Floating Point Representation Standard

Convert decimal 100 001 101 010 099 999 999 999 709(10) to 32 bit single precision IEEE 754 binary floating point representation standard (1 bit for sign, 8 bits for exponent, 23 bits for mantissa)

What are the steps to convert the decimal number 100 001 101 010 099 999 999 999 709(10) to 32 bit single precision IEEE 754 binary floating point representation (1 bit for sign, 8 bits for exponent, 23 bits for mantissa)?

1. Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.


  • division = quotient + remainder;
  • 100 001 101 010 099 999 999 999 709 ÷ 2 = 50 000 550 505 049 999 999 999 854 + 1;
  • 50 000 550 505 049 999 999 999 854 ÷ 2 = 25 000 275 252 524 999 999 999 927 + 0;
  • 25 000 275 252 524 999 999 999 927 ÷ 2 = 12 500 137 626 262 499 999 999 963 + 1;
  • 12 500 137 626 262 499 999 999 963 ÷ 2 = 6 250 068 813 131 249 999 999 981 + 1;
  • 6 250 068 813 131 249 999 999 981 ÷ 2 = 3 125 034 406 565 624 999 999 990 + 1;
  • 3 125 034 406 565 624 999 999 990 ÷ 2 = 1 562 517 203 282 812 499 999 995 + 0;
  • 1 562 517 203 282 812 499 999 995 ÷ 2 = 781 258 601 641 406 249 999 997 + 1;
  • 781 258 601 641 406 249 999 997 ÷ 2 = 390 629 300 820 703 124 999 998 + 1;
  • 390 629 300 820 703 124 999 998 ÷ 2 = 195 314 650 410 351 562 499 999 + 0;
  • 195 314 650 410 351 562 499 999 ÷ 2 = 97 657 325 205 175 781 249 999 + 1;
  • 97 657 325 205 175 781 249 999 ÷ 2 = 48 828 662 602 587 890 624 999 + 1;
  • 48 828 662 602 587 890 624 999 ÷ 2 = 24 414 331 301 293 945 312 499 + 1;
  • 24 414 331 301 293 945 312 499 ÷ 2 = 12 207 165 650 646 972 656 249 + 1;
  • 12 207 165 650 646 972 656 249 ÷ 2 = 6 103 582 825 323 486 328 124 + 1;
  • 6 103 582 825 323 486 328 124 ÷ 2 = 3 051 791 412 661 743 164 062 + 0;
  • 3 051 791 412 661 743 164 062 ÷ 2 = 1 525 895 706 330 871 582 031 + 0;
  • 1 525 895 706 330 871 582 031 ÷ 2 = 762 947 853 165 435 791 015 + 1;
  • 762 947 853 165 435 791 015 ÷ 2 = 381 473 926 582 717 895 507 + 1;
  • 381 473 926 582 717 895 507 ÷ 2 = 190 736 963 291 358 947 753 + 1;
  • 190 736 963 291 358 947 753 ÷ 2 = 95 368 481 645 679 473 876 + 1;
  • 95 368 481 645 679 473 876 ÷ 2 = 47 684 240 822 839 736 938 + 0;
  • 47 684 240 822 839 736 938 ÷ 2 = 23 842 120 411 419 868 469 + 0;
  • 23 842 120 411 419 868 469 ÷ 2 = 11 921 060 205 709 934 234 + 1;
  • 11 921 060 205 709 934 234 ÷ 2 = 5 960 530 102 854 967 117 + 0;
  • 5 960 530 102 854 967 117 ÷ 2 = 2 980 265 051 427 483 558 + 1;
  • 2 980 265 051 427 483 558 ÷ 2 = 1 490 132 525 713 741 779 + 0;
  • 1 490 132 525 713 741 779 ÷ 2 = 745 066 262 856 870 889 + 1;
  • 745 066 262 856 870 889 ÷ 2 = 372 533 131 428 435 444 + 1;
  • 372 533 131 428 435 444 ÷ 2 = 186 266 565 714 217 722 + 0;
  • 186 266 565 714 217 722 ÷ 2 = 93 133 282 857 108 861 + 0;
  • 93 133 282 857 108 861 ÷ 2 = 46 566 641 428 554 430 + 1;
  • 46 566 641 428 554 430 ÷ 2 = 23 283 320 714 277 215 + 0;
  • 23 283 320 714 277 215 ÷ 2 = 11 641 660 357 138 607 + 1;
  • 11 641 660 357 138 607 ÷ 2 = 5 820 830 178 569 303 + 1;
  • 5 820 830 178 569 303 ÷ 2 = 2 910 415 089 284 651 + 1;
  • 2 910 415 089 284 651 ÷ 2 = 1 455 207 544 642 325 + 1;
  • 1 455 207 544 642 325 ÷ 2 = 727 603 772 321 162 + 1;
  • 727 603 772 321 162 ÷ 2 = 363 801 886 160 581 + 0;
  • 363 801 886 160 581 ÷ 2 = 181 900 943 080 290 + 1;
  • 181 900 943 080 290 ÷ 2 = 90 950 471 540 145 + 0;
  • 90 950 471 540 145 ÷ 2 = 45 475 235 770 072 + 1;
  • 45 475 235 770 072 ÷ 2 = 22 737 617 885 036 + 0;
  • 22 737 617 885 036 ÷ 2 = 11 368 808 942 518 + 0;
  • 11 368 808 942 518 ÷ 2 = 5 684 404 471 259 + 0;
  • 5 684 404 471 259 ÷ 2 = 2 842 202 235 629 + 1;
  • 2 842 202 235 629 ÷ 2 = 1 421 101 117 814 + 1;
  • 1 421 101 117 814 ÷ 2 = 710 550 558 907 + 0;
  • 710 550 558 907 ÷ 2 = 355 275 279 453 + 1;
  • 355 275 279 453 ÷ 2 = 177 637 639 726 + 1;
  • 177 637 639 726 ÷ 2 = 88 818 819 863 + 0;
  • 88 818 819 863 ÷ 2 = 44 409 409 931 + 1;
  • 44 409 409 931 ÷ 2 = 22 204 704 965 + 1;
  • 22 204 704 965 ÷ 2 = 11 102 352 482 + 1;
  • 11 102 352 482 ÷ 2 = 5 551 176 241 + 0;
  • 5 551 176 241 ÷ 2 = 2 775 588 120 + 1;
  • 2 775 588 120 ÷ 2 = 1 387 794 060 + 0;
  • 1 387 794 060 ÷ 2 = 693 897 030 + 0;
  • 693 897 030 ÷ 2 = 346 948 515 + 0;
  • 346 948 515 ÷ 2 = 173 474 257 + 1;
  • 173 474 257 ÷ 2 = 86 737 128 + 1;
  • 86 737 128 ÷ 2 = 43 368 564 + 0;
  • 43 368 564 ÷ 2 = 21 684 282 + 0;
  • 21 684 282 ÷ 2 = 10 842 141 + 0;
  • 10 842 141 ÷ 2 = 5 421 070 + 1;
  • 5 421 070 ÷ 2 = 2 710 535 + 0;
  • 2 710 535 ÷ 2 = 1 355 267 + 1;
  • 1 355 267 ÷ 2 = 677 633 + 1;
  • 677 633 ÷ 2 = 338 816 + 1;
  • 338 816 ÷ 2 = 169 408 + 0;
  • 169 408 ÷ 2 = 84 704 + 0;
  • 84 704 ÷ 2 = 42 352 + 0;
  • 42 352 ÷ 2 = 21 176 + 0;
  • 21 176 ÷ 2 = 10 588 + 0;
  • 10 588 ÷ 2 = 5 294 + 0;
  • 5 294 ÷ 2 = 2 647 + 0;
  • 2 647 ÷ 2 = 1 323 + 1;
  • 1 323 ÷ 2 = 661 + 1;
  • 661 ÷ 2 = 330 + 1;
  • 330 ÷ 2 = 165 + 0;
  • 165 ÷ 2 = 82 + 1;
  • 82 ÷ 2 = 41 + 0;
  • 41 ÷ 2 = 20 + 1;
  • 20 ÷ 2 = 10 + 0;
  • 10 ÷ 2 = 5 + 0;
  • 5 ÷ 2 = 2 + 1;
  • 2 ÷ 2 = 1 + 0;
  • 1 ÷ 2 = 0 + 1;

2. Construct the base 2 representation of the positive number.

Take all the remainders starting from the bottom of the list constructed above.

100 001 101 010 099 999 999 999 709(10) =


101 0010 1011 1000 0000 1110 1000 1100 0101 1101 1011 0001 0101 1111 0100 1101 0100 1111 0011 1110 1101 1101(2)
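The repeated division in steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the function name `to_binary` is made up for this sketch and is not part of the procedure above. Python integers have arbitrary size, so the 27 digit number needs no special handling.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to base 2 by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # quotient and remainder of the division
        remainders.append(str(r))
    # The last remainder is the leftmost bit, so read the list from the bottom.
    return "".join(reversed(remainders))

bits = to_binary(100001101010099999999999709)
print(len(bits))   # 87
print(bits)
```

The result has 87 binary digits, which is why the next step shifts the point 86 positions.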


3. Normalize the binary representation of the number.

Shift the decimal mark 86 positions to the left, so that only one non zero digit remains to the left of it:


100 001 101 010 099 999 999 999 709(10) =


101 0010 1011 1000 0000 1110 1000 1100 0101 1101 1011 0001 0101 1111 0100 1101 0100 1111 0011 1110 1101 1101(2) =


101 0010 1011 1000 0000 1110 1000 1100 0101 1101 1011 0001 0101 1111 0100 1101 0100 1111 0011 1110 1101 1101(2) × 2^0 =


1.0100 1010 1110 0000 0011 1010 0011 0001 0111 0110 1100 0101 0111 1101 0011 0101 0011 1100 1111 1011 0111 01(2) × 2^86
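As a rough check of the normalization step, the exponent is simply the number of binary digits minus one. The sketch below uses Python's built-in `bin` in place of the manual division; the variable names are illustrative only.

```python
n = 100001101010099999999999709
bits = bin(n)[2:]                       # same digits as the repeated division
exponent = len(bits) - 1                # positions the point moves to the left
significand = bits[0] + "." + bits[1:]  # one non-zero digit before the point

print(exponent)      # 86
print(significand)
```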


4. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point representation:

Sign: 0 (a positive number)


Exponent (unadjusted): 86


Mantissa (not normalized):
1.0100 1010 1110 0000 0011 1010 0011 0001 0111 0110 1100 0101 0111 1101 0011 0101 0011 1100 1111 1011 0111 01


5. Adjust the exponent.

Use the 8 bit excess/bias notation:


Exponent (adjusted) =


Exponent (unadjusted) + 2^(8-1) - 1 =


86 + 2^(8-1) - 1 =


(86 + 127)(10) =


213(10)


6. Convert the adjusted exponent from the decimal (base 10) to 8 bit binary.

Use the same technique of repeatedly dividing by 2:


  • division = quotient + remainder;
  • 213 ÷ 2 = 106 + 1;
  • 106 ÷ 2 = 53 + 0;
  • 53 ÷ 2 = 26 + 1;
  • 26 ÷ 2 = 13 + 0;
  • 13 ÷ 2 = 6 + 1;
  • 6 ÷ 2 = 3 + 0;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

7. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above.


Exponent (adjusted) =


213(10) =


1101 0101(2)
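Steps 5 to 7 amount to adding the bias and writing the sum with 8 binary digits. A minimal sketch, with constant and variable names chosen only for this illustration:

```python
EXPONENT_BITS = 8
bias = 2 ** (EXPONENT_BITS - 1) - 1        # 127 for single precision

unadjusted = 86
adjusted = unadjusted + bias
exponent_field = format(adjusted, "08b")   # zero-padded to 8 binary digits

print(adjusted, exponent_field)   # 213 11010101
```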


8. Normalize the mantissa.

a) Remove the leading (leftmost) bit, since it's always 1, and the decimal point, if present.


b) Adjust its length to 23 bits by removing the excess bits from the right (if any of the excess bits is set to 1, we are losing precision...).


Mantissa (normalized) =


1. 010 0101 0111 0000 0001 1101 000 1100 0101 1101 1011 0001 0101 1111 0100 1101 0100 1111 0011 1110 1101 1101 =


010 0101 0111 0000 0001 1101
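The mantissa step can be sketched as plain string slicing. Note that this mirrors the truncation described above (dropping the excess bits); it is not a rounding routine, and the variable names are illustrative.

```python
n = 100001101010099999999999709
bits = bin(n)[2:]

fraction_bits = bits[1:]                       # drop the implicit leading 1
mantissa = fraction_bits[:23].ljust(23, "0")   # keep 23 bits, pad if shorter
dropped = fraction_bits[23:]                   # excess bits removed on the right

print(mantissa)         # 01001010111000000011101
print("1" in dropped)   # True: some precision is lost
```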


9. The three elements that make up the number's 32 bit single precision IEEE 754 binary floating point representation:

Sign (1 bit) =
0 (a positive number)


Exponent (8 bits) =
1101 0101


Mantissa (23 bits) =
010 0101 0111 0000 0001 1101


Decimal number 100 001 101 010 099 999 999 999 709 converted to 32 bit single precision IEEE 754 binary floating point representation:

0 - 1101 0101 - 010 0101 0111 0000 0001 1101
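One way to double-check the three fields is to glue them into a 32 bit word and compare it with what Python's standard `struct` module produces for the same value. This is only a sanity check under the assumption that the platform's `float` is an IEEE 754 double, which is the case on common systems.

```python
import struct

sign, exponent, mantissa = "0", "11010101", "01001010111000000011101"
word = int(sign + exponent + mantissa, 2)
print(f"0x{word:08X}")   # 0x6AA5701D

# Let the standard library convert the same number to single precision.
value = float(100001101010099999999999709)
packed = struct.unpack(">I", struct.pack(">f", value))[0]
print(packed == word)    # True
```

The two agree here because the first bit removed from the mantissa is 0, so truncating and rounding to nearest give the same 23 bits.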


How to convert decimal numbers from base ten to 32 bit single precision IEEE 754 binary floating point standard

Follow the steps below to convert a base 10 decimal number to 32 bit single precision IEEE 754 binary floating point:

  • 1. If the number to be converted is negative, start with its positive version.
  • 2. First convert the integer part. Divide repeatedly by 2 the base ten positive representation of the integer number that is to be converted to binary, until we get a quotient that is equal to zero, keeping track of each remainder.
  • 3. Construct the base 2 representation of the positive integer part of the number, by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above. Thus, the last remainder of the divisions becomes the first symbol (the leftmost) of the base two number, while the first remainder becomes the last symbol (the rightmost).
  • 4. Then convert the fractional part. Multiply the number repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero or, if that never happens, until we have collected enough bits to fill the 23 bit mantissa.
  • 5. Construct the base 2 representation of the fractional part of the number by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above (they should appear in the binary representation, from left to right, in the order they have been calculated).
  • 6. Normalize the binary representation of the number, by shifting the decimal point (or if you prefer, the decimal mark) "n" positions either to the left or to the right, so that only one non zero digit remains to the left of the decimal point.
  • 7. Adjust the exponent in 8 bit excess/bias notation and then convert it from decimal (base 10) to 8 bit binary, by using the same technique of repeatedly dividing by 2, as shown above:
    Exponent (adjusted) = Exponent (unadjusted) + 2^(8-1) - 1
  • 8. Normalize the mantissa: remove the leading (leftmost) bit, since it's always '1' (and the decimal point, if present), and adjust its length to 23 bits, either by removing the excess bits from the right (losing precision...) or by adding extra '0' bits to the right.
  • 9. Sign (it takes 1 bit) is either 1 for a negative or 0 for a positive number.
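The fractional part conversion (steps 4 and 5) can be sketched with exact rational arithmetic, so that the decimal digits are not disturbed by the computer's own floating point. The function name `fraction_to_binary` and the `max_bits` cut-off are assumptions of this sketch, not part of the procedure itself.

```python
from fractions import Fraction

def fraction_to_binary(frac: Fraction, max_bits: int) -> str:
    """Binary digits of a value in [0, 1) by repeated multiplication by 2."""
    digits = []
    # Stop when the fractional part reaches zero or enough bits are collected.
    while frac != 0 and len(digits) < max_bits:
        frac *= 2
        integer_part = int(frac)          # 0 or 1
        digits.append(str(integer_part))
        frac -= integer_part              # keep only the fractional part
    return "".join(digits)

print(fraction_to_binary(Fraction("0.625"), 24))   # 101
print(fraction_to_binary(Fraction("0.347"), 24))   # 010110001101010011111101
```

0.625 terminates after three digits, while 0.347 never reaches a zero fractional part, so the loop stops at the bit limit.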

Example: convert the negative number -25.347 from decimal system (base ten) to 32 bit single precision IEEE 754 binary floating point:

  • 1. Start with the positive version of the number:

    |-25.347| = 25.347

  • 2. First convert the integer part, 25. Divide it repeatedly by 2, keeping track of each remainder, until we get a quotient that is equal to zero:
    • division = quotient + remainder;
    • 25 ÷ 2 = 12 + 1;
    • 12 ÷ 2 = 6 + 0;
    • 6 ÷ 2 = 3 + 0;
    • 3 ÷ 2 = 1 + 1;
    • 1 ÷ 2 = 0 + 1;
    • We have encountered a quotient that is ZERO => FULL STOP
  • 3. Construct the base 2 representation of the integer part of the number by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above:

    25(10) = 1 1001(2)

  • 4. Then convert the fractional part, 0.347. Multiply repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero:
    • #) multiplying = integer + fractional part;
    • 1) 0.347 × 2 = 0 + 0.694;
    • 2) 0.694 × 2 = 1 + 0.388;
    • 3) 0.388 × 2 = 0 + 0.776;
    • 4) 0.776 × 2 = 1 + 0.552;
    • 5) 0.552 × 2 = 1 + 0.104;
    • 6) 0.104 × 2 = 0 + 0.208;
    • 7) 0.208 × 2 = 0 + 0.416;
    • 8) 0.416 × 2 = 0 + 0.832;
    • 9) 0.832 × 2 = 1 + 0.664;
    • 10) 0.664 × 2 = 1 + 0.328;
    • 11) 0.328 × 2 = 0 + 0.656;
    • 12) 0.656 × 2 = 1 + 0.312;
    • 13) 0.312 × 2 = 0 + 0.624;
    • 14) 0.624 × 2 = 1 + 0.248;
    • 15) 0.248 × 2 = 0 + 0.496;
    • 16) 0.496 × 2 = 0 + 0.992;
    • 17) 0.992 × 2 = 1 + 0.984;
    • 18) 0.984 × 2 = 1 + 0.968;
    • 19) 0.968 × 2 = 1 + 0.936;
    • 20) 0.936 × 2 = 1 + 0.872;
    • 21) 0.872 × 2 = 1 + 0.744;
    • 22) 0.744 × 2 = 1 + 0.488;
    • 23) 0.488 × 2 = 0 + 0.976;
    • 24) 0.976 × 2 = 1 + 0.952;
    • We didn't get any fractional part that was equal to zero, but we already have more than enough bits to fill the 23 bit mantissa, and at least one integer part is different from zero => FULL STOP (losing precision...).
  • 5. Construct the base 2 representation of the fractional part of the number, by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above:

    0.347(10) = 0.0101 1000 1101 0100 1111 1101(2)

  • 6. Summarizing - the positive number before normalization:

    25.347(10) = 1 1001.0101 1000 1101 0100 1111 1101(2)

  • 7. Normalize the binary representation of the number, shifting the decimal point 4 positions to the left so that only one non-zero digit stays to the left of the decimal point:

    25.347(10) =
    1 1001.0101 1000 1101 0100 1111 1101(2) =
    1 1001.0101 1000 1101 0100 1111 1101(2) × 2^0 =
    1.1001 0101 1000 1101 0100 1111 1101(2) × 2^4

  • 8. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point:

    Sign: 1 (a negative number)

    Exponent (unadjusted): 4

    Mantissa (not-normalized): 1.1001 0101 1000 1101 0100 1111 1101

  • 9. Adjust the exponent in 8 bit excess/bias notation and then convert it from decimal (base 10) to 8 bit binary (base 2), by using the same technique of repeatedly dividing it by 2, as already demonstrated above:

    Exponent (adjusted) = Exponent (unadjusted) + 2^(8-1) - 1 = (4 + 127)(10) = 131(10) =
    1000 0011(2)

  • 10. Normalize the mantissa: remove the leading (leftmost) bit, since it's always '1' (and the decimal point), and adjust its length to 23 bits by removing the excess bits from the right (losing precision...):

    Mantissa (not-normalized): 1.1001 0101 1000 1101 0100 1111 1101

    Mantissa (normalized): 100 1010 1100 0110 1010 0111

  • Conclusion:

    Sign (1 bit) = 1 (a negative number)

    Exponent (8 bits) = 1000 0011

    Mantissa (23 bits) = 100 1010 1100 0110 1010 0111

  • Number -25.347, converted from the decimal system (base 10) to 32 bit single precision IEEE 754 binary floating point =
    1 - 1000 0011 - 100 1010 1100 0110 1010 0111
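The whole procedure can be put together in a short Python sketch. The function names are made up for this illustration, and the sketch only covers normal numbers (no zero exponent field, infinities or NaN). It follows the steps above literally, meaning it truncates the mantissa, and then prints the result next to what the standard `struct` module returns, assuming the platform uses IEEE 754 doubles for `float`.

```python
import struct
from fractions import Fraction

def to_float32_bits_truncated(text: str) -> str:
    """Sign, biased exponent and truncated 23 bit mantissa as a 32 char string."""
    value = Fraction(text)                 # exact value of the decimal string
    sign = "1" if value < 0 else "0"
    value = abs(value)
    if value == 0:
        return sign + "0" * 31
    exponent = 0
    while value >= 2:                      # shift the point to the left
        value /= 2
        exponent += 1
    while value < 1:                       # shift the point to the right
        value *= 2
        exponent -= 1
    biased = exponent + 127
    if not 1 <= biased <= 254:
        raise ValueError("outside the normal single precision range")
    mantissa = int((value - 1) * 2 ** 23)  # drop the leading 1, truncate
    return sign + format(biased, "08b") + format(mantissa, "023b")

def hardware_bits(x: float) -> str:
    """Bits produced by the platform's own conversion to single precision."""
    return format(struct.unpack(">I", struct.pack(">f", x))[0], "032b")

ours = to_float32_bits_truncated("-25.347")
theirs = hardware_bits(-25.347)
print(ours)     # truncated, as in the worked example
print(theirs)   # rounded to nearest
```

For -25.347 the two strings differ in the last mantissa bit: the first removed bit is 1, so the default IEEE 754 rounding mode (round to nearest) rounds the mantissa up, while the steps above simply cut the excess bits off.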