10 111 110 110 999 999 999 999 999 999 818 Converted to 32 Bit Single Precision IEEE 754 Binary Floating Point Representation Standard

Convert decimal 10 111 110 110 999 999 999 999 999 999 818(10) to 32 bit single precision IEEE 754 binary floating point representation standard (1 bit for sign, 8 bits for exponent, 23 bits for mantissa)

What are the steps to convert the decimal number 10 111 110 110 999 999 999 999 999 999 818(10) to 32 bit single precision IEEE 754 binary floating point representation (1 bit for sign, 8 bits for exponent, 23 bits for mantissa)?

1. Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.


  • division = quotient + remainder;
  • 10 111 110 110 999 999 999 999 999 999 818 ÷ 2 = 5 055 555 055 499 999 999 999 999 999 909 + 0;
  • 5 055 555 055 499 999 999 999 999 999 909 ÷ 2 = 2 527 777 527 749 999 999 999 999 999 954 + 1;
  • 2 527 777 527 749 999 999 999 999 999 954 ÷ 2 = 1 263 888 763 874 999 999 999 999 999 977 + 0;
  • 1 263 888 763 874 999 999 999 999 999 977 ÷ 2 = 631 944 381 937 499 999 999 999 999 988 + 1;
  • 631 944 381 937 499 999 999 999 999 988 ÷ 2 = 315 972 190 968 749 999 999 999 999 994 + 0;
  • 315 972 190 968 749 999 999 999 999 994 ÷ 2 = 157 986 095 484 374 999 999 999 999 997 + 0;
  • 157 986 095 484 374 999 999 999 999 997 ÷ 2 = 78 993 047 742 187 499 999 999 999 998 + 1;
  • 78 993 047 742 187 499 999 999 999 998 ÷ 2 = 39 496 523 871 093 749 999 999 999 999 + 0;
  • 39 496 523 871 093 749 999 999 999 999 ÷ 2 = 19 748 261 935 546 874 999 999 999 999 + 1;
  • 19 748 261 935 546 874 999 999 999 999 ÷ 2 = 9 874 130 967 773 437 499 999 999 999 + 1;
  • 9 874 130 967 773 437 499 999 999 999 ÷ 2 = 4 937 065 483 886 718 749 999 999 999 + 1;
  • 4 937 065 483 886 718 749 999 999 999 ÷ 2 = 2 468 532 741 943 359 374 999 999 999 + 1;
  • 2 468 532 741 943 359 374 999 999 999 ÷ 2 = 1 234 266 370 971 679 687 499 999 999 + 1;
  • 1 234 266 370 971 679 687 499 999 999 ÷ 2 = 617 133 185 485 839 843 749 999 999 + 1;
  • 617 133 185 485 839 843 749 999 999 ÷ 2 = 308 566 592 742 919 921 874 999 999 + 1;
  • 308 566 592 742 919 921 874 999 999 ÷ 2 = 154 283 296 371 459 960 937 499 999 + 1;
  • 154 283 296 371 459 960 937 499 999 ÷ 2 = 77 141 648 185 729 980 468 749 999 + 1;
  • 77 141 648 185 729 980 468 749 999 ÷ 2 = 38 570 824 092 864 990 234 374 999 + 1;
  • 38 570 824 092 864 990 234 374 999 ÷ 2 = 19 285 412 046 432 495 117 187 499 + 1;
  • 19 285 412 046 432 495 117 187 499 ÷ 2 = 9 642 706 023 216 247 558 593 749 + 1;
  • 9 642 706 023 216 247 558 593 749 ÷ 2 = 4 821 353 011 608 123 779 296 874 + 1;
  • 4 821 353 011 608 123 779 296 874 ÷ 2 = 2 410 676 505 804 061 889 648 437 + 0;
  • 2 410 676 505 804 061 889 648 437 ÷ 2 = 1 205 338 252 902 030 944 824 218 + 1;
  • 1 205 338 252 902 030 944 824 218 ÷ 2 = 602 669 126 451 015 472 412 109 + 0;
  • 602 669 126 451 015 472 412 109 ÷ 2 = 301 334 563 225 507 736 206 054 + 1;
  • 301 334 563 225 507 736 206 054 ÷ 2 = 150 667 281 612 753 868 103 027 + 0;
  • 150 667 281 612 753 868 103 027 ÷ 2 = 75 333 640 806 376 934 051 513 + 1;
  • 75 333 640 806 376 934 051 513 ÷ 2 = 37 666 820 403 188 467 025 756 + 1;
  • 37 666 820 403 188 467 025 756 ÷ 2 = 18 833 410 201 594 233 512 878 + 0;
  • 18 833 410 201 594 233 512 878 ÷ 2 = 9 416 705 100 797 116 756 439 + 0;
  • 9 416 705 100 797 116 756 439 ÷ 2 = 4 708 352 550 398 558 378 219 + 1;
  • 4 708 352 550 398 558 378 219 ÷ 2 = 2 354 176 275 199 279 189 109 + 1;
  • 2 354 176 275 199 279 189 109 ÷ 2 = 1 177 088 137 599 639 594 554 + 1;
  • 1 177 088 137 599 639 594 554 ÷ 2 = 588 544 068 799 819 797 277 + 0;
  • 588 544 068 799 819 797 277 ÷ 2 = 294 272 034 399 909 898 638 + 1;
  • 294 272 034 399 909 898 638 ÷ 2 = 147 136 017 199 954 949 319 + 0;
  • 147 136 017 199 954 949 319 ÷ 2 = 73 568 008 599 977 474 659 + 1;
  • 73 568 008 599 977 474 659 ÷ 2 = 36 784 004 299 988 737 329 + 1;
  • 36 784 004 299 988 737 329 ÷ 2 = 18 392 002 149 994 368 664 + 1;
  • 18 392 002 149 994 368 664 ÷ 2 = 9 196 001 074 997 184 332 + 0;
  • 9 196 001 074 997 184 332 ÷ 2 = 4 598 000 537 498 592 166 + 0;
  • 4 598 000 537 498 592 166 ÷ 2 = 2 299 000 268 749 296 083 + 0;
  • 2 299 000 268 749 296 083 ÷ 2 = 1 149 500 134 374 648 041 + 1;
  • 1 149 500 134 374 648 041 ÷ 2 = 574 750 067 187 324 020 + 1;
  • 574 750 067 187 324 020 ÷ 2 = 287 375 033 593 662 010 + 0;
  • 287 375 033 593 662 010 ÷ 2 = 143 687 516 796 831 005 + 0;
  • 143 687 516 796 831 005 ÷ 2 = 71 843 758 398 415 502 + 1;
  • 71 843 758 398 415 502 ÷ 2 = 35 921 879 199 207 751 + 0;
  • 35 921 879 199 207 751 ÷ 2 = 17 960 939 599 603 875 + 1;
  • 17 960 939 599 603 875 ÷ 2 = 8 980 469 799 801 937 + 1;
  • 8 980 469 799 801 937 ÷ 2 = 4 490 234 899 900 968 + 1;
  • 4 490 234 899 900 968 ÷ 2 = 2 245 117 449 950 484 + 0;
  • 2 245 117 449 950 484 ÷ 2 = 1 122 558 724 975 242 + 0;
  • 1 122 558 724 975 242 ÷ 2 = 561 279 362 487 621 + 0;
  • 561 279 362 487 621 ÷ 2 = 280 639 681 243 810 + 1;
  • 280 639 681 243 810 ÷ 2 = 140 319 840 621 905 + 0;
  • 140 319 840 621 905 ÷ 2 = 70 159 920 310 952 + 1;
  • 70 159 920 310 952 ÷ 2 = 35 079 960 155 476 + 0;
  • 35 079 960 155 476 ÷ 2 = 17 539 980 077 738 + 0;
  • 17 539 980 077 738 ÷ 2 = 8 769 990 038 869 + 0;
  • 8 769 990 038 869 ÷ 2 = 4 384 995 019 434 + 1;
  • 4 384 995 019 434 ÷ 2 = 2 192 497 509 717 + 0;
  • 2 192 497 509 717 ÷ 2 = 1 096 248 754 858 + 1;
  • 1 096 248 754 858 ÷ 2 = 548 124 377 429 + 0;
  • 548 124 377 429 ÷ 2 = 274 062 188 714 + 1;
  • 274 062 188 714 ÷ 2 = 137 031 094 357 + 0;
  • 137 031 094 357 ÷ 2 = 68 515 547 178 + 1;
  • 68 515 547 178 ÷ 2 = 34 257 773 589 + 0;
  • 34 257 773 589 ÷ 2 = 17 128 886 794 + 1;
  • 17 128 886 794 ÷ 2 = 8 564 443 397 + 0;
  • 8 564 443 397 ÷ 2 = 4 282 221 698 + 1;
  • 4 282 221 698 ÷ 2 = 2 141 110 849 + 0;
  • 2 141 110 849 ÷ 2 = 1 070 555 424 + 1;
  • 1 070 555 424 ÷ 2 = 535 277 712 + 0;
  • 535 277 712 ÷ 2 = 267 638 856 + 0;
  • 267 638 856 ÷ 2 = 133 819 428 + 0;
  • 133 819 428 ÷ 2 = 66 909 714 + 0;
  • 66 909 714 ÷ 2 = 33 454 857 + 0;
  • 33 454 857 ÷ 2 = 16 727 428 + 1;
  • 16 727 428 ÷ 2 = 8 363 714 + 0;
  • 8 363 714 ÷ 2 = 4 181 857 + 0;
  • 4 181 857 ÷ 2 = 2 090 928 + 1;
  • 2 090 928 ÷ 2 = 1 045 464 + 0;
  • 1 045 464 ÷ 2 = 522 732 + 0;
  • 522 732 ÷ 2 = 261 366 + 0;
  • 261 366 ÷ 2 = 130 683 + 0;
  • 130 683 ÷ 2 = 65 341 + 1;
  • 65 341 ÷ 2 = 32 670 + 1;
  • 32 670 ÷ 2 = 16 335 + 0;
  • 16 335 ÷ 2 = 8 167 + 1;
  • 8 167 ÷ 2 = 4 083 + 1;
  • 4 083 ÷ 2 = 2 041 + 1;
  • 2 041 ÷ 2 = 1 020 + 1;
  • 1 020 ÷ 2 = 510 + 0;
  • 510 ÷ 2 = 255 + 0;
  • 255 ÷ 2 = 127 + 1;
  • 127 ÷ 2 = 63 + 1;
  • 63 ÷ 2 = 31 + 1;
  • 31 ÷ 2 = 15 + 1;
  • 15 ÷ 2 = 7 + 1;
  • 7 ÷ 2 = 3 + 1;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

2. Construct the base 2 representation of the positive number.

Take all the remainders starting from the bottom of the list constructed above.

10 111 110 110 999 999 999 999 999 999 818(10) =


111 1111 1001 1110 1100 0010 0100 0001 0101 0101 0101 0001 0100 0111 0100 1100 0111 0101 1100 1101 0101 1111 1111 1111 0100 1010(2)
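Steps 1 and 2 can be sketched in Python as a rough illustration. The helper name `to_binary_by_division` is hypothetical, introduced only for this example; the built-in `bin` serves as a cross-check and is not part of the procedure described above.

```python
def to_binary_by_division(n: int) -> str:
    """Convert a non-negative integer to base 2 by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)   # quotient and remainder in one step
        remainders.append(str(remainder))
    # The last remainder is the leftmost (most significant) binary digit.
    return "".join(reversed(remainders))


N = 10_111_110_110_999_999_999_999_999_999_818
bits = to_binary_by_division(N)

print(len(bits))               # 103
print(bits == bin(N)[2:])      # True
```

Python integers have arbitrary precision, so the 32-digit input is divided exactly, without any intermediate rounding.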


3. Normalize the binary representation of the number.

Shift the decimal mark 102 positions to the left, so that only one non zero digit remains to the left of it:


10 111 110 110 999 999 999 999 999 999 818(10) =


111 1111 1001 1110 1100 0010 0100 0001 0101 0101 0101 0001 0100 0111 0100 1100 0111 0101 1100 1101 0101 1111 1111 1111 0100 1010(2) =


111 1111 1001 1110 1100 0010 0100 0001 0101 0101 0101 0001 0100 0111 0100 1100 0111 0101 1100 1101 0101 1111 1111 1111 0100 1010(2) × 2^0 =


1.1111 1110 0111 1011 0000 1001 0000 0101 0101 0101 0100 0101 0001 1101 0011 0001 1101 0111 0011 0101 0111 1111 1111 1101 0010 10(2) × 2^102
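For a positive integer, the normalization shift is simply the number of binary digits minus one. A minimal sketch, assuming the binary digits are held as a Python string (the variable names are illustrative only):

```python
N = 10_111_110_110_999_999_999_999_999_999_818

bits = bin(N)[2:]                      # base 2 digits, most significant first
exponent = len(bits) - 1               # positions the point moves to the left
normalized = bits[0] + "." + bits[1:]  # one non-zero digit before the point

print(exponent)          # 102
print(normalized[:10])   # 1.11111110
```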


4. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point representation:

Sign 0 (a positive number)


Exponent (unadjusted): 102


Mantissa (not normalized):
1.1111 1110 0111 1011 0000 1001 0000 0101 0101 0101 0100 0101 0001 1101 0011 0001 1101 0111 0011 0101 0111 1111 1111 1101 0010 10


5. Adjust the exponent.

Use the 8 bit excess/bias notation:


Exponent (adjusted) =


Exponent (unadjusted) + 2^(8-1) - 1 =


102 + 2^(8-1) - 1 =


(102 + 127)(10) =


229(10)


6. Convert the adjusted exponent from the decimal (base 10) to 8 bit binary.

Use the same technique of repeatedly dividing by 2:


  • division = quotient + remainder;
  • 229 ÷ 2 = 114 + 1;
  • 114 ÷ 2 = 57 + 0;
  • 57 ÷ 2 = 28 + 1;
  • 28 ÷ 2 = 14 + 0;
  • 14 ÷ 2 = 7 + 0;
  • 7 ÷ 2 = 3 + 1;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

7. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above.


Exponent (adjusted) =


229(10) =


1110 0101(2)
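Steps 5 to 7 amount to adding the bias 2^(8-1) - 1 = 127 and writing the result in 8 binary digits. The sketch below assumes a normal (neither subnormal nor infinite) number; `biased_exponent_bits` is a hypothetical helper name.

```python
def biased_exponent_bits(exponent: int, width: int = 8) -> str:
    """Return the excess/bias encoding of an exponent as a binary string."""
    bias = 2 ** (width - 1) - 1            # 127 for an 8 bit exponent field
    adjusted = exponent + bias
    # All-zeros and all-ones exponent fields are reserved (zero/subnormals
    # and infinity/NaN), so normal numbers must fall strictly in between.
    if not 0 < adjusted < 2 ** width - 1:
        raise ValueError("exponent outside the normal range")
    return format(adjusted, f"0{width}b")


print(biased_exponent_bits(102))   # 11100101
```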


8. Normalize the mantissa.

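a) Remove the leading (leftmost) bit, since it is always 1, and the decimal point, if there is one.


b) Adjust its length to 23 bits by removing the excess bits from the right. If any of the removed bits is set to 1, precision is lost. Note that this step truncates; the default IEEE 754 rounding mode (round to nearest, ties to even) can give a mantissa that differs in the last bit.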


Mantissa (normalized) =


1. 111 1111 0011 1101 1000 0100 100 0001 0101 0101 0101 0001 0100 0111 0100 1100 0111 0101 1100 1101 0101 1111 1111 1111 0100 1010 =


111 1111 0011 1101 1000 0100
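The truncation in step 8 can be sketched as plain string slicing. The helper `truncate_mantissa` is a hypothetical name; it also reports whether any non-zero bit was dropped, which is the case in which precision is lost.

```python
def truncate_mantissa(bits: str, width: int = 23):
    """Drop the leading 1 and keep the next `width` bits (truncation)."""
    fraction = bits[1:]                         # remove the implicit leading 1
    kept = fraction[:width].ljust(width, "0")   # pad with zeros if too short
    lost_precision = "1" in fraction[width:]    # any non-zero bit removed?
    return kept, lost_precision


N = 10_111_110_110_999_999_999_999_999_999_818
mantissa, lost = truncate_mantissa(bin(N)[2:])

print(mantissa)   # 11111110011110110000100
print(lost)       # True
```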


9. The three elements that make up the number's 32 bit single precision IEEE 754 binary floating point representation:

Sign (1 bit) =
0 (a positive number)


Exponent (8 bits) =
1110 0101


Mantissa (23 bits) =
111 1111 0011 1101 1000 0100


Decimal number 10 111 110 110 999 999 999 999 999 999 818 converted to 32 bit single precision IEEE 754 binary floating point representation:

0 - 1110 0101 - 111 1111 0011 1101 1000 0100
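As a cross-check, the three fields can be assembled into one 32 bit pattern and compared with what Python's standard `struct` module produces. This is only a sketch: `struct` rounds to nearest when packing a float, whereas the steps above truncate, so the two patterns are expected to differ in the last bit for this particular number.

```python
import struct

N = 10_111_110_110_999_999_999_999_999_999_818
bits = bin(N)[2:]

sign = "0"                                    # positive number
exponent = format(len(bits) - 1 + 127, "08b") # biased exponent, 8 bits
mantissa = bits[1:24]                         # 23 bits after the leading 1

truncated = int(sign + exponent + mantissa, 2)
# Pack as a big-endian float32, then reinterpret the 4 bytes as an integer.
rounded = struct.unpack(">I", struct.pack(">f", float(N)))[0]

print(f"{truncated:#010x}")   # 0x72ff3d84
print(f"{rounded:#010x}")     # 0x72ff3d85
```

The first discarded bit is 1 and further non-zero bits follow it, so rounding to nearest moves the mantissa up by one unit in the last place.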


How to convert decimal numbers from base ten to 32 bit single precision IEEE 754 binary floating point standard

Follow the steps below to convert a base 10 decimal number to 32 bit single precision IEEE 754 binary floating point:

  • 1. If the number to be converted is negative, start with its positive version.
  • 2. First convert the integer part. Repeatedly divide the positive integer part by 2, keeping track of each remainder, until we get a quotient that is equal to zero.
  • 3. Construct the base 2 representation of the positive integer part of the number, by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above. Thus, the last remainder of the divisions becomes the first symbol (the leftmost) of the base two number, while the first remainder becomes the last symbol (the rightmost).
  • 4. Then convert the fractional part. Multiply the fractional part repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero or until enough bits have been produced to fill the mantissa.
  • 5. Construct the base 2 representation of the fractional part of the number by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above (they should appear in the binary representation, from left to right, in the order they have been calculated).
  • 6. Normalize the binary representation of the number, by shifting the decimal point (or if you prefer, the decimal mark) "n" positions either to the left or to the right, so that only one non zero digit remains to the left of the decimal point.
  • 7. Adjust the exponent in 8 bit excess/bias notation and then convert it from decimal (base 10) to 8 bit binary, by using the same technique of repeatedly dividing by 2, as shown above:
    Exponent (adjusted) = Exponent (unadjusted) + 2^(8-1) - 1
  • 8. Normalize the mantissa: remove the leading (leftmost) bit, since it's always '1' (and the decimal point, if there is one), and adjust its length to 23 bits, either by removing the excess bits from the right (losing precision...) or by adding extra '0' bits to the right.
  • 9. Sign (it takes 1 bit) is either 1 for a negative or 0 for a positive number.
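Steps 4 and 5 (the fractional part) can be sketched as follows. The example uses `fractions.Fraction` so that the repeated doubling is exact; the function name `fraction_to_bits` and the 24 bit cut-off are assumptions made for this illustration.

```python
from fractions import Fraction


def fraction_to_bits(frac: Fraction, max_bits: int = 24) -> str:
    """Convert a fraction in [0, 1) to binary digits by repeated doubling."""
    digits = []
    while frac != 0 and len(digits) < max_bits:
        frac *= 2
        integer_part = int(frac)      # 0 or 1: the next binary digit
        digits.append(str(integer_part))
        frac -= integer_part          # keep only the fractional part
    return "".join(digits)


print(fraction_to_bits(Fraction("0.375")))   # 011
print(fraction_to_bits(Fraction("0.347")))   # 010110001101010011111101
```

The loop stops early when the fractional part reaches zero (as for 0.375) and otherwise gives up after `max_bits` digits, since most decimal fractions have no finite binary expansion.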

Example: convert the negative number -25.347 from decimal system (base ten) to 32 bit single precision IEEE 754 binary floating point:

  • 1. Start with the positive version of the number:

    |-25.347| = 25.347

  • 2. First convert the integer part, 25. Divide it repeatedly by 2, keeping track of each remainder, until we get a quotient that is equal to zero:
    • division = quotient + remainder;
    • 25 ÷ 2 = 12 + 1;
    • 12 ÷ 2 = 6 + 0;
    • 6 ÷ 2 = 3 + 0;
    • 3 ÷ 2 = 1 + 1;
    • 1 ÷ 2 = 0 + 1;
    • We have encountered a quotient that is ZERO => FULL STOP
  • 3. Construct the base 2 representation of the integer part of the number by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above:

    25(10) = 1 1001(2)

  • 4. Then convert the fractional part, 0.347. Multiply repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero:
    • #) multiplying = integer + fractional part;
    • 1) 0.347 × 2 = 0 + 0.694;
    • 2) 0.694 × 2 = 1 + 0.388;
    • 3) 0.388 × 2 = 0 + 0.776;
    • 4) 0.776 × 2 = 1 + 0.552;
    • 5) 0.552 × 2 = 1 + 0.104;
    • 6) 0.104 × 2 = 0 + 0.208;
    • 7) 0.208 × 2 = 0 + 0.416;
    • 8) 0.416 × 2 = 0 + 0.832;
    • 9) 0.832 × 2 = 1 + 0.664;
    • 10) 0.664 × 2 = 1 + 0.328;
    • 11) 0.328 × 2 = 0 + 0.656;
    • 12) 0.656 × 2 = 1 + 0.312;
    • 13) 0.312 × 2 = 0 + 0.624;
    • 14) 0.624 × 2 = 1 + 0.248;
    • 15) 0.248 × 2 = 0 + 0.496;
    • 16) 0.496 × 2 = 0 + 0.992;
    • 17) 0.992 × 2 = 1 + 0.984;
    • 18) 0.984 × 2 = 1 + 0.968;
    • 19) 0.968 × 2 = 1 + 0.936;
    • 20) 0.936 × 2 = 1 + 0.872;
    • 21) 0.872 × 2 = 1 + 0.744;
    • 22) 0.744 × 2 = 1 + 0.488;
    • 23) 0.488 × 2 = 0 + 0.976;
    • 24) 0.976 × 2 = 1 + 0.952;
    • We didn't get any fractional part that was equal to zero. But we had enough iterations (over Mantissa limit = 23) and at least one integer part that was different from zero => FULL STOP (losing precision...).
  • 5. Construct the base 2 representation of the fractional part of the number, by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above:

    0.347(10) = 0.0101 1000 1101 0100 1111 1101(2)

  • 6. Summarizing - the positive number before normalization:

    25.347(10) = 1 1001.0101 1000 1101 0100 1111 1101(2)

  • 7. Normalize the binary representation of the number, shifting the decimal point 4 positions to the left so that only one non-zero digit stays to the left of the decimal point:

    25.347(10) =
    1 1001.0101 1000 1101 0100 1111 1101(2) =
    1 1001.0101 1000 1101 0100 1111 1101(2) × 2^0 =
    1.1001 0101 1000 1101 0100 1111 1101(2) × 2^4

  • 8. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point:

    Sign: 1 (a negative number)

    Exponent (unadjusted): 4

    Mantissa (not-normalized): 1.1001 0101 1000 1101 0100 1111 1101

  • 9. Adjust the exponent in 8 bit excess/bias notation and then convert it from decimal (base 10) to 8 bit binary (base 2), by using the same technique of repeatedly dividing it by 2, as already demonstrated above:

    Exponent (adjusted) = Exponent (unadjusted) + 2^(8-1) - 1 = (4 + 127)(10) = 131(10) =
    1000 0011(2)

  • 10. Normalize the mantissa: remove the leading (leftmost) bit, since it's always '1' (and the decimal point), and adjust its length to 23 bits by removing the excess bits from the right (losing precision...):

    Mantissa (not-normalized): 1.1001 0101 1000 1101 0100 1111 1101

    Mantissa (normalized): 100 1010 1100 0110 1010 0111

  • Conclusion:

    Sign (1 bit) = 1 (a negative number)

    Exponent (8 bits) = 1000 0011

    Mantissa (23 bits) = 100 1010 1100 0110 1010 0111

  • Number -25.347, converted from the decimal system (base 10) to 32 bit single precision IEEE 754 binary floating point =
    1 - 1000 0011 - 100 1010 1100 0110 1010 0111
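The whole procedure can be condensed into one short function. This is a minimal sketch under stated assumptions: it handles only normal, non-zero numbers, it truncates the mantissa exactly as the steps above do (rather than rounding to nearest), and the name `float32_fields_truncated` is hypothetical.

```python
from fractions import Fraction


def float32_fields_truncated(text: str):
    """Return (sign, biased exponent, 23 bit mantissa) using truncation."""
    value = Fraction(text)            # exact value of the decimal string
    if value == 0:
        raise ValueError("zero is not handled in this sketch")
    sign = 1 if value < 0 else 0
    value = abs(value)

    # Normalize: scale by powers of 2 until 1 <= value < 2.
    exponent = 0
    while value >= 2:
        value /= 2
        exponent += 1
    while value < 1:
        value *= 2
        exponent -= 1
    if not -126 <= exponent <= 127:
        raise ValueError("outside the normal float32 range")

    # Drop the leading 1 and keep 23 bits; int() discards the rest.
    mantissa = int((value - 1) * 2 ** 23)
    return sign, exponent + 127, mantissa


s, e, m = float32_fields_truncated("-25.347")
print(f"{s} - {e:08b} - {m:023b}")
# 1 - 10000011 - 10010101100011010100111
```

Because the mantissa is truncated, the result can differ in the last bit from what hardware or a library produces with round to nearest; for -25.347 the rounded mantissa ends in ...1000 instead of ...0111.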