1 010 010 000 000 000 000 000 100 001 005 Converted to 32 Bit Single Precision IEEE 754 Binary Floating Point Representation Standard

Convert decimal 1 010 010 000 000 000 000 000 100 001 005(10) to 32 bit single precision IEEE 754 binary floating point representation standard (1 bit for sign, 8 bits for exponent, 23 bits for mantissa)

What are the steps to convert the decimal number 1 010 010 000 000 000 000 000 100 001 005(10) to 32 bit single precision IEEE 754 binary floating point representation (1 bit for sign, 8 bits for exponent, 23 bits for mantissa)?

1. Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.


  • division = quotient + remainder;
  • 1 010 010 000 000 000 000 000 100 001 005 ÷ 2 = 505 005 000 000 000 000 000 050 000 502 + 1;
  • 505 005 000 000 000 000 000 050 000 502 ÷ 2 = 252 502 500 000 000 000 000 025 000 251 + 0;
  • 252 502 500 000 000 000 000 025 000 251 ÷ 2 = 126 251 250 000 000 000 000 012 500 125 + 1;
  • 126 251 250 000 000 000 000 012 500 125 ÷ 2 = 63 125 625 000 000 000 000 006 250 062 + 1;
  • 63 125 625 000 000 000 000 006 250 062 ÷ 2 = 31 562 812 500 000 000 000 003 125 031 + 0;
  • 31 562 812 500 000 000 000 003 125 031 ÷ 2 = 15 781 406 250 000 000 000 001 562 515 + 1;
  • 15 781 406 250 000 000 000 001 562 515 ÷ 2 = 7 890 703 125 000 000 000 000 781 257 + 1;
  • 7 890 703 125 000 000 000 000 781 257 ÷ 2 = 3 945 351 562 500 000 000 000 390 628 + 1;
  • 3 945 351 562 500 000 000 000 390 628 ÷ 2 = 1 972 675 781 250 000 000 000 195 314 + 0;
  • 1 972 675 781 250 000 000 000 195 314 ÷ 2 = 986 337 890 625 000 000 000 097 657 + 0;
  • 986 337 890 625 000 000 000 097 657 ÷ 2 = 493 168 945 312 500 000 000 048 828 + 1;
  • 493 168 945 312 500 000 000 048 828 ÷ 2 = 246 584 472 656 250 000 000 024 414 + 0;
  • 246 584 472 656 250 000 000 024 414 ÷ 2 = 123 292 236 328 125 000 000 012 207 + 0;
  • 123 292 236 328 125 000 000 012 207 ÷ 2 = 61 646 118 164 062 500 000 006 103 + 1;
  • 61 646 118 164 062 500 000 006 103 ÷ 2 = 30 823 059 082 031 250 000 003 051 + 1;
  • 30 823 059 082 031 250 000 003 051 ÷ 2 = 15 411 529 541 015 625 000 001 525 + 1;
  • 15 411 529 541 015 625 000 001 525 ÷ 2 = 7 705 764 770 507 812 500 000 762 + 1;
  • 7 705 764 770 507 812 500 000 762 ÷ 2 = 3 852 882 385 253 906 250 000 381 + 0;
  • 3 852 882 385 253 906 250 000 381 ÷ 2 = 1 926 441 192 626 953 125 000 190 + 1;
  • 1 926 441 192 626 953 125 000 190 ÷ 2 = 963 220 596 313 476 562 500 095 + 0;
  • 963 220 596 313 476 562 500 095 ÷ 2 = 481 610 298 156 738 281 250 047 + 1;
  • 481 610 298 156 738 281 250 047 ÷ 2 = 240 805 149 078 369 140 625 023 + 1;
  • 240 805 149 078 369 140 625 023 ÷ 2 = 120 402 574 539 184 570 312 511 + 1;
  • 120 402 574 539 184 570 312 511 ÷ 2 = 60 201 287 269 592 285 156 255 + 1;
  • 60 201 287 269 592 285 156 255 ÷ 2 = 30 100 643 634 796 142 578 127 + 1;
  • 30 100 643 634 796 142 578 127 ÷ 2 = 15 050 321 817 398 071 289 063 + 1;
  • 15 050 321 817 398 071 289 063 ÷ 2 = 7 525 160 908 699 035 644 531 + 1;
  • 7 525 160 908 699 035 644 531 ÷ 2 = 3 762 580 454 349 517 822 265 + 1;
  • 3 762 580 454 349 517 822 265 ÷ 2 = 1 881 290 227 174 758 911 132 + 1;
  • 1 881 290 227 174 758 911 132 ÷ 2 = 940 645 113 587 379 455 566 + 0;
  • 940 645 113 587 379 455 566 ÷ 2 = 470 322 556 793 689 727 783 + 0;
  • 470 322 556 793 689 727 783 ÷ 2 = 235 161 278 396 844 863 891 + 1;
  • 235 161 278 396 844 863 891 ÷ 2 = 117 580 639 198 422 431 945 + 1;
  • 117 580 639 198 422 431 945 ÷ 2 = 58 790 319 599 211 215 972 + 1;
  • 58 790 319 599 211 215 972 ÷ 2 = 29 395 159 799 605 607 986 + 0;
  • 29 395 159 799 605 607 986 ÷ 2 = 14 697 579 899 802 803 993 + 0;
  • 14 697 579 899 802 803 993 ÷ 2 = 7 348 789 949 901 401 996 + 1;
  • 7 348 789 949 901 401 996 ÷ 2 = 3 674 394 974 950 700 998 + 0;
  • 3 674 394 974 950 700 998 ÷ 2 = 1 837 197 487 475 350 499 + 0;
  • 1 837 197 487 475 350 499 ÷ 2 = 918 598 743 737 675 249 + 1;
  • 918 598 743 737 675 249 ÷ 2 = 459 299 371 868 837 624 + 1;
  • 459 299 371 868 837 624 ÷ 2 = 229 649 685 934 418 812 + 0;
  • 229 649 685 934 418 812 ÷ 2 = 114 824 842 967 209 406 + 0;
  • 114 824 842 967 209 406 ÷ 2 = 57 412 421 483 604 703 + 0;
  • 57 412 421 483 604 703 ÷ 2 = 28 706 210 741 802 351 + 1;
  • 28 706 210 741 802 351 ÷ 2 = 14 353 105 370 901 175 + 1;
  • 14 353 105 370 901 175 ÷ 2 = 7 176 552 685 450 587 + 1;
  • 7 176 552 685 450 587 ÷ 2 = 3 588 276 342 725 293 + 1;
  • 3 588 276 342 725 293 ÷ 2 = 1 794 138 171 362 646 + 1;
  • 1 794 138 171 362 646 ÷ 2 = 897 069 085 681 323 + 0;
  • 897 069 085 681 323 ÷ 2 = 448 534 542 840 661 + 1;
  • 448 534 542 840 661 ÷ 2 = 224 267 271 420 330 + 1;
  • 224 267 271 420 330 ÷ 2 = 112 133 635 710 165 + 0;
  • 112 133 635 710 165 ÷ 2 = 56 066 817 855 082 + 1;
  • 56 066 817 855 082 ÷ 2 = 28 033 408 927 541 + 0;
  • 28 033 408 927 541 ÷ 2 = 14 016 704 463 770 + 1;
  • 14 016 704 463 770 ÷ 2 = 7 008 352 231 885 + 0;
  • 7 008 352 231 885 ÷ 2 = 3 504 176 115 942 + 1;
  • 3 504 176 115 942 ÷ 2 = 1 752 088 057 971 + 0;
  • 1 752 088 057 971 ÷ 2 = 876 044 028 985 + 1;
  • 876 044 028 985 ÷ 2 = 438 022 014 492 + 1;
  • 438 022 014 492 ÷ 2 = 219 011 007 246 + 0;
  • 219 011 007 246 ÷ 2 = 109 505 503 623 + 0;
  • 109 505 503 623 ÷ 2 = 54 752 751 811 + 1;
  • 54 752 751 811 ÷ 2 = 27 376 375 905 + 1;
  • 27 376 375 905 ÷ 2 = 13 688 187 952 + 1;
  • 13 688 187 952 ÷ 2 = 6 844 093 976 + 0;
  • 6 844 093 976 ÷ 2 = 3 422 046 988 + 0;
  • 3 422 046 988 ÷ 2 = 1 711 023 494 + 0;
  • 1 711 023 494 ÷ 2 = 855 511 747 + 0;
  • 855 511 747 ÷ 2 = 427 755 873 + 1;
  • 427 755 873 ÷ 2 = 213 877 936 + 1;
  • 213 877 936 ÷ 2 = 106 938 968 + 0;
  • 106 938 968 ÷ 2 = 53 469 484 + 0;
  • 53 469 484 ÷ 2 = 26 734 742 + 0;
  • 26 734 742 ÷ 2 = 13 367 371 + 0;
  • 13 367 371 ÷ 2 = 6 683 685 + 1;
  • 6 683 685 ÷ 2 = 3 341 842 + 1;
  • 3 341 842 ÷ 2 = 1 670 921 + 0;
  • 1 670 921 ÷ 2 = 835 460 + 1;
  • 835 460 ÷ 2 = 417 730 + 0;
  • 417 730 ÷ 2 = 208 865 + 0;
  • 208 865 ÷ 2 = 104 432 + 1;
  • 104 432 ÷ 2 = 52 216 + 0;
  • 52 216 ÷ 2 = 26 108 + 0;
  • 26 108 ÷ 2 = 13 054 + 0;
  • 13 054 ÷ 2 = 6 527 + 0;
  • 6 527 ÷ 2 = 3 263 + 1;
  • 3 263 ÷ 2 = 1 631 + 1;
  • 1 631 ÷ 2 = 815 + 1;
  • 815 ÷ 2 = 407 + 1;
  • 407 ÷ 2 = 203 + 1;
  • 203 ÷ 2 = 101 + 1;
  • 101 ÷ 2 = 50 + 1;
  • 50 ÷ 2 = 25 + 0;
  • 25 ÷ 2 = 12 + 1;
  • 12 ÷ 2 = 6 + 0;
  • 6 ÷ 2 = 3 + 0;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

2. Construct the base 2 representation of the positive number.

Take all the remainders starting from the bottom of the list constructed above.

1 010 010 000 000 000 000 000 100 001 005(10) =


1100 1011 1111 1000 0100 1011 0000 1100 0011 1001 1010 1010 1101 1111 0001 1001 0011 1001 1111 1111 0101 1110 0100 1110 1101(2)
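Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the repeated-division technique; the function name to_binary is made up for this example and is not part of any library.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # one division step: quotient and remainder
        remainders.append(str(r))
    # The last remainder is the most significant bit, so read the list backwards.
    return "".join(reversed(remainders))


bits = to_binary(1010010000000000000000100001005)
print(len(bits))    # 100
print(bits[:24])    # 110010111111100001001011
```

The 100 binary digits correspond to the 100 division lines listed in step 1.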


3. Normalize the binary representation of the number.

Shift the binary point 99 positions to the left, so that only one non-zero digit remains to the left of it:


1 010 010 000 000 000 000 000 100 001 005(10) =


1100 1011 1111 1000 0100 1011 0000 1100 0011 1001 1010 1010 1101 1111 0001 1001 0011 1001 1111 1111 0101 1110 0100 1110 1101(2) =


1100 1011 1111 1000 0100 1011 0000 1100 0011 1001 1010 1010 1101 1111 0001 1001 0011 1001 1111 1111 0101 1110 0100 1110 1101(2) × 2^0 =


1.1001 0111 1111 0000 1001 0110 0001 1000 0111 0011 0101 0101 1011 1110 0011 0010 0111 0011 1111 1110 1011 1100 1001 1101 101(2) × 2^99
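For a positive integer, the number of positions the binary point moves is one less than the number of binary digits. A minimal sketch, assuming Python's built-in int.bit_length() (the variable names are illustrative):

```python
n = 1010010000000000000000100001005

# An integer with k binary digits is normalized by moving the point k - 1 places left.
exponent = n.bit_length() - 1
bits = bin(n)[2:]                      # binary digits without the "0b" prefix
normalized = bits[0] + "." + bits[1:]  # one non-zero digit left of the point

print(exponent)          # 99
print(normalized[:26])   # 1.100101111111000010010110
```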


4. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point representation:

Sign 0 (a positive number)


Exponent (unadjusted): 99


Mantissa (not normalized):
1.1001 0111 1111 0000 1001 0110 0001 1000 0111 0011 0101 0101 1011 1110 0011 0010 0111 0011 1111 1110 1011 1100 1001 1101 101


5. Adjust the exponent.

Use the 8 bit excess/bias notation:


Exponent (adjusted) =


Exponent (unadjusted) + 2^(8-1) - 1 =


99 + 2^(8-1) - 1 =


(99 + 127)(10) =


226(10)


6. Convert the adjusted exponent from the decimal (base 10) to 8 bit binary.

Use the same technique of repeatedly dividing by 2:


  • division = quotient + remainder;
  • 226 ÷ 2 = 113 + 0;
  • 113 ÷ 2 = 56 + 1;
  • 56 ÷ 2 = 28 + 0;
  • 28 ÷ 2 = 14 + 0;
  • 14 ÷ 2 = 7 + 0;
  • 7 ÷ 2 = 3 + 1;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

7. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above.


Exponent (adjusted) =


226(10) =


1110 0010(2)
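Steps 5 to 7 amount to adding the bias and printing the result as 8 binary digits. A small sketch (the constant and variable names are chosen for this example only):

```python
EXPONENT_BITS = 8
bias = 2 ** (EXPONENT_BITS - 1) - 1     # 2^(8-1) - 1 = 127

unadjusted = 99
adjusted = unadjusted + bias            # 226

# format(..., "08b") writes the value in base 2, zero-padded to 8 digits.
exponent_field = format(adjusted, "08b")
print(adjusted, exponent_field)         # 226 11100010
```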


8. Normalize the mantissa.

a) Remove the leading (leftmost) bit, since it is always 1, along with the binary point.


b) Adjust its length to 23 bits by removing the excess bits from the right (if any of the excess bits is set to 1, precision is lost). Here the first discarded bit is 0, so truncating gives the same 23 bits as rounding to nearest would.


Mantissa (normalized) =


1. 100 1011 1111 1000 0100 1011 0000 1100 0011 1001 1010 1010 1101 1111 0001 1001 0011 1001 1111 1111 0101 1110 0100 1110 1101 =


100 1011 1111 1000 0100 1011
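The trimming in step 8 can be sketched with plain string slicing. This follows the truncation described above; it is an illustration, not a full IEEE 754 rounding implementation, and the variable names are assumptions of this example.

```python
n = 1010010000000000000000100001005
bits = bin(n)[2:]

fraction_bits = bits[1:]                       # drop the implicit leading 1
mantissa = fraction_bits[:23].ljust(23, "0")   # keep 23 bits, pad with zeros if shorter
discarded = fraction_bits[23:]                 # bits lost by truncation

print(mantissa)        # 10010111111100001001011
print(discarded[0])    # 0
```

Because the first discarded bit is 0, rounding to nearest would keep the same mantissa for this number.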


9. The three elements that make up the number's 32 bit single precision IEEE 754 binary floating point representation:

Sign (1 bit) =
0 (a positive number)


Exponent (8 bits) =
1110 0010


Mantissa (23 bits) =
100 1011 1111 1000 0100 1011


Decimal number 1 010 010 000 000 000 000 000 100 001 005 converted to 32 bit single precision IEEE 754 binary floating point representation:

0 - 1110 0010 - 100 1011 1111 1000 0100 1011
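The result can be cross-checked against the platform's own single precision encoding. The sketch below assumes Python's standard struct module, whose "f" format packs a value as a 4-byte IEEE 754 single on common platforms:

```python
import struct

value = float(1010010000000000000000100001005)

# Pack as a big-endian 32 bit float, then reread the same bytes as an unsigned integer.
(raw,) = struct.unpack(">I", struct.pack(">f", value))
word = format(raw, "032b")

sign, exponent, mantissa = word[0], word[1:9], word[9:]
print(sign, exponent, mantissa)   # 0 11100010 10010111111100001001011
print(hex(raw))                   # 0x714bf84b
```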


How to convert decimal numbers from base ten to 32 bit single precision IEEE 754 binary floating point standard

Follow the steps below to convert a base 10 decimal number to 32 bit single precision IEEE 754 binary floating point:

  • 1. If the number to be converted is negative, start with its positive version.
  • 2. First convert the integer part. Divide repeatedly by 2 the base ten positive representation of the integer number that is to be converted to binary, until we get a quotient that is equal to zero, keeping track of each remainder.
  • 3. Construct the base 2 representation of the positive integer part of the number, by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above. Thus, the last remainder of the divisions becomes the first symbol (the leftmost) of the base two number, while the first remainder becomes the last symbol (the rightmost).
  • 4. Then convert the fractional part. Multiply the number repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero or until we have produced more bits than the mantissa can hold.
  • 5. Construct the base 2 representation of the fractional part of the number by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above (they should appear in the binary representation, from left to right, in the order they have been calculated).
  • 6. Normalize the binary representation of the number by shifting the binary point "n" positions either to the left or to the right, so that only one non-zero digit remains to the left of it.
  • 7. Adjust the exponent in 8 bit excess/bias notation and then convert it from decimal (base 10) to 8 bit binary, using the same technique of repeatedly dividing by 2, as shown above:
    Exponent (adjusted) = Exponent (unadjusted) + 2^(8-1) - 1
  • 8. Normalize the mantissa: remove the leading (leftmost) bit, since it is always '1' (and the binary point, if present), and adjust its length to 23 bits, either by removing the excess bits from the right (losing precision) or by adding extra '0' bits to the right.
  • 9. Sign (it takes 1 bit) is either 1 for a negative or 0 for a positive number.
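Steps 4 and 5, the conversion of the fractional part, can be sketched as follows. The sketch assumes Python's fractions.Fraction so that the decimal input is handled exactly; the function name fraction_bits and the max_bits cut-off are choices made for this illustration.

```python
from fractions import Fraction


def fraction_bits(frac: Fraction, max_bits: int) -> str:
    """Binary digits of a fraction in [0, 1), by repeated multiplication by 2."""
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        integer_part = int(frac)     # 0 or 1: the next binary digit
        bits.append(str(integer_part))
        frac -= integer_part         # keep only the fractional part
    return "".join(bits)


print(fraction_bits(Fraction("0.625"), 24))   # 101
print(fraction_bits(Fraction("0.1"), 8))      # 00011001
```

The first call stops because the fractional part reaches zero; the second stops at the bit limit, since 0.1 has no finite binary expansion.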

Example: convert the negative number -25.347 from decimal system (base ten) to 32 bit single precision IEEE 754 binary floating point:

  • 1. Start with the positive version of the number:

    |-25.347| = 25.347

  • 2. First convert the integer part, 25. Divide it repeatedly by 2, keeping track of each remainder, until we get a quotient that is equal to zero:
    • division = quotient + remainder;
    • 25 ÷ 2 = 12 + 1;
    • 12 ÷ 2 = 6 + 0;
    • 6 ÷ 2 = 3 + 0;
    • 3 ÷ 2 = 1 + 1;
    • 1 ÷ 2 = 0 + 1;
    • We have encountered a quotient that is ZERO => FULL STOP
  • 3. Construct the base 2 representation of the integer part of the number by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above:

    25(10) = 1 1001(2)

  • 4. Then convert the fractional part, 0.347. Multiply repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero:
    • #) multiplying = integer + fractional part;
    • 1) 0.347 × 2 = 0 + 0.694;
    • 2) 0.694 × 2 = 1 + 0.388;
    • 3) 0.388 × 2 = 0 + 0.776;
    • 4) 0.776 × 2 = 1 + 0.552;
    • 5) 0.552 × 2 = 1 + 0.104;
    • 6) 0.104 × 2 = 0 + 0.208;
    • 7) 0.208 × 2 = 0 + 0.416;
    • 8) 0.416 × 2 = 0 + 0.832;
    • 9) 0.832 × 2 = 1 + 0.664;
    • 10) 0.664 × 2 = 1 + 0.328;
    • 11) 0.328 × 2 = 0 + 0.656;
    • 12) 0.656 × 2 = 1 + 0.312;
    • 13) 0.312 × 2 = 0 + 0.624;
    • 14) 0.624 × 2 = 1 + 0.248;
    • 15) 0.248 × 2 = 0 + 0.496;
    • 16) 0.496 × 2 = 0 + 0.992;
    • 17) 0.992 × 2 = 1 + 0.984;
    • 18) 0.984 × 2 = 1 + 0.968;
    • 19) 0.968 × 2 = 1 + 0.936;
    • 20) 0.936 × 2 = 1 + 0.872;
    • 21) 0.872 × 2 = 1 + 0.744;
    • 22) 0.744 × 2 = 1 + 0.488;
    • 23) 0.488 × 2 = 0 + 0.976;
    • 24) 0.976 × 2 = 1 + 0.952;
    • We didn't get any fractional part that was equal to zero. But we had enough iterations (over Mantissa limit = 23) and at least one integer part that was different from zero => FULL STOP (losing precision...).
  • 5. Construct the base 2 representation of the fractional part of the number, by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above:

    0.347(10) = 0.0101 1000 1101 0100 1111 1101(2)

  • 6. Summarizing - the positive number before normalization:

    25.347(10) = 1 1001.0101 1000 1101 0100 1111 1101(2)

  • 7. Normalize the binary representation of the number, shifting the binary point 4 positions to the left so that only one non-zero digit stays to the left of it:

    25.347(10) =
    1 1001.0101 1000 1101 0100 1111 1101(2) =
    1 1001.0101 1000 1101 0100 1111 1101(2) × 2^0 =
    1.1001 0101 1000 1101 0100 1111 1101(2) × 2^4

  • 8. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point:

    Sign: 1 (a negative number)

    Exponent (unadjusted): 4

    Mantissa (not-normalized): 1.1001 0101 1000 1101 0100 1111 1101

  • 9. Adjust the exponent in 8 bit excess/bias notation and then convert it from decimal (base 10) to 8 bit binary (base 2), by using the same technique of repeatedly dividing it by 2, as already demonstrated above:

    Exponent (adjusted) = Exponent (unadjusted) + 2^(8-1) - 1 = (4 + 127)(10) = 131(10) =
    1000 0011(2)

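  • 10. Normalize the mantissa: remove the leading (leftmost) bit, since it is always '1' (and the binary point), and adjust its length to 23 bits by removing the excess bits from the right (losing precision). Note that this truncates: the discarded bits (1 1101) begin with 1, so IEEE 754's default round-to-nearest mode would instead yield 100 1010 1100 0110 1010 1000.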

    Mantissa (not-normalized): 1.1001 0101 1000 1101 0100 1111 1101

    Mantissa (normalized): 100 1010 1100 0110 1010 0111

  • Conclusion:

    Sign (1 bit) = 1 (a negative number)

    Exponent (8 bits) = 1000 0011

    Mantissa (23 bits) = 100 1010 1100 0110 1010 0111

  • Number -25.347, converted from the decimal system (base 10) to 32 bit single precision IEEE 754 binary floating point =
    1 - 1000 0011 - 100 1010 1100 0110 1010 0111
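The whole procedure can be sketched in Python and compared with the encoding produced by the standard struct module. Both function names are made up for this illustration, and the truncating version handles only non-zero values in the normal exponent range. The two results differ in the last mantissa bits for -25.347, because the procedure above truncates while struct rounds to the nearest representable value.

```python
import struct
from fractions import Fraction


def encode_truncating(text: str) -> str:
    """Follow the steps above: normalize, bias the exponent, truncate the mantissa."""
    value = Fraction(text)
    sign = "1" if value < 0 else "0"
    value = abs(value)               # assumed non-zero and within the normal range
    exponent = 0
    while value >= 2:                # shift the binary point to the left
        value /= 2
        exponent += 1
    while value < 1:                 # shift the binary point to the right
        value *= 2
        exponent -= 1
    exponent_field = format(exponent + 127, "08b")
    # Drop the leading 1 and keep 23 fraction bits; int() discards the rest.
    mantissa_field = format(int((value - 1) * 2**23), "023b")
    return f"{sign} {exponent_field} {mantissa_field}"


def encode_rounding(text: str) -> str:
    """Let struct produce the 32 bit encoding (round to nearest)."""
    (raw,) = struct.unpack(">I", struct.pack(">f", float(text)))
    word = format(raw, "032b")
    return f"{word[0]} {word[1:9]} {word[9:]}"


print(encode_truncating("-25.347"))   # 1 10000011 10010101100011010100111
print(encode_rounding("-25.347"))     # 1 10000011 10010101100011010101000
```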