64 bit IEEE 754: Decimal to Double Precision Floating Point Binary: 0.000 000 000 000 000 000 034 4

Convert the Number to 64 Bit Double Precision IEEE 754 Binary Floating Point Representation Standard, From a Base Ten Decimal System Number

Number 0.000 000 000 000 000 000 034 4(10) converted and written in 64 bit double precision IEEE 754 binary floating point representation (1 bit for sign, 11 bits for exponent, 52 bits for mantissa)

1. First, convert to binary (in base 2) the integer part: 0.
Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.


  • division = quotient + remainder;
  • 0 ÷ 2 = 0 + 0;

2. Construct the base 2 representation of the integer part of the number.

Take all the remainders starting from the bottom of the list constructed above.


0(10) =


0(2)
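
The repeated division in steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration rather than the converter's own code; the function name `int_to_binary` is an assumption introduced here.

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative integer to base 2 by repeated division by 2."""
    if n == 0:
        return "0"  # 0 / 2 = 0 remainder 0, so the only digit is 0
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient and remainder of n / 2
        remainders.append(str(remainder))
    # The remainders are read starting from the bottom of the list.
    return "".join(reversed(remainders))


print(int_to_binary(0))    # 0
print(int_to_binary(31))   # 11111
```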


3. Convert to binary (base 2) the fractional part: 0.000 000 000 000 000 000 034 4.

Multiply it repeatedly by 2.


Keep track of each integer part of the results.


Stop when we get a fractional part that is equal to zero.


  • #) multiplying = integer + fractional part;
  • 1) 0.000 000 000 000 000 000 034 4 × 2 = 0 + 0.000 000 000 000 000 000 068 8;
  • 2) 0.000 000 000 000 000 000 068 8 × 2 = 0 + 0.000 000 000 000 000 000 137 6;
  • 3) 0.000 000 000 000 000 000 137 6 × 2 = 0 + 0.000 000 000 000 000 000 275 2;
  • 4) 0.000 000 000 000 000 000 275 2 × 2 = 0 + 0.000 000 000 000 000 000 550 4;
  • 5) 0.000 000 000 000 000 000 550 4 × 2 = 0 + 0.000 000 000 000 000 001 100 8;
  • 6) 0.000 000 000 000 000 001 100 8 × 2 = 0 + 0.000 000 000 000 000 002 201 6;
  • 7) 0.000 000 000 000 000 002 201 6 × 2 = 0 + 0.000 000 000 000 000 004 403 2;
  • 8) 0.000 000 000 000 000 004 403 2 × 2 = 0 + 0.000 000 000 000 000 008 806 4;
  • 9) 0.000 000 000 000 000 008 806 4 × 2 = 0 + 0.000 000 000 000 000 017 612 8;
  • 10) 0.000 000 000 000 000 017 612 8 × 2 = 0 + 0.000 000 000 000 000 035 225 6;
  • 11) 0.000 000 000 000 000 035 225 6 × 2 = 0 + 0.000 000 000 000 000 070 451 2;
  • 12) 0.000 000 000 000 000 070 451 2 × 2 = 0 + 0.000 000 000 000 000 140 902 4;
  • 13) 0.000 000 000 000 000 140 902 4 × 2 = 0 + 0.000 000 000 000 000 281 804 8;
  • 14) 0.000 000 000 000 000 281 804 8 × 2 = 0 + 0.000 000 000 000 000 563 609 6;
  • 15) 0.000 000 000 000 000 563 609 6 × 2 = 0 + 0.000 000 000 000 001 127 219 2;
  • 16) 0.000 000 000 000 001 127 219 2 × 2 = 0 + 0.000 000 000 000 002 254 438 4;
  • 17) 0.000 000 000 000 002 254 438 4 × 2 = 0 + 0.000 000 000 000 004 508 876 8;
  • 18) 0.000 000 000 000 004 508 876 8 × 2 = 0 + 0.000 000 000 000 009 017 753 6;
  • 19) 0.000 000 000 000 009 017 753 6 × 2 = 0 + 0.000 000 000 000 018 035 507 2;
  • 20) 0.000 000 000 000 018 035 507 2 × 2 = 0 + 0.000 000 000 000 036 071 014 4;
  • 21) 0.000 000 000 000 036 071 014 4 × 2 = 0 + 0.000 000 000 000 072 142 028 8;
  • 22) 0.000 000 000 000 072 142 028 8 × 2 = 0 + 0.000 000 000 000 144 284 057 6;
  • 23) 0.000 000 000 000 144 284 057 6 × 2 = 0 + 0.000 000 000 000 288 568 115 2;
  • 24) 0.000 000 000 000 288 568 115 2 × 2 = 0 + 0.000 000 000 000 577 136 230 4;
  • 25) 0.000 000 000 000 577 136 230 4 × 2 = 0 + 0.000 000 000 001 154 272 460 8;
  • 26) 0.000 000 000 001 154 272 460 8 × 2 = 0 + 0.000 000 000 002 308 544 921 6;
  • 27) 0.000 000 000 002 308 544 921 6 × 2 = 0 + 0.000 000 000 004 617 089 843 2;
  • 28) 0.000 000 000 004 617 089 843 2 × 2 = 0 + 0.000 000 000 009 234 179 686 4;
  • 29) 0.000 000 000 009 234 179 686 4 × 2 = 0 + 0.000 000 000 018 468 359 372 8;
  • 30) 0.000 000 000 018 468 359 372 8 × 2 = 0 + 0.000 000 000 036 936 718 745 6;
  • 31) 0.000 000 000 036 936 718 745 6 × 2 = 0 + 0.000 000 000 073 873 437 491 2;
  • 32) 0.000 000 000 073 873 437 491 2 × 2 = 0 + 0.000 000 000 147 746 874 982 4;
  • 33) 0.000 000 000 147 746 874 982 4 × 2 = 0 + 0.000 000 000 295 493 749 964 8;
  • 34) 0.000 000 000 295 493 749 964 8 × 2 = 0 + 0.000 000 000 590 987 499 929 6;
  • 35) 0.000 000 000 590 987 499 929 6 × 2 = 0 + 0.000 000 001 181 974 999 859 2;
  • 36) 0.000 000 001 181 974 999 859 2 × 2 = 0 + 0.000 000 002 363 949 999 718 4;
  • 37) 0.000 000 002 363 949 999 718 4 × 2 = 0 + 0.000 000 004 727 899 999 436 8;
  • 38) 0.000 000 004 727 899 999 436 8 × 2 = 0 + 0.000 000 009 455 799 998 873 6;
  • 39) 0.000 000 009 455 799 998 873 6 × 2 = 0 + 0.000 000 018 911 599 997 747 2;
  • 40) 0.000 000 018 911 599 997 747 2 × 2 = 0 + 0.000 000 037 823 199 995 494 4;
  • 41) 0.000 000 037 823 199 995 494 4 × 2 = 0 + 0.000 000 075 646 399 990 988 8;
  • 42) 0.000 000 075 646 399 990 988 8 × 2 = 0 + 0.000 000 151 292 799 981 977 6;
  • 43) 0.000 000 151 292 799 981 977 6 × 2 = 0 + 0.000 000 302 585 599 963 955 2;
  • 44) 0.000 000 302 585 599 963 955 2 × 2 = 0 + 0.000 000 605 171 199 927 910 4;
  • 45) 0.000 000 605 171 199 927 910 4 × 2 = 0 + 0.000 001 210 342 399 855 820 8;
  • 46) 0.000 001 210 342 399 855 820 8 × 2 = 0 + 0.000 002 420 684 799 711 641 6;
  • 47) 0.000 002 420 684 799 711 641 6 × 2 = 0 + 0.000 004 841 369 599 423 283 2;
  • 48) 0.000 004 841 369 599 423 283 2 × 2 = 0 + 0.000 009 682 739 198 846 566 4;
  • 49) 0.000 009 682 739 198 846 566 4 × 2 = 0 + 0.000 019 365 478 397 693 132 8;
  • 50) 0.000 019 365 478 397 693 132 8 × 2 = 0 + 0.000 038 730 956 795 386 265 6;
  • 51) 0.000 038 730 956 795 386 265 6 × 2 = 0 + 0.000 077 461 913 590 772 531 2;
  • 52) 0.000 077 461 913 590 772 531 2 × 2 = 0 + 0.000 154 923 827 181 545 062 4;
  • 53) 0.000 154 923 827 181 545 062 4 × 2 = 0 + 0.000 309 847 654 363 090 124 8;
  • 54) 0.000 309 847 654 363 090 124 8 × 2 = 0 + 0.000 619 695 308 726 180 249 6;
  • 55) 0.000 619 695 308 726 180 249 6 × 2 = 0 + 0.001 239 390 617 452 360 499 2;
  • 56) 0.001 239 390 617 452 360 499 2 × 2 = 0 + 0.002 478 781 234 904 720 998 4;
  • 57) 0.002 478 781 234 904 720 998 4 × 2 = 0 + 0.004 957 562 469 809 441 996 8;
  • 58) 0.004 957 562 469 809 441 996 8 × 2 = 0 + 0.009 915 124 939 618 883 993 6;
  • 59) 0.009 915 124 939 618 883 993 6 × 2 = 0 + 0.019 830 249 879 237 767 987 2;
  • 60) 0.019 830 249 879 237 767 987 2 × 2 = 0 + 0.039 660 499 758 475 535 974 4;
  • 61) 0.039 660 499 758 475 535 974 4 × 2 = 0 + 0.079 320 999 516 951 071 948 8;
  • 62) 0.079 320 999 516 951 071 948 8 × 2 = 0 + 0.158 641 999 033 902 143 897 6;
  • 63) 0.158 641 999 033 902 143 897 6 × 2 = 0 + 0.317 283 998 067 804 287 795 2;
  • 64) 0.317 283 998 067 804 287 795 2 × 2 = 0 + 0.634 567 996 135 608 575 590 4;
  • 65) 0.634 567 996 135 608 575 590 4 × 2 = 1 + 0.269 135 992 271 217 151 180 8;
  • 66) 0.269 135 992 271 217 151 180 8 × 2 = 0 + 0.538 271 984 542 434 302 361 6;
  • 67) 0.538 271 984 542 434 302 361 6 × 2 = 1 + 0.076 543 969 084 868 604 723 2;
  • 68) 0.076 543 969 084 868 604 723 2 × 2 = 0 + 0.153 087 938 169 737 209 446 4;
  • 69) 0.153 087 938 169 737 209 446 4 × 2 = 0 + 0.306 175 876 339 474 418 892 8;
  • 70) 0.306 175 876 339 474 418 892 8 × 2 = 0 + 0.612 351 752 678 948 837 785 6;
  • 71) 0.612 351 752 678 948 837 785 6 × 2 = 1 + 0.224 703 505 357 897 675 571 2;
  • 72) 0.224 703 505 357 897 675 571 2 × 2 = 0 + 0.449 407 010 715 795 351 142 4;
  • 73) 0.449 407 010 715 795 351 142 4 × 2 = 0 + 0.898 814 021 431 590 702 284 8;
  • 74) 0.898 814 021 431 590 702 284 8 × 2 = 1 + 0.797 628 042 863 181 404 569 6;
  • 75) 0.797 628 042 863 181 404 569 6 × 2 = 1 + 0.595 256 085 726 362 809 139 2;
  • 76) 0.595 256 085 726 362 809 139 2 × 2 = 1 + 0.190 512 171 452 725 618 278 4;
  • 77) 0.190 512 171 452 725 618 278 4 × 2 = 0 + 0.381 024 342 905 451 236 556 8;
  • 78) 0.381 024 342 905 451 236 556 8 × 2 = 0 + 0.762 048 685 810 902 473 113 6;
  • 79) 0.762 048 685 810 902 473 113 6 × 2 = 1 + 0.524 097 371 621 804 946 227 2;
  • 80) 0.524 097 371 621 804 946 227 2 × 2 = 1 + 0.048 194 743 243 609 892 454 4;
  • 81) 0.048 194 743 243 609 892 454 4 × 2 = 0 + 0.096 389 486 487 219 784 908 8;
  • 82) 0.096 389 486 487 219 784 908 8 × 2 = 0 + 0.192 778 972 974 439 569 817 6;
  • 83) 0.192 778 972 974 439 569 817 6 × 2 = 0 + 0.385 557 945 948 879 139 635 2;
  • 84) 0.385 557 945 948 879 139 635 2 × 2 = 0 + 0.771 115 891 897 758 279 270 4;
  • 85) 0.771 115 891 897 758 279 270 4 × 2 = 1 + 0.542 231 783 795 516 558 540 8;
  • 86) 0.542 231 783 795 516 558 540 8 × 2 = 1 + 0.084 463 567 591 033 117 081 6;
  • 87) 0.084 463 567 591 033 117 081 6 × 2 = 0 + 0.168 927 135 182 066 234 163 2;
  • 88) 0.168 927 135 182 066 234 163 2 × 2 = 0 + 0.337 854 270 364 132 468 326 4;
  • 89) 0.337 854 270 364 132 468 326 4 × 2 = 0 + 0.675 708 540 728 264 936 652 8;
  • 90) 0.675 708 540 728 264 936 652 8 × 2 = 1 + 0.351 417 081 456 529 873 305 6;
  • 91) 0.351 417 081 456 529 873 305 6 × 2 = 0 + 0.702 834 162 913 059 746 611 2;
  • 92) 0.702 834 162 913 059 746 611 2 × 2 = 1 + 0.405 668 325 826 119 493 222 4;
  • 93) 0.405 668 325 826 119 493 222 4 × 2 = 0 + 0.811 336 651 652 238 986 444 8;
  • 94) 0.811 336 651 652 238 986 444 8 × 2 = 1 + 0.622 673 303 304 477 972 889 6;
  • 95) 0.622 673 303 304 477 972 889 6 × 2 = 1 + 0.245 346 606 608 955 945 779 2;
  • 96) 0.245 346 606 608 955 945 779 2 × 2 = 0 + 0.490 693 213 217 911 891 558 4;
  • 97) 0.490 693 213 217 911 891 558 4 × 2 = 0 + 0.981 386 426 435 823 783 116 8;
  • 98) 0.981 386 426 435 823 783 116 8 × 2 = 1 + 0.962 772 852 871 647 566 233 6;
  • 99) 0.962 772 852 871 647 566 233 6 × 2 = 1 + 0.925 545 705 743 295 132 467 2;
  • 100) 0.925 545 705 743 295 132 467 2 × 2 = 1 + 0.851 091 411 486 590 264 934 4;
  • 101) 0.851 091 411 486 590 264 934 4 × 2 = 1 + 0.702 182 822 973 180 529 868 8;
  • 102) 0.702 182 822 973 180 529 868 8 × 2 = 1 + 0.404 365 645 946 361 059 737 6;
  • 103) 0.404 365 645 946 361 059 737 6 × 2 = 0 + 0.808 731 291 892 722 119 475 2;
  • 104) 0.808 731 291 892 722 119 475 2 × 2 = 1 + 0.617 462 583 785 444 238 950 4;
  • 105) 0.617 462 583 785 444 238 950 4 × 2 = 1 + 0.234 925 167 570 888 477 900 8;
  • 106) 0.234 925 167 570 888 477 900 8 × 2 = 0 + 0.469 850 335 141 776 955 801 6;
  • 107) 0.469 850 335 141 776 955 801 6 × 2 = 0 + 0.939 700 670 283 553 911 603 2;
  • 108) 0.939 700 670 283 553 911 603 2 × 2 = 1 + 0.879 401 340 567 107 823 206 4;
  • 109) 0.879 401 340 567 107 823 206 4 × 2 = 1 + 0.758 802 681 134 215 646 412 8;
  • 110) 0.758 802 681 134 215 646 412 8 × 2 = 1 + 0.517 605 362 268 431 292 825 6;
  • 111) 0.517 605 362 268 431 292 825 6 × 2 = 1 + 0.035 210 724 536 862 585 651 2;
  • 112) 0.035 210 724 536 862 585 651 2 × 2 = 0 + 0.070 421 449 073 725 171 302 4;
  • 113) 0.070 421 449 073 725 171 302 4 × 2 = 0 + 0.140 842 898 147 450 342 604 8;
  • 114) 0.140 842 898 147 450 342 604 8 × 2 = 0 + 0.281 685 796 294 900 685 209 6;
  • 115) 0.281 685 796 294 900 685 209 6 × 2 = 0 + 0.563 371 592 589 801 370 419 2;
  • 116) 0.563 371 592 589 801 370 419 2 × 2 = 1 + 0.126 743 185 179 602 740 838 4;
  • 117) 0.126 743 185 179 602 740 838 4 × 2 = 0 + 0.253 486 370 359 205 481 676 8;

We didn't get any fractional part that was equal to zero. But the multiplications have already produced more significant bits than the mantissa can hold (the leading 1 at step 65 plus 52 further bits, up to step 117) => FULL STOP (losing precision...)


4. Construct the base 2 representation of the fractional part of the number.

Take all the integer parts of the multiplying operations, starting from the top of the constructed list above:


0.000 000 000 000 000 000 034 4(10) =


0.0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 1010 0010 0111 0011 0000 1100 0101 0110 0111 1101 1001 1110 0001 0(2)
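
The repeated multiplication in steps 3 and 4 can be sketched in Python with exact rational arithmetic, so that no rounding creeps into the intermediate products. The function name `fraction_bits` and the `significant` parameter are assumptions introduced here; the stopping rule (stop once 53 bits have been produced, counting from the first 1) is inferred from the list above, which ends at step 117.

```python
from fractions import Fraction


def fraction_bits(frac: Fraction, significant: int = 53) -> str:
    """Binary digits after the point of 0 <= frac < 1, by repeated doubling.

    Stops when the fractional part reaches zero, or once `significant`
    digits have been produced counting from the first 1 (inclusive).
    """
    bits = []
    counted = 0  # digits produced since the first 1, inclusive
    while frac != 0 and counted < significant:
        frac *= 2
        digit = int(frac)  # integer part of the product: 0 or 1
        frac -= digit      # keep only the fractional part
        bits.append(str(digit))
        if digit or counted:
            counted += 1
    return "".join(bits)


bits = fraction_bits(Fraction("3.44e-20"))
print(len(bits))            # 117
print(bits.index("1") + 1)  # 65
```

Using `Fraction` instead of `float` matters here: a `float` would already be the rounded binary value, which is the thing the procedure is trying to derive.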


5. Positive number before normalization:

0.000 000 000 000 000 000 034 4(10) =


0.0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 1010 0010 0111 0011 0000 1100 0101 0110 0111 1101 1001 1110 0001 0(2)

6. Normalize the binary representation of the number.

Shift the decimal mark 65 positions to the right, so that only one non zero digit remains to the left of it:


0.000 000 000 000 000 000 034 4(10) =


0.0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 1010 0010 0111 0011 0000 1100 0101 0110 0111 1101 1001 1110 0001 0(2) =


0.0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 1010 0010 0111 0011 0000 1100 0101 0110 0111 1101 1001 1110 0001 0(2) × 2^0 =


1.0100 0100 1110 0110 0001 1000 1010 1100 1111 1011 0011 1100 0010(2) × 2^-65
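
Normalization amounts to locating the first 1 and counting how far the point has to move. A minimal Python sketch, assuming the integer and fractional digits are given as strings (the function name `normalize` is hypothetical, and the value is assumed to be nonzero):

```python
def normalize(int_bits: str, frac_bits: str):
    """Return (significand, exponent) with exactly one 1 left of the point."""
    int_bits = int_bits.lstrip("0")
    if int_bits:
        # Value >= 1: the point moves left, so the exponent is positive.
        exponent = len(int_bits) - 1
        digits = int_bits + frac_bits
    else:
        # Value < 1: the point moves right past the first 1.
        first_one = frac_bits.index("1")
        exponent = -(first_one + 1)
        digits = frac_bits[first_one:]
    return "1." + digits[1:], exponent


# 64 zeros, then the first bits of the fractional part found above.
print(normalize("0", "0" * 64 + "10100010"))  # ('1.0100010', -65)
print(normalize("11111", "1010"))              # ('1.11111010', 4)
```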


7. Up to this moment, there are the following elements that would feed into the 64 bit double precision IEEE 754 binary floating point representation:

Sign 0 (a positive number)


Exponent (unadjusted): -65


Mantissa (not normalized):
1.0100 0100 1110 0110 0001 1000 1010 1100 1111 1011 0011 1100 0010


8. Adjust the exponent.

Use the 11 bit excess/bias notation:


Exponent (adjusted) =


Exponent (unadjusted) + 2^(11-1) - 1 =


-65 + 2^(11-1) - 1 =


(-65 + 1 023)(10) =


958(10)


9. Convert the adjusted exponent from the decimal (base 10) to 11 bit binary.

Use the same technique of repeatedly dividing by 2:


  • division = quotient + remainder;
  • 958 ÷ 2 = 479 + 0;
  • 479 ÷ 2 = 239 + 1;
  • 239 ÷ 2 = 119 + 1;
  • 119 ÷ 2 = 59 + 1;
  • 59 ÷ 2 = 29 + 1;
  • 29 ÷ 2 = 14 + 1;
  • 14 ÷ 2 = 7 + 0;
  • 7 ÷ 2 = 3 + 1;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

10. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above.


Exponent (adjusted) =


958(10) =


011 1011 1110(2)
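
Steps 8 to 10 can be condensed into a few lines of Python. This is a sketch that assumes the adjusted exponent falls in the normal range 1 to 2046; the names `BIAS` and `biased_exponent_bits` are introduced here for illustration.

```python
EXPONENT_BITS = 11
BIAS = 2 ** (EXPONENT_BITS - 1) - 1  # 1023 for double precision


def biased_exponent_bits(unadjusted: int) -> str:
    """Add the bias and write the result as an 11 bit binary string."""
    adjusted = unadjusted + BIAS
    return format(adjusted, f"0{EXPONENT_BITS}b")  # zero-padded on the left


print(BIAS)                       # 1023
print(biased_exponent_bits(-65))  # 01110111110
```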


11. Normalize the mantissa.

a) Remove the leading (leftmost) bit, since it is always 1, together with the decimal point.


b) Adjust its length to 52 bits, only if necessary (not the case here).


Mantissa (normalized) =


1.0100 0100 1110 0110 0001 1000 1010 1100 1111 1011 0011 1100 0010 =


0100 0100 1110 0110 0001 1000 1010 1100 1111 1011 0011 1100 0010


12. The three elements that make up the number's 64 bit double precision IEEE 754 binary floating point representation:

Sign (1 bit) =
0 (a positive number)


Exponent (11 bits) =
011 1011 1110


Mantissa (52 bits) =
0100 0100 1110 0110 0001 1000 1010 1100 1111 1011 0011 1100 0010


The base ten decimal number 0.000 000 000 000 000 000 034 4 converted and written in 64 bit double precision IEEE 754 binary floating point representation:
0 - 011 1011 1110 - 0100 0100 1110 0110 0001 1000 1010 1100 1111 1011 0011 1100 0010
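
The result can be cross-checked against the bits a machine actually stores. The sketch below uses Python's standard `struct` module to pack the value as a big-endian IEEE 754 double and split the 64 bits into the three fields; the helper name `double_fields` is an assumption introduced here.

```python
import struct


def double_fields(x: float):
    """Split a double into (sign, exponent, mantissa) bit strings."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))  # 8 bytes as one integer
    bits = format(raw, "064b")
    return bits[0], bits[1:12], bits[12:]


sign, exponent, mantissa = double_fields(3.44e-20)
print(sign)      # 0
print(exponent)  # 01110111110
print(mantissa)  # 0100010011100110000110001010110011111011001111000010
```

For this number the stored bits agree with the hand calculation, because the bits discarded after step 117 amount to less than half a unit in the last place.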


How to convert numbers from the decimal system (base ten) to 64 bit double precision IEEE 754 binary floating point standard

Follow the steps below to convert a base 10 decimal number to 64 bit double precision IEEE 754 binary floating point:

  • 1. If the number to be converted is negative, start with its positive version.
  • 2. First convert the integer part. Divide repeatedly by 2 the positive representation of the integer number that is to be converted to binary, until we get a quotient that is equal to zero, keeping track of each remainder.
  • 3. Construct the base 2 representation of the positive integer part of the number, by taking all the remainders from the previous operations, starting from the bottom of the list constructed above. Thus, the last remainder of the divisions becomes the first symbol (the leftmost) of the base two number, while the first remainder becomes the last symbol (the rightmost).
  • 4. Then convert the fractional part. Multiply the number repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero or until enough bits have been produced to fill the 52 bit mantissa.
  • 5. Construct the base 2 representation of the fractional part of the number, by taking all the integer parts of the multiplying operations, starting from the top of the list constructed above (they should appear in the binary representation, from left to right, in the order they have been calculated).
  • 6. Normalize the binary representation of the number, shifting the decimal mark (the decimal point) "n" positions either to the left, or to the right, so that only one non zero digit remains to the left of the decimal mark.
  • 7. Adjust the exponent in 11 bit excess/bias notation and then convert it from decimal (base 10) to 11 bit binary, by using the same technique of repeatedly dividing by 2, as shown above:
    Exponent (adjusted) = Exponent (unadjusted) + 2^(11-1) - 1
  • 8. Normalize the mantissa: remove the leading (leftmost) bit, since it is always '1' (and the decimal mark, if present), and adjust its length to 52 bits, either by removing the excess bits from the right (losing precision...) or by adding extra bits set to '0' on the right.
  • 9. Sign (it takes 1 bit) is either 1 for a negative or 0 for a positive number.
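
The nine steps above can also be expressed arithmetically instead of digit by digit. The Python sketch below scales the exact value into the range [1, 2) to find the exponent, then takes the first 52 fractional bits of the scaled value as the mantissa. It is an illustration under stated assumptions, not the converter's implementation: the function name `to_double_bits` is hypothetical, the input must be nonzero and within the normal double range, and excess bits are truncated as in step 8 rather than rounded.

```python
from fractions import Fraction


def to_double_bits(text: str) -> str:
    """Return 'sign exponent mantissa' for a nonzero decimal string."""
    value = Fraction(text)             # exact, no binary rounding yet
    sign = "1" if value < 0 else "0"   # step 9
    value = abs(value)                 # step 1

    # Steps 2 to 6: find the exponent that leaves one 1 left of the point.
    exponent = 0
    while value >= 2:
        value /= 2
        exponent += 1
    while value < 1:
        value *= 2
        exponent -= 1

    # Step 7: bias the exponent and write it with 11 bits.
    exponent_field = format(exponent + 1023, "011b")

    # Step 8: drop the leading 1 and keep 52 bits, truncating the rest.
    mantissa_field = format(int((value - 1) * 2**52), "052b")
    return f"{sign} {exponent_field} {mantissa_field}"


print(to_double_bits("3.44e-20"))
print(to_double_bits("-31.640215"))
```

Both calls reproduce the sign, exponent and mantissa fields derived by hand on this page.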

Example: convert the negative number -31.640 215 from the decimal system (base ten) to 64 bit double precision IEEE 754 binary floating point:

  • 1. Start with the positive version of the number:

    |-31.640 215| = 31.640 215

  • 2. First convert the integer part, 31. Divide it repeatedly by 2, keeping track of each remainder, until we get a quotient that is equal to zero:
    • division = quotient + remainder;
    • 31 ÷ 2 = 15 + 1;
    • 15 ÷ 2 = 7 + 1;
    • 7 ÷ 2 = 3 + 1;
    • 3 ÷ 2 = 1 + 1;
    • 1 ÷ 2 = 0 + 1;
    • We have encountered a quotient that is ZERO => FULL STOP
  • 3. Construct the base 2 representation of the integer part of the number by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above:

    31(10) = 1 1111(2)

  • 4. Then, convert the fractional part, 0.640 215. Multiply repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero:
    • #) multiplying = integer + fractional part;
    • 1) 0.640 215 × 2 = 1 + 0.280 43;
    • 2) 0.280 43 × 2 = 0 + 0.560 86;
    • 3) 0.560 86 × 2 = 1 + 0.121 72;
    • 4) 0.121 72 × 2 = 0 + 0.243 44;
    • 5) 0.243 44 × 2 = 0 + 0.486 88;
    • 6) 0.486 88 × 2 = 0 + 0.973 76;
    • 7) 0.973 76 × 2 = 1 + 0.947 52;
    • 8) 0.947 52 × 2 = 1 + 0.895 04;
    • 9) 0.895 04 × 2 = 1 + 0.790 08;
    • 10) 0.790 08 × 2 = 1 + 0.580 16;
    • 11) 0.580 16 × 2 = 1 + 0.160 32;
    • 12) 0.160 32 × 2 = 0 + 0.320 64;
    • 13) 0.320 64 × 2 = 0 + 0.641 28;
    • 14) 0.641 28 × 2 = 1 + 0.282 56;
    • 15) 0.282 56 × 2 = 0 + 0.565 12;
    • 16) 0.565 12 × 2 = 1 + 0.130 24;
    • 17) 0.130 24 × 2 = 0 + 0.260 48;
    • 18) 0.260 48 × 2 = 0 + 0.520 96;
    • 19) 0.520 96 × 2 = 1 + 0.041 92;
    • 20) 0.041 92 × 2 = 0 + 0.083 84;
    • 21) 0.083 84 × 2 = 0 + 0.167 68;
    • 22) 0.167 68 × 2 = 0 + 0.335 36;
    • 23) 0.335 36 × 2 = 0 + 0.670 72;
    • 24) 0.670 72 × 2 = 1 + 0.341 44;
    • 25) 0.341 44 × 2 = 0 + 0.682 88;
    • 26) 0.682 88 × 2 = 1 + 0.365 76;
    • 27) 0.365 76 × 2 = 0 + 0.731 52;
    • 28) 0.731 52 × 2 = 1 + 0.463 04;
    • 29) 0.463 04 × 2 = 0 + 0.926 08;
    • 30) 0.926 08 × 2 = 1 + 0.852 16;
    • 31) 0.852 16 × 2 = 1 + 0.704 32;
    • 32) 0.704 32 × 2 = 1 + 0.408 64;
    • 33) 0.408 64 × 2 = 0 + 0.817 28;
    • 34) 0.817 28 × 2 = 1 + 0.634 56;
    • 35) 0.634 56 × 2 = 1 + 0.269 12;
    • 36) 0.269 12 × 2 = 0 + 0.538 24;
    • 37) 0.538 24 × 2 = 1 + 0.076 48;
    • 38) 0.076 48 × 2 = 0 + 0.152 96;
    • 39) 0.152 96 × 2 = 0 + 0.305 92;
    • 40) 0.305 92 × 2 = 0 + 0.611 84;
    • 41) 0.611 84 × 2 = 1 + 0.223 68;
    • 42) 0.223 68 × 2 = 0 + 0.447 36;
    • 43) 0.447 36 × 2 = 0 + 0.894 72;
    • 44) 0.894 72 × 2 = 1 + 0.789 44;
    • 45) 0.789 44 × 2 = 1 + 0.578 88;
    • 46) 0.578 88 × 2 = 1 + 0.157 76;
    • 47) 0.157 76 × 2 = 0 + 0.315 52;
    • 48) 0.315 52 × 2 = 0 + 0.631 04;
    • 49) 0.631 04 × 2 = 1 + 0.262 08;
    • 50) 0.262 08 × 2 = 0 + 0.524 16;
    • 51) 0.524 16 × 2 = 1 + 0.048 32;
    • 52) 0.048 32 × 2 = 0 + 0.096 64;
    • 53) 0.096 64 × 2 = 0 + 0.193 28;
    • We didn't get any fractional part that was equal to zero. But we had enough iterations (over Mantissa limit = 52) and at least one integer part that was different from zero => FULL STOP (losing precision...).
  • 5. Construct the base 2 representation of the fractional part of the number, by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above:

    0.640 215(10) = 0.1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0(2)

  • 6. Summarizing - the positive number before normalization:

    31.640 215(10) = 1 1111.1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0(2)

  • 7. Normalize the binary representation of the number, shifting the decimal mark 4 positions to the left so that only one non-zero digit stays to the left of the decimal mark:

    31.640 215(10) =
    1 1111.1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0(2) =
    1 1111.1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0(2) × 2^0 =
    1.1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0(2) × 2^4

  • 8. Up to this moment, there are the following elements that would feed into the 64 bit double precision IEEE 754 binary floating point representation:

    Sign: 1 (a negative number)

    Exponent (unadjusted): 4

    Mantissa (not-normalized): 1.1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0

  • 9. Adjust the exponent in 11 bit excess/bias notation and then convert it from decimal (base 10) to 11 bit binary (base 2), by using the same technique of repeatedly dividing it by 2, as shown above:

    Exponent (adjusted) = Exponent (unadjusted) + 2^(11-1) - 1 = (4 + 1023)(10) = 1027(10) =
    100 0000 0011(2)

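  • 10. Normalize the mantissa: remove the leading (leftmost) bit, since it is always '1' (and the decimal sign), and adjust its length to 52 bits by removing the excess bits from the right (losing precision...). Note that this truncates the value. The discarded bits (1010 0...) are worth more than half a unit in the last place, so IEEE 754's default round to nearest mode would instead round the last mantissa bit up, ending in 1101 rather than 1100: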

    Mantissa (not-normalized): 1.1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0

    Mantissa (normalized): 1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100

  • Conclusion:

    Sign (1 bit) = 1 (a negative number)

    Exponent (11 bits) = 100 0000 0011

    Mantissa (52 bits) = 1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100

  • Number -31.640 215, converted from decimal system (base 10) to 64 bit double precision IEEE 754 binary floating point =
    1 - 100 0000 0011 - 1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100
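
As a cross-check, the truncated mantissa above can be compared with the bits Python stores for the same literal. The sketch assumes, as is the case for CPython, that decimal literals are rounded to the nearest double; the variable names are introduced here for illustration.

```python
import struct

# Mantissa obtained above by cutting off the excess bits.
truncated = "1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100".replace(" ", "")

(raw,) = struct.unpack(">Q", struct.pack(">d", -31.640215))
bits = format(raw, "064b")
sign, exponent, stored = bits[0], bits[1:12], bits[12:]

print(sign, exponent)                       # 1 10000000011
print(int(stored, 2) - int(truncated, 2))   # 1
```

The sign and exponent agree, while the stored mantissa is one unit in the last place larger than the truncated one. This is the difference between cutting off the excess bits and rounding to the nearest representable value.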