64 bit IEEE 754: Decimal to Double Precision Floating Point Binary: 1.000 000 021 979 552 668 138 406

Convert the Number to 64 Bit Double Precision IEEE 754 Binary Floating Point Representation Standard, From a Base Ten Decimal System Number

Number 1.000 000 021 979 552 668 138 406(10) converted and written in 64 bit double precision IEEE 754 binary floating point representation (1 bit for sign, 11 bits for exponent, 52 bits for mantissa)

1. First, convert to binary (in base 2) the integer part: 1.
Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.


  • division = quotient + remainder;
  • 1 ÷ 2 = 0 + 1;

2. Construct the base 2 representation of the integer part of the number.

Take all the remainders starting from the bottom of the list constructed above.


1(10) =


1(2)
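The repeated division in steps 1 and 2 can be sketched in Python as follows. The function name `int_to_binary` is illustrative only and is not part of the original page; the sketch assumes a non-negative integer input.

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative integer to a base 2 digit string by repeated division."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)          # quotient and remainder of n ÷ 2
        remainders.append(str(r))
    # The last remainder is the most significant bit, so read the list bottom-up.
    return "".join(reversed(remainders))


print(int_to_binary(1))
```

For the integer part 1 the loop runs once, which matches the single division shown above.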


3. Convert to binary (base 2) the fractional part: 0.000 000 021 979 552 668 138 406.

Multiply it repeatedly by 2.


Keep track of each integer part of the results.


Stop when we get a fractional part that is equal to zero.


  • #) multiplying = integer + fractional part;
  • 1) 0.000 000 021 979 552 668 138 406 × 2 = 0 + 0.000 000 043 959 105 336 276 812;
  • 2) 0.000 000 043 959 105 336 276 812 × 2 = 0 + 0.000 000 087 918 210 672 553 624;
  • 3) 0.000 000 087 918 210 672 553 624 × 2 = 0 + 0.000 000 175 836 421 345 107 248;
  • 4) 0.000 000 175 836 421 345 107 248 × 2 = 0 + 0.000 000 351 672 842 690 214 496;
  • 5) 0.000 000 351 672 842 690 214 496 × 2 = 0 + 0.000 000 703 345 685 380 428 992;
  • 6) 0.000 000 703 345 685 380 428 992 × 2 = 0 + 0.000 001 406 691 370 760 857 984;
  • 7) 0.000 001 406 691 370 760 857 984 × 2 = 0 + 0.000 002 813 382 741 521 715 968;
  • 8) 0.000 002 813 382 741 521 715 968 × 2 = 0 + 0.000 005 626 765 483 043 431 936;
  • 9) 0.000 005 626 765 483 043 431 936 × 2 = 0 + 0.000 011 253 530 966 086 863 872;
  • 10) 0.000 011 253 530 966 086 863 872 × 2 = 0 + 0.000 022 507 061 932 173 727 744;
  • 11) 0.000 022 507 061 932 173 727 744 × 2 = 0 + 0.000 045 014 123 864 347 455 488;
  • 12) 0.000 045 014 123 864 347 455 488 × 2 = 0 + 0.000 090 028 247 728 694 910 976;
  • 13) 0.000 090 028 247 728 694 910 976 × 2 = 0 + 0.000 180 056 495 457 389 821 952;
  • 14) 0.000 180 056 495 457 389 821 952 × 2 = 0 + 0.000 360 112 990 914 779 643 904;
  • 15) 0.000 360 112 990 914 779 643 904 × 2 = 0 + 0.000 720 225 981 829 559 287 808;
  • 16) 0.000 720 225 981 829 559 287 808 × 2 = 0 + 0.001 440 451 963 659 118 575 616;
  • 17) 0.001 440 451 963 659 118 575 616 × 2 = 0 + 0.002 880 903 927 318 237 151 232;
  • 18) 0.002 880 903 927 318 237 151 232 × 2 = 0 + 0.005 761 807 854 636 474 302 464;
  • 19) 0.005 761 807 854 636 474 302 464 × 2 = 0 + 0.011 523 615 709 272 948 604 928;
  • 20) 0.011 523 615 709 272 948 604 928 × 2 = 0 + 0.023 047 231 418 545 897 209 856;
  • 21) 0.023 047 231 418 545 897 209 856 × 2 = 0 + 0.046 094 462 837 091 794 419 712;
  • 22) 0.046 094 462 837 091 794 419 712 × 2 = 0 + 0.092 188 925 674 183 588 839 424;
  • 23) 0.092 188 925 674 183 588 839 424 × 2 = 0 + 0.184 377 851 348 367 177 678 848;
  • 24) 0.184 377 851 348 367 177 678 848 × 2 = 0 + 0.368 755 702 696 734 355 357 696;
  • 25) 0.368 755 702 696 734 355 357 696 × 2 = 0 + 0.737 511 405 393 468 710 715 392;
  • 26) 0.737 511 405 393 468 710 715 392 × 2 = 1 + 0.475 022 810 786 937 421 430 784;
  • 27) 0.475 022 810 786 937 421 430 784 × 2 = 0 + 0.950 045 621 573 874 842 861 568;
  • 28) 0.950 045 621 573 874 842 861 568 × 2 = 1 + 0.900 091 243 147 749 685 723 136;
  • 29) 0.900 091 243 147 749 685 723 136 × 2 = 1 + 0.800 182 486 295 499 371 446 272;
  • 30) 0.800 182 486 295 499 371 446 272 × 2 = 1 + 0.600 364 972 590 998 742 892 544;
  • 31) 0.600 364 972 590 998 742 892 544 × 2 = 1 + 0.200 729 945 181 997 485 785 088;
  • 32) 0.200 729 945 181 997 485 785 088 × 2 = 0 + 0.401 459 890 363 994 971 570 176;
  • 33) 0.401 459 890 363 994 971 570 176 × 2 = 0 + 0.802 919 780 727 989 943 140 352;
  • 34) 0.802 919 780 727 989 943 140 352 × 2 = 1 + 0.605 839 561 455 979 886 280 704;
  • 35) 0.605 839 561 455 979 886 280 704 × 2 = 1 + 0.211 679 122 911 959 772 561 408;
  • 36) 0.211 679 122 911 959 772 561 408 × 2 = 0 + 0.423 358 245 823 919 545 122 816;
  • 37) 0.423 358 245 823 919 545 122 816 × 2 = 0 + 0.846 716 491 647 839 090 245 632;
  • 38) 0.846 716 491 647 839 090 245 632 × 2 = 1 + 0.693 432 983 295 678 180 491 264;
  • 39) 0.693 432 983 295 678 180 491 264 × 2 = 1 + 0.386 865 966 591 356 360 982 528;
  • 40) 0.386 865 966 591 356 360 982 528 × 2 = 0 + 0.773 731 933 182 712 721 965 056;
  • 41) 0.773 731 933 182 712 721 965 056 × 2 = 1 + 0.547 463 866 365 425 443 930 112;
  • 42) 0.547 463 866 365 425 443 930 112 × 2 = 1 + 0.094 927 732 730 850 887 860 224;
  • 43) 0.094 927 732 730 850 887 860 224 × 2 = 0 + 0.189 855 465 461 701 775 720 448;
  • 44) 0.189 855 465 461 701 775 720 448 × 2 = 0 + 0.379 710 930 923 403 551 440 896;
  • 45) 0.379 710 930 923 403 551 440 896 × 2 = 0 + 0.759 421 861 846 807 102 881 792;
  • 46) 0.759 421 861 846 807 102 881 792 × 2 = 1 + 0.518 843 723 693 614 205 763 584;
  • 47) 0.518 843 723 693 614 205 763 584 × 2 = 1 + 0.037 687 447 387 228 411 527 168;
  • 48) 0.037 687 447 387 228 411 527 168 × 2 = 0 + 0.075 374 894 774 456 823 054 336;
  • 49) 0.075 374 894 774 456 823 054 336 × 2 = 0 + 0.150 749 789 548 913 646 108 672;
  • 50) 0.150 749 789 548 913 646 108 672 × 2 = 0 + 0.301 499 579 097 827 292 217 344;
  • 51) 0.301 499 579 097 827 292 217 344 × 2 = 0 + 0.602 999 158 195 654 584 434 688;
  • 52) 0.602 999 158 195 654 584 434 688 × 2 = 1 + 0.205 998 316 391 309 168 869 376;
  • 53) 0.205 998 316 391 309 168 869 376 × 2 = 0 + 0.411 996 632 782 618 337 738 752;

No fractional part equal to zero appeared. However, we have run enough iterations (53, one more than the 52 bit mantissa limit) and found at least one integer part different from zero, so we stop here. The remaining bits are discarded, which loses precision.


4. Construct the base 2 representation of the fractional part of the number.

Take all the integer parts of the multiplying operations, starting from the top of the constructed list above:


0.000 000 021 979 552 668 138 406(10) =


0.0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001 0(2)
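The repeated multiplication in steps 3 and 4 can be reproduced with exact rational arithmetic. This is a minimal sketch, assuming Python's standard `fractions.Fraction` so that the decimal digits are not rounded before the conversion; the name `frac_to_binary` and the 53 iteration cap (taken from the list above) are illustrative choices.

```python
from fractions import Fraction


def frac_to_binary(frac: Fraction, max_bits: int = 53) -> str:
    """Return up to max_bits binary digits of a fraction in [0, 1)."""
    bits = []
    for _ in range(max_bits):
        if frac == 0:                # exact termination, nothing left to convert
            break
        frac *= 2
        bit = int(frac)              # integer part of the product: 0 or 1
        bits.append(str(bit))
        frac -= bit                  # keep only the fractional part
    return "".join(bits)


bits = frac_to_binary(Fraction("0.000000021979552668138406"))
print(bits)
```

Using a binary `float` for the input would defeat the purpose, since the value would already be rounded to 53 significant bits before the loop starts.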


5. Positive number before normalization:

1.000 000 021 979 552 668 138 406(10) =


1.0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001 0(2)

6. Normalize the binary representation of the number.

The number already has exactly one non zero digit to the left of the binary point, so the point is shifted 0 positions:


1.000 000 021 979 552 668 138 406(10) =


1.0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001 0(2) =


1.0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001 0(2) × 2^0
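The normalization in step 6 amounts to locating the leading 1 and counting how far the binary point has to move. The sketch below is a hypothetical helper (`normalize` is not a name from the original page) that works on the digit strings for the integer and fractional parts and assumes the number is not zero.

```python
def normalize(int_bits: str, frac_bits: str) -> tuple:
    """Return (digits starting at the leading 1, unadjusted exponent)."""
    int_bits = int_bits.lstrip("0")
    if int_bits:
        # Point moves left: one position per integer digit after the first.
        exponent = len(int_bits) - 1
        digits = int_bits + frac_bits
    else:
        # Point moves right past the leading zeros and the first 1.
        first_one = frac_bits.index("1")   # raises ValueError if the value is zero
        exponent = -(first_one + 1)
        digits = frac_bits[first_one:]
    return digits, exponent


print(normalize("1", "0101"))
```

For this page's number the integer part is a single 1, so the exponent is 0 and the digits are unchanged.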


7. Up to this moment, there are the following elements that would feed into the 64 bit double precision IEEE 754 binary floating point representation:

Sign 0 (a positive number)


Exponent (unadjusted): 0


Mantissa (not normalized):
1.0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001 0


8. Adjust the exponent.

Use the 11 bit excess/bias notation:


Exponent (adjusted) =


Exponent (unadjusted) + 2^(11-1) - 1 =


0 + 2^(11-1) - 1 =


(0 + 1 023)(10) =


1 023(10)


9. Convert the adjusted exponent from the decimal (base 10) to 11 bit binary.

Use the same technique of repeatedly dividing by 2:


  • division = quotient + remainder;
  • 1 023 ÷ 2 = 511 + 1;
  • 511 ÷ 2 = 255 + 1;
  • 255 ÷ 2 = 127 + 1;
  • 127 ÷ 2 = 63 + 1;
  • 63 ÷ 2 = 31 + 1;
  • 31 ÷ 2 = 15 + 1;
  • 15 ÷ 2 = 7 + 1;
  • 7 ÷ 2 = 3 + 1;
  • 3 ÷ 2 = 1 + 1;
  • 1 ÷ 2 = 0 + 1;

10. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above. The ten remainders give 11 1111 1111; pad on the left with a zero to fill all 11 exponent bits.


Exponent (adjusted) =


1023(10) =


011 1111 1111(2)
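Steps 8 to 10 can be condensed into a few lines. This is a sketch under the assumption that the unadjusted exponent is within the range of normal doubles; the names `EXPONENT_BITS`, `BIAS` and `biased_exponent_bits` are illustrative.

```python
EXPONENT_BITS = 11
BIAS = 2 ** (EXPONENT_BITS - 1) - 1      # 2^(11-1) - 1


def biased_exponent_bits(unadjusted: int) -> str:
    """Add the bias and render the result as a zero-padded 11 bit string."""
    adjusted = unadjusted + BIAS
    # No range check: values outside the normal exponent range are not handled here.
    return format(adjusted, "0{}b".format(EXPONENT_BITS))


print(BIAS, biased_exponent_bits(0))
```

The format specifier pads on the left with zeros, which is where the leading 0 of the 11 bit exponent comes from.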


11. Normalize the mantissa.

a) Remove the leading (leftmost) bit, since it is always 1, and the binary point, if present.


b) Adjust its length to 52 bits by removing the excess bits from the right (if any of the excess bits is set to 1, we are losing precision...).


Mantissa (normalized) =


1.0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001 0 =


0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001
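Step 11 can be sketched as simple string slicing. The helper name `mantissa_field` is hypothetical. Like the procedure on this page, it truncates the excess bits rather than rounding them, and it reports whether any discarded bit was 1.

```python
def mantissa_field(digits: str, width: int = 52) -> tuple:
    """digits starts with the leading 1; return (52 bit field, precision_lost)."""
    frac = digits[1:]                          # drop the implicit leading 1
    kept = frac[:width].ljust(width, "0")      # cut to 52 bits, pad if shorter
    precision_lost = "1" in frac[width:]       # any discarded bit set to 1?
    return kept, precision_lost


digits = "1" + "0" * 24 + "0101111001100110110001100001" + "0"
field, lost = mantissa_field(digits)
print(field, lost)
```

Among the 53 fractional bits computed on this page, the single excess bit is 0, so the flag is `False` for this input; bits beyond the 53rd were never computed and are not inspected.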


12. The three elements that make up the number's 64 bit double precision IEEE 754 binary floating point representation:

Sign (1 bit) =
0 (a positive number)


Exponent (11 bits) =
011 1111 1111


Mantissa (52 bits) =
0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001


The base ten decimal number 1.000 000 021 979 552 668 138 406 converted and written in 64 bit double precision IEEE 754 binary floating point representation:
0 - 011 1111 1111 - 0000 0000 0000 0000 0000 0000 0101 1110 0110 0110 1100 0110 0001
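The result can be cross-checked against the machine's own conversion. The sketch below uses Python's standard `struct` module to read the raw bits of a double; the function name `double_fields` is illustrative. Python's `float()` rounds to nearest rather than truncating, but since the 53rd fractional bit computed above is 0, both approaches give the same 52 bit mantissa for this number.

```python
import struct


def double_fields(x: float) -> tuple:
    """Split a double into its sign, exponent and mantissa bit strings."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))   # 8 bytes as one 64 bit integer
    bits = format(raw, "064b")
    return bits[0], bits[1:12], bits[12:]


sign, exponent, mantissa = double_fields(float("1.000000021979552668138406"))
print(sign, exponent, mantissa)
```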
