Convert the base ten decimal number 9 838 263 505 978 427 545 to 64 bit double precision IEEE 754 binary floating point representation

Number 9 838 263 505 978 427 545(10) converted and written in 64 bit double precision IEEE 754 binary floating point representation (1 bit for sign, 11 bits for exponent, 52 bits for mantissa)

1. Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.


  • division = quotient + remainder;
  • 9 838 263 505 978 427 545 ÷ 2 = 4 919 131 752 989 213 772 + 1;
  • 4 919 131 752 989 213 772 ÷ 2 = 2 459 565 876 494 606 886 + 0;
  • 2 459 565 876 494 606 886 ÷ 2 = 1 229 782 938 247 303 443 + 0;
  • 1 229 782 938 247 303 443 ÷ 2 = 614 891 469 123 651 721 + 1;
  • 614 891 469 123 651 721 ÷ 2 = 307 445 734 561 825 860 + 1;
  • 307 445 734 561 825 860 ÷ 2 = 153 722 867 280 912 930 + 0;
  • 153 722 867 280 912 930 ÷ 2 = 76 861 433 640 456 465 + 0;
  • 76 861 433 640 456 465 ÷ 2 = 38 430 716 820 228 232 + 1;
  • 38 430 716 820 228 232 ÷ 2 = 19 215 358 410 114 116 + 0;
  • 19 215 358 410 114 116 ÷ 2 = 9 607 679 205 057 058 + 0;
  • 9 607 679 205 057 058 ÷ 2 = 4 803 839 602 528 529 + 0;
  • 4 803 839 602 528 529 ÷ 2 = 2 401 919 801 264 264 + 1;
  • 2 401 919 801 264 264 ÷ 2 = 1 200 959 900 632 132 + 0;
  • 1 200 959 900 632 132 ÷ 2 = 600 479 950 316 066 + 0;
  • 600 479 950 316 066 ÷ 2 = 300 239 975 158 033 + 0;
  • 300 239 975 158 033 ÷ 2 = 150 119 987 579 016 + 1;
  • 150 119 987 579 016 ÷ 2 = 75 059 993 789 508 + 0;
  • 75 059 993 789 508 ÷ 2 = 37 529 996 894 754 + 0;
  • 37 529 996 894 754 ÷ 2 = 18 764 998 447 377 + 0;
  • 18 764 998 447 377 ÷ 2 = 9 382 499 223 688 + 1;
  • 9 382 499 223 688 ÷ 2 = 4 691 249 611 844 + 0;
  • 4 691 249 611 844 ÷ 2 = 2 345 624 805 922 + 0;
  • 2 345 624 805 922 ÷ 2 = 1 172 812 402 961 + 0;
  • 1 172 812 402 961 ÷ 2 = 586 406 201 480 + 1;
  • 586 406 201 480 ÷ 2 = 293 203 100 740 + 0;
  • 293 203 100 740 ÷ 2 = 146 601 550 370 + 0;
  • 146 601 550 370 ÷ 2 = 73 300 775 185 + 0;
  • 73 300 775 185 ÷ 2 = 36 650 387 592 + 1;
  • 36 650 387 592 ÷ 2 = 18 325 193 796 + 0;
  • 18 325 193 796 ÷ 2 = 9 162 596 898 + 0;
  • 9 162 596 898 ÷ 2 = 4 581 298 449 + 0;
  • 4 581 298 449 ÷ 2 = 2 290 649 224 + 1;
  • 2 290 649 224 ÷ 2 = 1 145 324 612 + 0;
  • 1 145 324 612 ÷ 2 = 572 662 306 + 0;
  • 572 662 306 ÷ 2 = 286 331 153 + 0;
  • 286 331 153 ÷ 2 = 143 165 576 + 1;
  • 143 165 576 ÷ 2 = 71 582 788 + 0;
  • 71 582 788 ÷ 2 = 35 791 394 + 0;
  • 35 791 394 ÷ 2 = 17 895 697 + 0;
  • 17 895 697 ÷ 2 = 8 947 848 + 1;
  • 8 947 848 ÷ 2 = 4 473 924 + 0;
  • 4 473 924 ÷ 2 = 2 236 962 + 0;
  • 2 236 962 ÷ 2 = 1 118 481 + 0;
  • 1 118 481 ÷ 2 = 559 240 + 1;
  • 559 240 ÷ 2 = 279 620 + 0;
  • 279 620 ÷ 2 = 139 810 + 0;
  • 139 810 ÷ 2 = 69 905 + 0;
  • 69 905 ÷ 2 = 34 952 + 1;
  • 34 952 ÷ 2 = 17 476 + 0;
  • 17 476 ÷ 2 = 8 738 + 0;
  • 8 738 ÷ 2 = 4 369 + 0;
  • 4 369 ÷ 2 = 2 184 + 1;
  • 2 184 ÷ 2 = 1 092 + 0;
  • 1 092 ÷ 2 = 546 + 0;
  • 546 ÷ 2 = 273 + 0;
  • 273 ÷ 2 = 136 + 1;
  • 136 ÷ 2 = 68 + 0;
  • 68 ÷ 2 = 34 + 0;
  • 34 ÷ 2 = 17 + 0;
  • 17 ÷ 2 = 8 + 1;
  • 8 ÷ 2 = 4 + 0;
  • 4 ÷ 2 = 2 + 0;
  • 2 ÷ 2 = 1 + 0;
  • 1 ÷ 2 = 0 + 1;

2. Construct the base 2 representation of the positive number.

Take all the remainders starting from the bottom of the list constructed above.


9 838 263 505 978 427 545(10) =


1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1001 1001(2)
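Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the repeated division technique described above; the function name `to_binary` is an assumption made for this example and not part of any library.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to a base 2 string by repeated division."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)  # quotient and remainder of the division by 2
        remainders.append(str(r))
    # The first remainder is the least significant bit,
    # so the list is read from the bottom up.
    return "".join(reversed(remainders))


bits = to_binary(9_838_263_505_978_427_545)
print(bits)
print(len(bits))  # 64
```

Python integers have arbitrary precision, so the 64 bit value is handled exactly. The built-in `bin()` gives the same digits and can serve as a cross-check.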


3. Normalize the binary representation of the number.

Shift the binary point 63 positions to the left, so that only one nonzero digit remains to the left of it:


9 838 263 505 978 427 545(10) =


1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1001 1001(2) =


1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1001 1001(2) × 2^0 =


1.0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0011 001(2) × 2^63
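As a rough check of the normalization, the shift count equals the position of the highest set bit, and the digits after the point are simply the binary digits without the leading 1. The variable names below are chosen for this sketch only.

```python
n = 9_838_263_505_978_427_545

# Number of positions the point moves to the left:
# one less than the number of binary digits.
exponent = n.bit_length() - 1

# bin(n) looks like '0b1...'; dropping the first three characters
# removes the '0b' prefix and the leading 1.
fraction_bits = bin(n)[3:]

print(exponent)              # 63
print("1." + fraction_bits)
```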


4. So far, the following elements feed into the 64 bit double precision IEEE 754 binary floating point representation:

Sign: 0 (a positive number)


Exponent (unadjusted): 63


Mantissa (not normalized):
1.0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0011 001


5. Adjust the exponent.

Use the 11 bit excess/bias notation:


Exponent (adjusted) =


Exponent (unadjusted) + 2^(11-1) - 1 =


63 + 2^(11-1) - 1 =


(63 + 1 023)(10) =


1 086(10)
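The bias arithmetic of step 5 is small enough to verify directly. A minimal sketch, where the constant name `EXPONENT_BITS` is an assumption of this example:

```python
EXPONENT_BITS = 11  # width of the exponent field in a 64 bit double

bias = 2 ** (EXPONENT_BITS - 1) - 1  # 1023
adjusted_exponent = 63 + bias        # unadjusted exponent plus bias

print(bias, adjusted_exponent)       # 1023 1086
```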


6. Convert the adjusted exponent from the decimal (base 10) to 11 bit binary.

Use the same technique of repeatedly dividing by 2:


  • division = quotient + remainder;
  • 1 086 ÷ 2 = 543 + 0;
  • 543 ÷ 2 = 271 + 1;
  • 271 ÷ 2 = 135 + 1;
  • 135 ÷ 2 = 67 + 1;
  • 67 ÷ 2 = 33 + 1;
  • 33 ÷ 2 = 16 + 1;
  • 16 ÷ 2 = 8 + 0;
  • 8 ÷ 2 = 4 + 0;
  • 4 ÷ 2 = 2 + 0;
  • 2 ÷ 2 = 1 + 0;
  • 1 ÷ 2 = 0 + 1;

7. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above.


Exponent (adjusted) =


1 086(10) =


100 0011 1110(2)
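The exponent field is always 11 bits wide, so smaller exponents need zero padding on the left. One possible way to get the padded form in Python is the built-in `format()` with a width; the variable names are illustrative.

```python
adjusted_exponent = 1086

# '011b' means: binary digits, padded with zeros on the left to 11 characters.
exponent_field = format(adjusted_exponent, "011b")

print(exponent_field)  # 10000111110
```

Here 1 086 already needs all 11 bits, so no padding is added.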


8. Normalize the mantissa.

a) Remove the leading (leftmost) bit, since it is always 1, together with the binary point.


b) Adjust its length to 52 bits by removing the excess bits from the right (if any of the excess bits is set to 1, precision is lost). Here 11 bits are removed, 000 1001 1001, so the stored value is 153 less than the original number.


Mantissa (normalized) =


1. 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 000 1001 1001 =


0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001
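The truncation in step 8 can be sketched as a string slice. The sketch also checks whether round to nearest, ties to even (the default IEEE 754 rounding mode) would have given a different result than plain truncation; for this number it would not, because the first removed bit is 0. All names are assumptions of this example.

```python
# The 63 bits after the leading 1, as obtained in step 3.
fraction_bits = "0001" * 14 + "0011001"

mantissa = fraction_bits[:52]  # bits that fit in the mantissa field
dropped = fraction_bits[52:]   # excess bits removed from the right

# Round to nearest, ties to even: round up only if the first dropped bit is 1
# and either a later dropped bit is 1 or the last kept bit is 1.
rounds_up = dropped[0] == "1" and ("1" in dropped[1:] or mantissa[-1] == "1")

print(mantissa)
print(dropped, int(dropped, 2))  # 00010011001 153
print(rounds_up)                 # False
```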


9. The three elements that make up the number's 64 bit double precision IEEE 754 binary floating point representation:

Sign (1 bit) =
0 (a positive number)


Exponent (11 bits) =
100 0011 1110


Mantissa (52 bits) =
0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001


The base ten decimal number 9 838 263 505 978 427 545 converted and written in 64 bit double precision IEEE 754 binary floating point representation:
0 - 100 0011 1110 - 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001
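The result can be cross-checked against the bit pattern Python itself produces, assuming the platform `float` is an IEEE 754 double (which holds for CPython on common hardware). The sketch uses the standard `struct` module to reinterpret the 8 bytes of the double as an unsigned integer.

```python
import struct

value = float(9_838_263_505_978_427_545)

# Pack as a big-endian double, then unpack the same bytes as a 64 bit unsigned integer.
(raw,) = struct.unpack(">Q", struct.pack(">d", value))
bits = format(raw, "064b")

sign, exponent, mantissa = bits[0], bits[1:12], bits[12:]
print(sign, exponent, mantissa)
print(int(value))  # 9838263505978427392
```

The three fields match the ones derived by hand, and the stored value is 153 less than the original number, which is exactly the value of the 11 bits removed in step 8.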
