Convert Decimal 11 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 357 to 64 Bit Double Precision IEEE 754 Binary Floating Point Representation Standard

Convert decimal 11 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 357₍₁₀₎ to 64 bit double precision IEEE 754 binary floating point representation standard (1 bit for sign, 11 bits for exponent, 52 bits for mantissa)

What are the steps to convert the decimal number 11 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 357₍₁₀₎ to 64 bit double precision IEEE 754 binary floating point representation (1 bit for sign, 11 bits for exponent, 52 bits for mantissa)?

1. Divide the number repeatedly by 2.

Keep track of each remainder.

We stop when we get a quotient that is equal to zero.

division = quotient + remainder;
11 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 357 ÷ 2 = 5 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 678 + 1;
5 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 678 ÷ 2 = 2 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 839 + 0;
2 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 839 ÷ 2 = 1 499 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 919 + 1;
1 499 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 919 ÷ 2 = 749 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 959 + 1;
749 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 959 ÷ 2 = 374 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 979 + 1;
374 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 979 ÷ 2 = 187 499 999 999 999 999 999 999 999 999 999 999 999 999 999 999 989 + 1;
187 499 999 999 999 999 999 999 999 999 999 999 999 999 999 999 989 ÷ 2 = 93 749 999 999 999 999 999 999 999 999 999 999 999 999 999 999 994 + 1;
93 749 999 999 999 999 999 999 999 999 999 999 999 999 999 999 994 ÷ 2 = 46 874 999 999 999 999 999 999 999 999 999 999 999 999 999 999 997 + 0;
46 874 999 999 999 999 999 999 999 999 999 999 999 999 999 999 997 ÷ 2 = 23 437 499 999 999 999 999 999 999 999 999 999 999 999 999 999 998 + 1;
23 437 499 999 999 999 999 999 999 999 999 999 999 999 999 999 998 ÷ 2 = 11 718 749 999 999 999 999 999 999 999 999 999 999 999 999 999 999 + 0;
11 718 749 999 999 999 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 5 859 374 999 999 999 999 999 999 999 999 999 999 999 999 999 999 + 1;
5 859 374 999 999 999 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 2 929 687 499 999 999 999 999 999 999 999 999 999 999 999 999 999 + 1;
2 929 687 499 999 999 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 1 464 843 749 999 999 999 999 999 999 999 999 999 999 999 999 999 + 1;
1 464 843 749 999 999 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 732 421 874 999 999 999 999 999 999 999 999 999 999 999 999 999 + 1;
732 421 874 999 999 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 366 210 937 499 999 999 999 999 999 999 999 999 999 999 999 999 + 1;
366 210 937 499 999 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 183 105 468 749 999 999 999 999 999 999 999 999 999 999 999 999 + 1;
183 105 468 749 999 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 91 552 734 374 999 999 999 999 999 999 999 999 999 999 999 999 + 1;
91 552 734 374 999 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 45 776 367 187 499 999 999 999 999 999 999 999 999 999 999 999 + 1;
45 776 367 187 499 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 22 888 183 593 749 999 999 999 999 999 999 999 999 999 999 999 + 1;
22 888 183 593 749 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 11 444 091 796 874 999 999 999 999 999 999 999 999 999 999 999 + 1;
11 444 091 796 874 999 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 5 722 045 898 437 499 999 999 999 999 999 999 999 999 999 999 + 1;
5 722 045 898 437 499 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 2 861 022 949 218 749 999 999 999 999 999 999 999 999 999 999 + 1;
2 861 022 949 218 749 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 1 430 511 474 609 374 999 999 999 999 999 999 999 999 999 999 + 1;
1 430 511 474 609 374 999 999 999 999 999 999 999 999 999 999 ÷ 2 = 715 255 737 304 687 499 999 999 999 999 999 999 999 999 999 + 1;
715 255 737 304 687 499 999 999 999 999 999 999 999 999 999 ÷ 2 = 357 627 868 652 343 749 999 999 999 999 999 999 999 999 999 + 1;
357 627 868 652 343 749 999 999 999 999 999 999 999 999 999 ÷ 2 = 178 813 934 326 171 874 999 999 999 999 999 999 999 999 999 + 1;
178 813 934 326 171 874 999 999 999 999 999 999 999 999 999 ÷ 2 = 89 406 967 163 085 937 499 999 999 999 999 999 999 999 999 + 1;
89 406 967 163 085 937 499 999 999 999 999 999 999 999 999 ÷ 2 = 44 703 483 581 542 968 749 999 999 999 999 999 999 999 999 + 1;
44 703 483 581 542 968 749 999 999 999 999 999 999 999 999 ÷ 2 = 22 351 741 790 771 484 374 999 999 999 999 999 999 999 999 + 1;
22 351 741 790 771 484 374 999 999 999 999 999 999 999 999 ÷ 2 = 11 175 870 895 385 742 187 499 999 999 999 999 999 999 999 + 1;
11 175 870 895 385 742 187 499 999 999 999 999 999 999 999 ÷ 2 = 5 587 935 447 692 871 093 749 999 999 999 999 999 999 999 + 1;
5 587 935 447 692 871 093 749 999 999 999 999 999 999 999 ÷ 2 = 2 793 967 723 846 435 546 874 999 999 999 999 999 999 999 + 1;
2 793 967 723 846 435 546 874 999 999 999 999 999 999 999 ÷ 2 = 1 396 983 861 923 217 773 437 499 999 999 999 999 999 999 + 1;
1 396 983 861 923 217 773 437 499 999 999 999 999 999 999 ÷ 2 = 698 491 930 961 608 886 718 749 999 999 999 999 999 999 + 1;
698 491 930 961 608 886 718 749 999 999 999 999 999 999 ÷ 2 = 349 245 965 480 804 443 359 374 999 999 999 999 999 999 + 1;
349 245 965 480 804 443 359 374 999 999 999 999 999 999 ÷ 2 = 174 622 982 740 402 221 679 687 499 999 999 999 999 999 + 1;
174 622 982 740 402 221 679 687 499 999 999 999 999 999 ÷ 2 = 87 311 491 370 201 110 839 843 749 999 999 999 999 999 + 1;
87 311 491 370 201 110 839 843 749 999 999 999 999 999 ÷ 2 = 43 655 745 685 100 555 419 921 874 999 999 999 999 999 + 1;
43 655 745 685 100 555 419 921 874 999 999 999 999 999 ÷ 2 = 21 827 872 842 550 277 709 960 937 499 999 999 999 999 + 1;
21 827 872 842 550 277 709 960 937 499 999 999 999 999 ÷ 2 = 10 913 936 421 275 138 854 980 468 749 999 999 999 999 + 1;
10 913 936 421 275 138 854 980 468 749 999 999 999 999 ÷ 2 = 5 456 968 210 637 569 427 490 234 374 999 999 999 999 + 1;
5 456 968 210 637 569 427 490 234 374 999 999 999 999 ÷ 2 = 2 728 484 105 318 784 713 745 117 187 499 999 999 999 + 1;
2 728 484 105 318 784 713 745 117 187 499 999 999 999 ÷ 2 = 1 364 242 052 659 392 356 872 558 593 749 999 999 999 + 1;
1 364 242 052 659 392 356 872 558 593 749 999 999 999 ÷ 2 = 682 121 026 329 696 178 436 279 296 874 999 999 999 + 1;
682 121 026 329 696 178 436 279 296 874 999 999 999 ÷ 2 = 341 060 513 164 848 089 218 139 648 437 499 999 999 + 1;
341 060 513 164 848 089 218 139 648 437 499 999 999 ÷ 2 = 170 530 256 582 424 044 609 069 824 218 749 999 999 + 1;
170 530 256 582 424 044 609 069 824 218 749 999 999 ÷ 2 = 85 265 128 291 212 022 304 534 912 109 374 999 999 + 1;
85 265 128 291 212 022 304 534 912 109 374 999 999 ÷ 2 = 42 632 564 145 606 011 152 267 456 054 687 499 999 + 1;
42 632 564 145 606 011 152 267 456 054 687 499 999 ÷ 2 = 21 316 282 072 803 005 576 133 728 027 343 749 999 + 1;
21 316 282 072 803 005 576 133 728 027 343 749 999 ÷ 2 = 10 658 141 036 401 502 788 066 864 013 671 874 999 + 1;
10 658 141 036 401 502 788 066 864 013 671 874 999 ÷ 2 = 5 329 070 518 200 751 394 033 432 006 835 937 499 + 1;
5 329 070 518 200 751 394 033 432 006 835 937 499 ÷ 2 = 2 664 535 259 100 375 697 016 716 003 417 968 749 + 1;
2 664 535 259 100 375 697 016 716 003 417 968 749 ÷ 2 = 1 332 267 629 550 187 848 508 358 001 708 984 374 + 1;
1 332 267 629 550 187 848 508 358 001 708 984 374 ÷ 2 = 666 133 814 775 093 924 254 179 000 854 492 187 + 0;
666 133 814 775 093 924 254 179 000 854 492 187 ÷ 2 = 333 066 907 387 546 962 127 089 500 427 246 093 + 1;
333 066 907 387 546 962 127 089 500 427 246 093 ÷ 2 = 166 533 453 693 773 481 063 544 750 213 623 046 + 1;
166 533 453 693 773 481 063 544 750 213 623 046 ÷ 2 = 83 266 726 846 886 740 531 772 375 106 811 523 + 0;
83 266 726 846 886 740 531 772 375 106 811 523 ÷ 2 = 41 633 363 423 443 370 265 886 187 553 405 761 + 1;
41 633 363 423 443 370 265 886 187 553 405 761 ÷ 2 = 20 816 681 711 721 685 132 943 093 776 702 880 + 1;
20 816 681 711 721 685 132 943 093 776 702 880 ÷ 2 = 10 408 340 855 860 842 566 471 546 888 351 440 + 0;
10 408 340 855 860 842 566 471 546 888 351 440 ÷ 2 = 5 204 170 427 930 421 283 235 773 444 175 720 + 0;
5 204 170 427 930 421 283 235 773 444 175 720 ÷ 2 = 2 602 085 213 965 210 641 617 886 722 087 860 + 0;
2 602 085 213 965 210 641 617 886 722 087 860 ÷ 2 = 1 301 042 606 982 605 320 808 943 361 043 930 + 0;
1 301 042 606 982 605 320 808 943 361 043 930 ÷ 2 = 650 521 303 491 302 660 404 471 680 521 965 + 0;
650 521 303 491 302 660 404 471 680 521 965 ÷ 2 = 325 260 651 745 651 330 202 235 840 260 982 + 1;
325 260 651 745 651 330 202 235 840 260 982 ÷ 2 = 162 630 325 872 825 665 101 117 920 130 491 + 0;
162 630 325 872 825 665 101 117 920 130 491 ÷ 2 = 81 315 162 936 412 832 550 558 960 065 245 + 1;
81 315 162 936 412 832 550 558 960 065 245 ÷ 2 = 40 657 581 468 206 416 275 279 480 032 622 + 1;
40 657 581 468 206 416 275 279 480 032 622 ÷ 2 = 20 328 790 734 103 208 137 639 740 016 311 + 0;
20 328 790 734 103 208 137 639 740 016 311 ÷ 2 = 10 164 395 367 051 604 068 819 870 008 155 + 1;
10 164 395 367 051 604 068 819 870 008 155 ÷ 2 = 5 082 197 683 525 802 034 409 935 004 077 + 1;
5 082 197 683 525 802 034 409 935 004 077 ÷ 2 = 2 541 098 841 762 901 017 204 967 502 038 + 1;
2 541 098 841 762 901 017 204 967 502 038 ÷ 2 = 1 270 549 420 881 450 508 602 483 751 019 + 0;
1 270 549 420 881 450 508 602 483 751 019 ÷ 2 = 635 274 710 440 725 254 301 241 875 509 + 1;
635 274 710 440 725 254 301 241 875 509 ÷ 2 = 317 637 355 220 362 627 150 620 937 754 + 1;
317 637 355 220 362 627 150 620 937 754 ÷ 2 = 158 818 677 610 181 313 575 310 468 877 + 0;
158 818 677 610 181 313 575 310 468 877 ÷ 2 = 79 409 338 805 090 656 787 655 234 438 + 1;
79 409 338 805 090 656 787 655 234 438 ÷ 2 = 39 704 669 402 545 328 393 827 617 219 + 0;
39 704 669 402 545 328 393 827 617 219 ÷ 2 = 19 852 334 701 272 664 196 913 808 609 + 1;
19 852 334 701 272 664 196 913 808 609 ÷ 2 = 9 926 167 350 636 332 098 456 904 304 + 1;
9 926 167 350 636 332 098 456 904 304 ÷ 2 = 4 963 083 675 318 166 049 228 452 152 + 0;
4 963 083 675 318 166 049 228 452 152 ÷ 2 = 2 481 541 837 659 083 024 614 226 076 + 0;
2 481 541 837 659 083 024 614 226 076 ÷ 2 = 1 240 770 918 829 541 512 307 113 038 + 0;
1 240 770 918 829 541 512 307 113 038 ÷ 2 = 620 385 459 414 770 756 153 556 519 + 0;
620 385 459 414 770 756 153 556 519 ÷ 2 = 310 192 729 707 385 378 076 778 259 + 1;
310 192 729 707 385 378 076 778 259 ÷ 2 = 155 096 364 853 692 689 038 389 129 + 1;
155 096 364 853 692 689 038 389 129 ÷ 2 = 77 548 182 426 846 344 519 194 564 + 1;
77 548 182 426 846 344 519 194 564 ÷ 2 = 38 774 091 213 423 172 259 597 282 + 0;
38 774 091 213 423 172 259 597 282 ÷ 2 = 19 387 045 606 711 586 129 798 641 + 0;
19 387 045 606 711 586 129 798 641 ÷ 2 = 9 693 522 803 355 793 064 899 320 + 1;
9 693 522 803 355 793 064 899 320 ÷ 2 = 4 846 761 401 677 896 532 449 660 + 0;
4 846 761 401 677 896 532 449 660 ÷ 2 = 2 423 380 700 838 948 266 224 830 + 0;
2 423 380 700 838 948 266 224 830 ÷ 2 = 1 211 690 350 419 474 133 112 415 + 0;
1 211 690 350 419 474 133 112 415 ÷ 2 = 605 845 175 209 737 066 556 207 + 1;
605 845 175 209 737 066 556 207 ÷ 2 = 302 922 587 604 868 533 278 103 + 1;
302 922 587 604 868 533 278 103 ÷ 2 = 151 461 293 802 434 266 639 051 + 1;
151 461 293 802 434 266 639 051 ÷ 2 = 75 730 646 901 217 133 319 525 + 1;
75 730 646 901 217 133 319 525 ÷ 2 = 37 865 323 450 608 566 659 762 + 1;
37 865 323 450 608 566 659 762 ÷ 2 = 18 932 661 725 304 283 329 881 + 0;
18 932 661 725 304 283 329 881 ÷ 2 = 9 466 330 862 652 141 664 940 + 1;
9 466 330 862 652 141 664 940 ÷ 2 = 4 733 165 431 326 070 832 470 + 0;
4 733 165 431 326 070 832 470 ÷ 2 = 2 366 582 715 663 035 416 235 + 0;
2 366 582 715 663 035 416 235 ÷ 2 = 1 183 291 357 831 517 708 117 + 1;
1 183 291 357 831 517 708 117 ÷ 2 = 591 645 678 915 758 854 058 + 1;
591 645 678 915 758 854 058 ÷ 2 = 295 822 839 457 879 427 029 + 0;
295 822 839 457 879 427 029 ÷ 2 = 147 911 419 728 939 713 514 + 1;
147 911 419 728 939 713 514 ÷ 2 = 73 955 709 864 469 856 757 + 0;
73 955 709 864 469 856 757 ÷ 2 = 36 977 854 932 234 928 378 + 1;
36 977 854 932 234 928 378 ÷ 2 = 18 488 927 466 117 464 189 + 0;
18 488 927 466 117 464 189 ÷ 2 = 9 244 463 733 058 732 094 + 1;
9 244 463 733 058 732 094 ÷ 2 = 4 622 231 866 529 366 047 + 0;
4 622 231 866 529 366 047 ÷ 2 = 2 311 115 933 264 683 023 + 1;
2 311 115 933 264 683 023 ÷ 2 = 1 155 557 966 632 341 511 + 1;
1 155 557 966 632 341 511 ÷ 2 = 577 778 983 316 170 755 + 1;
577 778 983 316 170 755 ÷ 2 = 288 889 491 658 085 377 + 1;
288 889 491 658 085 377 ÷ 2 = 144 444 745 829 042 688 + 1;
144 444 745 829 042 688 ÷ 2 = 72 222 372 914 521 344 + 0;
72 222 372 914 521 344 ÷ 2 = 36 111 186 457 260 672 + 0;
36 111 186 457 260 672 ÷ 2 = 18 055 593 228 630 336 + 0;
18 055 593 228 630 336 ÷ 2 = 9 027 796 614 315 168 + 0;
9 027 796 614 315 168 ÷ 2 = 4 513 898 307 157 584 + 0;
4 513 898 307 157 584 ÷ 2 = 2 256 949 153 578 792 + 0;
2 256 949 153 578 792 ÷ 2 = 1 128 474 576 789 396 + 0;
1 128 474 576 789 396 ÷ 2 = 564 237 288 394 698 + 0;
564 237 288 394 698 ÷ 2 = 282 118 644 197 349 + 0;
282 118 644 197 349 ÷ 2 = 141 059 322 098 674 + 1;
141 059 322 098 674 ÷ 2 = 70 529 661 049 337 + 0;
70 529 661 049 337 ÷ 2 = 35 264 830 524 668 + 1;
35 264 830 524 668 ÷ 2 = 17 632 415 262 334 + 0;
17 632 415 262 334 ÷ 2 = 8 816 207 631 167 + 0;
8 816 207 631 167 ÷ 2 = 4 408 103 815 583 + 1;
4 408 103 815 583 ÷ 2 = 2 204 051 907 791 + 1;
2 204 051 907 791 ÷ 2 = 1 102 025 953 895 + 1;
1 102 025 953 895 ÷ 2 = 551 012 976 947 + 1;
551 012 976 947 ÷ 2 = 275 506 488 473 + 1;
275 506 488 473 ÷ 2 = 137 753 244 236 + 1;
137 753 244 236 ÷ 2 = 68 876 622 118 + 0;
68 876 622 118 ÷ 2 = 34 438 311 059 + 0;
34 438 311 059 ÷ 2 = 17 219 155 529 + 1;
17 219 155 529 ÷ 2 = 8 609 577 764 + 1;
8 609 577 764 ÷ 2 = 4 304 788 882 + 0;
4 304 788 882 ÷ 2 = 2 152 394 441 + 0;
2 152 394 441 ÷ 2 = 1 076 197 220 + 1;
1 076 197 220 ÷ 2 = 538 098 610 + 0;
538 098 610 ÷ 2 = 269 049 305 + 0;
269 049 305 ÷ 2 = 134 524 652 + 1;
134 524 652 ÷ 2 = 67 262 326 + 0;
67 262 326 ÷ 2 = 33 631 163 + 0;
33 631 163 ÷ 2 = 16 815 581 + 1;
16 815 581 ÷ 2 = 8 407 790 + 1;
8 407 790 ÷ 2 = 4 203 895 + 0;
4 203 895 ÷ 2 = 2 101 947 + 1;
2 101 947 ÷ 2 = 1 050 973 + 1;
1 050 973 ÷ 2 = 525 486 + 1;
525 486 ÷ 2 = 262 743 + 0;
262 743 ÷ 2 = 131 371 + 1;
131 371 ÷ 2 = 65 685 + 1;
65 685 ÷ 2 = 32 842 + 1;
32 842 ÷ 2 = 16 421 + 0;
16 421 ÷ 2 = 8 210 + 1;
8 210 ÷ 2 = 4 105 + 0;
4 105 ÷ 2 = 2 052 + 1;
2 052 ÷ 2 = 1 026 + 0;
1 026 ÷ 2 = 513 + 0;
513 ÷ 2 = 256 + 1;
256 ÷ 2 = 128 + 0;
128 ÷ 2 = 64 + 0;
64 ÷ 2 = 32 + 0;
32 ÷ 2 = 16 + 0;
16 ÷ 2 = 8 + 0;
8 ÷ 2 = 4 + 0;
4 ÷ 2 = 2 + 0;
2 ÷ 2 = 1 + 0;
1 ÷ 2 = 0 + 1;

2. Construct the base 2 representation of the positive number.

Take all the remainders starting from the bottom of the list constructed above.

11 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 357₍₁₀₎ =

10 0000 0001 0010 1011 1011 1011 0010 0100 1100 1111 1100 1010 0000 0000 1111 1010 1010 1100 1011 1110 0010 0111 0000 1101 0110 1110 1101 0000 0110 1101 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1101 0111 1101₍₂₎
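
Steps 1 and 2 can be reproduced in a few lines of Python, whose integers have arbitrary precision. This is only a minimal sketch: the helper name to_binary is an assumption made for illustration, and the number is written as 12 × 10^51 - 643, which is the same 53-digit value.

```python
def to_binary(n: int) -> str:
    """Binary digits of a positive integer via repeated division by 2."""
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)          # one division step: quotient and remainder
        remainders.append(str(r))
    # The last remainder is the leftmost bit, so read the list bottom-up.
    return "".join(reversed(remainders))

N = 12 * 10**51 - 643                # 11 999 ... 999 357 (53 decimal digits)
bits = to_binary(N)

print(len(bits))                     # number of division steps performed
print(bits == bin(N)[2:])            # cross-check against Python's built-in
```

The loop performs one division per output bit, which matches the 174 lines of the list above.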

3. Normalize the binary representation of the number.

Shift the binary point 173 positions to the left, so that only one non-zero digit remains to the left of it:

11 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 357₍₁₀₎=

10 0000 0001 0010 1011 1011 1011 0010 0100 1100 1111 1100 1010 0000 0000 1111 1010 1010 1100 1011 1110 0010 0111 0000 1101 0110 1110 1101 0000 0110 1101 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1101 0111 1101₍₂₎=

10 0000 0001 0010 1011 1011 1011 0010 0100 1100 1111 1100 1010 0000 0000 1111 1010 1010 1100 1011 1110 0010 0111 0000 1101 0110 1110 1101 0000 0110 1101 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1101 0111 1101₍₂₎ × 2⁰=

1.0000 0000 1001 0101 1101 1101 1001 0010 0110 0111 1110 0101 0000 0000 0111 1101 0101 0110 0101 1111 0001 0011 1000 0110 1011 0111 0110 1000 0011 0110 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1110 1011 1110 1₍₂₎ × 2¹⁷³
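
For a positive integer the shift count is simply the bit length minus one. The short sketch below checks that for this number; the variable names are illustrative assumptions.

```python
N = 12 * 10**51 - 643            # the number being converted (53 decimal digits)

exponent = N.bit_length() - 1    # how far the binary point moves to the left
tail = bin(N)[3:]                # skip the '0b' prefix and the leading 1

print(exponent)                  # 173
print(len(tail))                 # 173 bits remain to the right of the point
```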

4. Up to this moment, there are the following elements that would feed into the 64 bit double precision IEEE 754 binary floating point representation:

Sign: 0 (a positive number)

Exponent (unadjusted): 173

Mantissa (not normalized):
1.0000 0000 1001 0101 1101 1101 1001 0010 0110 0111 1110 0101 0000 0000 0111 1101 0101 0110 0101 1111 0001 0011 1000 0110 1011 0111 0110 1000 0011 0110 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1110 1011 1110 1

5. Adjust the exponent.

Use the 11 bit excess/bias notation:

Exponent (adjusted) =

Exponent (unadjusted) + 2^(11-1) - 1 =

173 + 2^(11-1) - 1 =

(173 + 1 023)₍₁₀₎ =

1 196₍₁₀₎

6. Convert the adjusted exponent from the decimal (base 10) to 11 bit binary.

Use the same technique of repeatedly dividing by 2:

division = quotient + remainder;
1 196 ÷ 2 = 598 + 0;
598 ÷ 2 = 299 + 0;
299 ÷ 2 = 149 + 1;
149 ÷ 2 = 74 + 1;
74 ÷ 2 = 37 + 0;
37 ÷ 2 = 18 + 1;
18 ÷ 2 = 9 + 0;
9 ÷ 2 = 4 + 1;
4 ÷ 2 = 2 + 0;
2 ÷ 2 = 1 + 0;
1 ÷ 2 = 0 + 1;

7. Construct the base 2 representation of the adjusted exponent.

Take all the remainders starting from the bottom of the list constructed above.

Exponent (adjusted) =

1196₍₁₀₎ =

100 1010 1100₍₂₎
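
Steps 5 to 7 amount to adding the bias and printing the sum as an 11 bit binary string. A minimal sketch, with EXPONENT_BITS as an assumed constant name:

```python
EXPONENT_BITS = 11                               # width of the exponent field

bias = 2 ** (EXPONENT_BITS - 1) - 1              # 1023 for double precision
adjusted = 173 + bias                            # unadjusted exponent from step 4
exponent_field = format(adjusted, f"0{EXPONENT_BITS}b")  # zero-padded to 11 bits

print(bias, adjusted, exponent_field)            # 1023 1196 10010101100
```

Setting EXPONENT_BITS to 8 gives the bias 127 used by the 32 bit single precision format.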

8. Normalize the mantissa.

a) Remove the leading (leftmost) bit, since it is always 1, together with the binary point.

b) Adjust its length to 52 bits by removing the excess bits from the right (if any of the excess bits is set to 1, precision is lost).

Mantissa (normalized) =

1. 0000 0000 1001 0101 1101 1101 1001 0010 0110 0111 1110 0101 0000 | 0 0000 1111 1010 1010 1100 1011 1110 0010 0111 0000 1101 0110 1110 1101 0000 0110 1101 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1101 0111 1101 (the leading 1 and the 121 bits after the bar are removed) =

0000 0000 1001 0101 1101 1101 1001 0010 0110 0111 1110 0101 0000

The first removed bit is 0, so for this number cutting the excess bits gives the same result as rounding to nearest, the default IEEE 754 rounding mode.
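
The split between kept and removed bits can be checked with string slicing. This is a sketch under the same assumption as before, namely that the number equals 12 × 10^51 - 643; the variable names are illustrative.

```python
N = 12 * 10**51 - 643

tail = bin(N)[3:]          # bits after the implicit leading 1
mantissa = tail[:52]       # the 52 bits that fit in the mantissa field
discarded = tail[52:]      # the excess bits that are cut off

print(mantissa)
print(len(discarded))      # how many bits are lost
print(discarded[0])        # first lost bit: decides round-to-nearest
```

When the first discarded bit is 0, truncation and round-to-nearest agree; when it is 1, a correctly rounding converter may produce a mantissa one unit larger than the truncated one.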

9. The three elements that make up the number's 64 bit double precision IEEE 754 binary floating point representation:

Sign (1 bit) =
0 (a positive number)

Exponent (11 bits) =
100 1010 1100

Mantissa (52 bits) =
0000 0000 1001 0101 1101 1101 1001 0010 0110 0111 1110 0101 0000

Decimal number 11 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 999 357 converted to 64 bit double precision IEEE 754 binary floating point representation:

0 - 100 1010 1100 - 0000 0000 1001 0101 1101 1101 1001 0010 0110 0111 1110 0101 0000
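
The result can be compared with the bits a machine actually stores. The sketch below relies on Python's standard struct module and on float() rounding an integer to the nearest double; the variable names are assumptions for illustration.

```python
import struct

N = 12 * 10**51 - 643

# float(N) rounds N to the nearest double; struct exposes its 64 raw bits.
raw = struct.unpack(">Q", struct.pack(">d", float(N)))[0]
bits = format(raw, "064b")

sign, exponent, mantissa = bits[0], bits[1:12], bits[12:]
print(sign, exponent, mantissa)
```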

How to convert numbers from the decimal system (base ten) to 64 bit double precision IEEE 754 binary floating point standard

Follow the steps below to convert a base 10 decimal number to 64 bit double precision IEEE 754 binary floating point:

1. If the number to be converted is negative, start with its positive version.
2. First convert the integer part. Repeatedly divide the positive integer by 2, keeping track of each remainder, until the quotient is zero.
3. Construct the base 2 representation of the integer part by taking the remainders of the previous operations, starting from the bottom of the list. Thus, the last remainder becomes the first (leftmost) symbol of the base 2 number, while the first remainder becomes the last (rightmost) symbol.
4. Then convert the fractional part. Multiply it repeatedly by 2, keeping track of the integer part of each result, until the fractional part is zero or until enough bits have been produced to fill the mantissa (many decimal fractions never terminate in base 2).
5. Construct the base 2 representation of the fractional part by taking the integer parts of the multiplications, starting from the top of the list (they appear in the binary representation, from left to right, in the order they were calculated).
6. Normalize the binary representation of the number, shifting the binary point "n" positions to the left or to the right so that only one non-zero digit remains to its left.
7. Adjust the exponent in 11 bit excess/bias notation and then convert it from decimal (base 10) to 11 bit binary, using the same technique of repeatedly dividing by 2:
Exponent (adjusted) = Exponent (unadjusted) + 2^(11-1) - 1
8. Normalize the mantissa: remove the leading (leftmost) bit, since it is always '1' (and the binary point), and adjust its length to 52 bits, either by removing the excess bits from the right (losing precision) or by padding with '0' bits on the right.
9. The sign (1 bit) is 1 for a negative number and 0 for a positive one.
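
The repeated multiplication in steps 4 and 5 can be sketched as follows. The helper name fraction_bits and its limit parameter are assumptions made for illustration; Fraction from the standard library keeps the decimal value exact, so no floating point error creeps into the steps.

```python
from fractions import Fraction

def fraction_bits(text: str, limit: int = 53) -> str:
    """Binary digits of a decimal fraction in [0, 1), at most `limit` of them."""
    frac = Fraction(text)              # exact value of the decimal string
    bits = []
    while frac != 0 and len(bits) < limit:
        frac *= 2
        bit = int(frac)                # integer part of the product: 0 or 1
        bits.append(str(bit))
        frac -= bit                    # keep only the fractional part
    return "".join(bits)

print(fraction_bits("0.625"))          # 101, since 0.625 = 1/2 + 1/8
print(fraction_bits("0.1", limit=12))  # does not terminate, cut after 12 bits
```

The limit is what stops the loop for fractions such as 0.1, whose binary expansion repeats forever.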

Example: convert the negative number -31.640 215 from the decimal system (base ten) to 64 bit double precision IEEE 754 binary floating point:

1. Start with the positive version of the number:
|-31.640 215| = 31.640 215
2. First convert the integer part, 31. Divide it repeatedly by 2, keeping track of each remainder, until we get a quotient that is equal to zero:
- division = quotient + remainder;
- 31 ÷ 2 = 15 + 1;
- 15 ÷ 2 = 7 + 1;
- 7 ÷ 2 = 3 + 1;
- 3 ÷ 2 = 1 + 1;
- 1 ÷ 2 = 0 + 1;
- We have encountered a quotient that is ZERO => FULL STOP
3. Construct the base 2 representation of the integer part of the number by taking all the remainders of the previous dividing operations, starting from the bottom of the list constructed above:
31₍₁₀₎ = 1 1111₍₂₎
4. Then, convert the fractional part, 0.640 215. Multiply repeatedly by 2, keeping track of each integer part of the results, until we get a fractional part that is equal to zero:
- #) multiplying = integer + fractional part;
- 1) 0.640 215 × 2 = 1 + 0.280 43;
- 2) 0.280 43 × 2 = 0 + 0.560 86;
- 3) 0.560 86 × 2 = 1 + 0.121 72;
- 4) 0.121 72 × 2 = 0 + 0.243 44;
- 5) 0.243 44 × 2 = 0 + 0.486 88;
- 6) 0.486 88 × 2 = 0 + 0.973 76;
- 7) 0.973 76 × 2 = 1 + 0.947 52;
- 8) 0.947 52 × 2 = 1 + 0.895 04;
- 9) 0.895 04 × 2 = 1 + 0.790 08;
- 10) 0.790 08 × 2 = 1 + 0.580 16;
- 11) 0.580 16 × 2 = 1 + 0.160 32;
- 12) 0.160 32 × 2 = 0 + 0.320 64;
- 13) 0.320 64 × 2 = 0 + 0.641 28;
- 14) 0.641 28 × 2 = 1 + 0.282 56;
- 15) 0.282 56 × 2 = 0 + 0.565 12;
- 16) 0.565 12 × 2 = 1 + 0.130 24;
- 17) 0.130 24 × 2 = 0 + 0.260 48;
- 18) 0.260 48 × 2 = 0 + 0.520 96;
- 19) 0.520 96 × 2 = 1 + 0.041 92;
- 20) 0.041 92 × 2 = 0 + 0.083 84;
- 21) 0.083 84 × 2 = 0 + 0.167 68;
- 22) 0.167 68 × 2 = 0 + 0.335 36;
- 23) 0.335 36 × 2 = 0 + 0.670 72;
- 24) 0.670 72 × 2 = 1 + 0.341 44;
- 25) 0.341 44 × 2 = 0 + 0.682 88;
- 26) 0.682 88 × 2 = 1 + 0.365 76;
- 27) 0.365 76 × 2 = 0 + 0.731 52;
- 28) 0.731 52 × 2 = 1 + 0.463 04;
- 29) 0.463 04 × 2 = 0 + 0.926 08;
- 30) 0.926 08 × 2 = 1 + 0.852 16;
- 31) 0.852 16 × 2 = 1 + 0.704 32;
- 32) 0.704 32 × 2 = 1 + 0.408 64;
- 33) 0.408 64 × 2 = 0 + 0.817 28;
- 34) 0.817 28 × 2 = 1 + 0.634 56;
- 35) 0.634 56 × 2 = 1 + 0.269 12;
- 36) 0.269 12 × 2 = 0 + 0.538 24;
- 37) 0.538 24 × 2 = 1 + 0.076 48;
- 38) 0.076 48 × 2 = 0 + 0.152 96;
- 39) 0.152 96 × 2 = 0 + 0.305 92;
- 40) 0.305 92 × 2 = 0 + 0.611 84;
- 41) 0.611 84 × 2 = 1 + 0.223 68;
- 42) 0.223 68 × 2 = 0 + 0.447 36;
- 43) 0.447 36 × 2 = 0 + 0.894 72;
- 44) 0.894 72 × 2 = 1 + 0.789 44;
- 45) 0.789 44 × 2 = 1 + 0.578 88;
- 46) 0.578 88 × 2 = 1 + 0.157 76;
- 47) 0.157 76 × 2 = 0 + 0.315 52;
- 48) 0.315 52 × 2 = 0 + 0.631 04;
- 49) 0.631 04 × 2 = 1 + 0.262 08;
- 50) 0.262 08 × 2 = 0 + 0.524 16;
- 51) 0.524 16 × 2 = 1 + 0.048 32;
- 52) 0.048 32 × 2 = 0 + 0.096 64;
- 53) 0.096 64 × 2 = 0 + 0.193 28;
- The fractional part never became zero, but we already have more bits than the 52 bit mantissa can hold and at least one of them is non-zero => FULL STOP (precision is lost).
5. Construct the base 2 representation of the fractional part of the number, by taking all the integer parts of the previous multiplying operations, starting from the top of the constructed list above:
0.640 215₍₁₀₎ = 0.1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0₍₂₎
6. Summarizing - the positive number before normalization:
31.640 215₍₁₀₎ = 1 1111.1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0₍₂₎
7. Normalize the binary representation of the number, shifting the binary point 4 positions to the left so that only one non-zero digit stays to its left:
31.640 215₍₁₀₎ =
1 1111.1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0₍₂₎ =
1 1111.1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0₍₂₎ × 2⁰ =
1.1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0₍₂₎ × 2⁴
8. Up to this moment, there are the following elements that would feed into the 64 bit double precision IEEE 754 binary floating point representation:
Sign: 1 (a negative number)
Exponent (unadjusted): 4
Mantissa (not-normalized): 1.1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0
9. Adjust the exponent in 11 bit excess/bias notation and then convert it from decimal (base 10) to 11 bit binary (base 2), by using the same technique of repeatedly dividing it by 2, as shown above:
Exponent (adjusted) = Exponent (unadjusted) + 2^(11-1) - 1 = (4 + 1023)₍₁₀₎ = 1027₍₁₀₎ =
100 0000 0011₍₂₎
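10. Normalize the mantissa: remove the leading (leftmost) bit, since it is always '1' (and the binary point), and adjust its length to 52 bits by removing the excess bits from the right (losing precision). Note that this step truncates: the removed bits, 1010 0, amount to more than half a unit in the last kept place, so IEEE 754's default round-to-nearest mode would instead round the last kept bit up, giving a mantissa that ends in 1101. The truncated value is the one shown here:
Mantissa (not-normalized): 1.1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100 1010 0
Mantissa (normalized): 1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100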
Conclusion:
Sign (1 bit) = 1 (a negative number)
Exponent (11 bits) = 100 0000 0011
Mantissa (52 bits) = 1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100
Number -31.640 215, converted from decimal system (base 10) to 64 bit double precision IEEE 754 binary floating point =
1 - 100 0000 0011 - 1111 1010 0011 1110 0101 0010 0001 0101 0111 0110 1000 1001 1100
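
The whole procedure can be condensed into one function and compared with the bits a machine stores. This is a sketch under stated assumptions, not a reference implementation: the names truncated_fields and hardware_fields are hypothetical, only non-zero values in the normal range are handled, and the excess mantissa bits are cut off as in the steps above rather than rounded.

```python
import struct
from fractions import Fraction

def truncated_fields(text: str):
    """Sign, exponent and mantissa bit strings, cutting off excess bits."""
    value = Fraction(text)                       # exact decimal value
    sign = "1" if value < 0 else "0"
    value = abs(value)
    # Find e such that 2**e <= value < 2**(e + 1).
    e = value.numerator.bit_length() - value.denominator.bit_length()
    if Fraction(2) ** e > value:
        e -= 1
    scaled = value / Fraction(2) ** e            # now 1 <= scaled < 2
    mantissa = int((scaled - 1) * 2**52)         # int() drops the excess bits
    return sign, format(e + 1023, "011b"), format(mantissa, "052b")

def hardware_fields(x: float):
    """The same three fields, read from the stored double."""
    raw = struct.unpack(">Q", struct.pack(">d", x))[0]
    bits = format(raw, "064b")
    return bits[0], bits[1:12], bits[12:]

doc = truncated_fields("-31.640215")
hw = hardware_fields(-31.640215)
print(doc)
print(hw)
print(int(hw[2], 2) - int(doc[2], 2))   # difference in the last mantissa bit
```

Sign and exponent agree between the two results. The mantissas differ by one unit in the last place, because the stored double is rounded to nearest while the manual procedure truncates.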
