0 - 011 1000 0001 - 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 1000 64 Bit Double Precision IEEE 754 Binary Floating Point Representation Standard Converted to Decimal
What are the steps to convert
0 - 011 1000 0001 - 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 1000, a 64 bit double precision IEEE 754 binary floating point representation, to a decimal number?
1. Identify the elements that make up the binary representation of the number:
The first bit (the leftmost) indicates the sign,
1 = negative, 0 = positive.
0
The next 11 bits contain the exponent:
011 1000 0001
The last 52 bits contain the mantissa:
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 1000
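The field split in step 1 can be sketched in Python as follows. This is only an illustration: the function name split_fields is hypothetical, and the sketch assumes the input is a string of 64 binary digits in which spaces and dashes are just visual separators.

```python
def split_fields(bits: str) -> tuple[str, str, str]:
    """Split a 64-bit pattern into sign, exponent and mantissa strings."""
    # Keep only the binary digits; spaces and dashes are for readability.
    digits = "".join(ch for ch in bits if ch in "01")
    if len(digits) != 64:
        raise ValueError(f"expected 64 bits, got {len(digits)}")
    # 1 sign bit, then 11 exponent bits, then 52 mantissa bits.
    return digits[0], digits[1:12], digits[12:]


PATTERN = ("0 - 011 1000 0001 - "
           "0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 1000")
sign, exponent, mantissa = split_fields(PATTERN)
print(sign)
print(exponent)
print(mantissa)
```

The slices 0, 1 to 11 and 12 to 63 correspond to the 1 + 11 + 52 bit layout described above.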
2. Convert the exponent from binary (base 2) to decimal (base 10).
The stored exponent is always a non-negative integer.
011 1000 0001(2) =
0 × 2^10 + 1 × 2^9 + 1 × 2^8 + 1 × 2^7 + 0 × 2^6 + 0 × 2^5 + 0 × 2^4 + 0 × 2^3 + 0 × 2^2 + 0 × 2^1 + 1 × 2^0 =
0 + 512 + 256 + 128 + 0 + 0 + 0 + 0 + 0 + 0 + 1 =
512 + 256 + 128 + 1 =
897(10)
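The positional sum in step 2 can be checked with a short Python sketch. The variable names are chosen here for illustration; the sketch assumes the 11 exponent bits are given as a string without spaces.

```python
exponent_bits = "01110000001"  # 011 1000 0001 without the spaces

# Each bit is multiplied by its power of two; the leftmost bit has weight 2^10.
width = len(exponent_bits)
terms = [int(bit) * 2 ** (width - 1 - i) for i, bit in enumerate(exponent_bits)]
biased_exponent = sum(terms)

# Python's built-in base-2 parser should agree with the manual sum.
assert biased_exponent == int(exponent_bits, 2)
print(terms)
print(biased_exponent)
```

The list of terms mirrors the line 0 + 512 + 256 + 128 + ... + 1 above, and the total is the same 897.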
3. Adjust the exponent.
Subtract the bias: 2^(11 - 1) - 1 = 1023,
which comes from the 11 bit excess/bias notation.
The exponent, adjusted = 897 - 1023 = -126
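As a small sketch of step 3, the bias and the adjusted exponent can be computed as below. The helper name unbias is hypothetical. The sketch assumes a normal number: the stored values 0 and 2047 are reserved in IEEE 754 (zero and subnormals, infinity and NaN) and follow different rules, so it rejects them.

```python
EXPONENT_BITS = 11
BIAS = 2 ** (EXPONENT_BITS - 1) - 1  # 1023 for double precision


def unbias(biased: int) -> int:
    """Return the true exponent of a normal double from its stored field."""
    # Stored values 0 and 2047 are reserved and do not use this formula.
    if not 1 <= biased <= 2 ** EXPONENT_BITS - 2:
        raise ValueError("not a normal number; the formula differs")
    return biased - BIAS


adjusted = unbias(897)
print(BIAS, adjusted)
```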
4. Convert the mantissa from binary (base 2) to decimal (base 10).
The mantissa represents the fractional part of the number (what comes after the whole part of the number, separated from it by the radix point).
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 1000(2) =
0 × 2^-1 + 0 × 2^-2 + 0 × 2^-3 + 0 × 2^-4 + 0 × 2^-5 + 0 × 2^-6 + 0 × 2^-7 + 0 × 2^-8 + 0 × 2^-9 + 0 × 2^-10 + 0 × 2^-11 + 0 × 2^-12 + 0 × 2^-13 + 0 × 2^-14 + 0 × 2^-15 + 0 × 2^-16 + 0 × 2^-17 + 0 × 2^-18 + 0 × 2^-19 + 0 × 2^-20 + 0 × 2^-21 + 0 × 2^-22 + 0 × 2^-23 + 0 × 2^-24 + 0 × 2^-25 + 0 × 2^-26 + 0 × 2^-27 + 0 × 2^-28 + 0 × 2^-29 + 0 × 2^-30 + 0 × 2^-31 + 0 × 2^-32 + 0 × 2^-33 + 0 × 2^-34 + 0 × 2^-35 + 0 × 2^-36 + 0 × 2^-37 + 0 × 2^-38 + 0 × 2^-39 + 0 × 2^-40 + 0 × 2^-41 + 0 × 2^-42 + 0 × 2^-43 + 0 × 2^-44 + 0 × 2^-45 + 1 × 2^-46 + 0 × 2^-47 + 0 × 2^-48 + 1 × 2^-49 + 0 × 2^-50 + 0 × 2^-51 + 0 × 2^-52 =
0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0.000 000 000 000 014 210 854 715 202 003 717 422 485 351 562 5 + 0 + 0 + 0.000 000 000 000 001 776 356 839 400 250 464 677 810 668 945 312 5 + 0 + 0 + 0 =
0.000 000 000 000 014 210 854 715 202 003 717 422 485 351 562 5 + 0.000 000 000 000 001 776 356 839 400 250 464 677 810 668 945 312 5 =
0.000 000 000 000 015 987 211 554 602 254 182 100 296 020 507 812 5(10)
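The mantissa sum in step 4 can be reproduced exactly with rational arithmetic. The sketch below uses Python's fractions module so that no rounding occurs; the variable names are illustrative, and the mantissa is assumed to be given as the 52-bit string from step 1.

```python
from fractions import Fraction

mantissa_bits = (
    "0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 1000"
).replace(" ", "")

# Bit k, counting from 1 at the left, contributes 2^-k when it is set.
set_positions = [k for k, bit in enumerate(mantissa_bits, start=1) if bit == "1"]
fraction = sum(Fraction(1, 2 ** k) for k in set_positions)

print(set_positions)
print(fraction)
print(float(fraction))
```

Only the bits at positions 46 and 49 are set, so the sum reduces to 2^-46 + 2^-49, the two non-zero terms shown above.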
5. Put all the numbers into the expression to calculate the double precision floating point decimal value:
(-1)^Sign × (1 + Mantissa) × 2^(Adjusted exponent) =
(-1)^0 × (1 + 0.000 000 000 000 015 987 211 554 602 254 182 100 296 020 507 812 5) × 2^-126 =
1.000 000 000 000 015 987 211 554 602 254 182 100 296 020 507 812 5 × 2^-126 = ...
= 0.000 000 000 000 000 000 000 000 000 000 000 000 011 754 943 508 223 063 008 456 043 729 728 826 034 657 002 156 290 479 408 618 149 449 020 748 043 277 189 256 545 069 407 603 344 019 408 453 391 406 510 490 924 119 949 340 820 312 5
0 - 011 1000 0001 - 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0100 1000, a 64 bit double precision IEEE 754 binary floating point representation, converted to a decimal number written in base ten (double) = 0.000 000 000 000 000 000 000 000 000 000 000 000 011 754 943 508 223 063 008 456 043 729 728 826 034 657 002 156 290 479 408 618 149 449 020 748 043 277 189 256 545 069 407 603 344 019 408 453 391 406 510 490 924 119 949 340 820 312 5(10)
Spaces were used to group digits: for binary, by 4, for decimal, by 3.
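The whole conversion can be cross-checked in Python. The sketch below is one possible check, not the method used to produce the figures above: it lets the struct module interpret the 64 bits as a big-endian double and compares that with the formula (-1)^Sign × (1 + Mantissa) × 2^(Adjusted exponent), evaluated exactly with fractions. The variable names are illustrative, and the formula assumes a normal number.

```python
import struct
from decimal import Decimal
from fractions import Fraction

# Sign bit, 11 exponent bits, 52 mantissa bits (45 zeros, then 1001000).
BITS = "0" + "01110000001" + "0" * 45 + "1001000"

# Interpret the 64 bits directly as a big-endian IEEE 754 double.
(value,) = struct.unpack(">d", int(BITS, 2).to_bytes(8, "big"))

# Rebuild the same value from the formula for normal numbers.
sign = int(BITS[0])
exponent = int(BITS[1:12], 2) - 1023
mantissa = Fraction(int(BITS[12:], 2), 2 ** 52)
by_formula = (-1) ** sign * (1 + mantissa) * Fraction(2) ** exponent

# Both routes should give exactly the same rational number.
assert Fraction(value) == by_formula
print(value)
print(Decimal(value))  # exact decimal expansion of the stored double
```

Decimal(value) prints the exact expansion in scientific notation, which can be compared digit by digit with the result above once the grouping spaces are removed.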