# Convert 0.358 335 from Decimal (Base 10) to 64-Bit Double-Precision IEEE 754 Binary Floating Point

## The number 0.358 335 (base 10) converted and written in 64-bit double-precision IEEE 754 binary floating-point representation (1 bit for the sign, 11 bits for the exponent, 52 bits for the mantissa)

### 1. First, convert the integer part, 0, to binary (base 2). Divide the number repeatedly by 2, keeping track of each remainder.

#### We stop when we get a quotient that is equal to zero.

• division = quotient + remainder;
• 0 ÷ 2 = 0 + 0;
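
The repeated division above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `divide_by_two_steps` and `to_binary` are hypothetical and are not part of the original walkthrough.

```python
def divide_by_two_steps(n: int) -> list[tuple[int, int, int]]:
    """Return (dividend, quotient, remainder) for each division by 2.

    Hypothetical helper: divides until the quotient is zero, as in the text.
    """
    steps = []
    while True:
        quotient, remainder = divmod(n, 2)
        steps.append((n, quotient, remainder))
        if quotient == 0:
            break
        n = quotient
    return steps


def to_binary(n: int) -> str:
    """Read the remainders from the last division to the first."""
    return "".join(str(r) for _, _, r in reversed(divide_by_two_steps(n)))


# Reproduce the single step needed for the integer part 0.
for dividend, quotient, remainder in divide_by_two_steps(0):
    print(f"{dividend} ÷ 2 = {quotient} + {remainder};")
print(to_binary(0))  # integer part in base 2
```

For the integer part 0, the loop runs once and the base-2 result is simply `0`.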

### 3. Convert the fractional part, 0.358 335, to binary (base 2). Multiply it repeatedly by 2, keeping track of the integer part of each result.

#### Stop when the fractional part equals zero. Here it never does, so we stop after 54 steps, once there are enough bits to fill the 52-bit mantissa.

• #) multiplication = integer part + fractional part;
• 1) 0.358 335 × 2 = 0 + 0.716 67;
• 2) 0.716 67 × 2 = 1 + 0.433 34;
• 3) 0.433 34 × 2 = 0 + 0.866 68;
• 4) 0.866 68 × 2 = 1 + 0.733 36;
• 5) 0.733 36 × 2 = 1 + 0.466 72;
• 6) 0.466 72 × 2 = 0 + 0.933 44;
• 7) 0.933 44 × 2 = 1 + 0.866 88;
• 8) 0.866 88 × 2 = 1 + 0.733 76;
• 9) 0.733 76 × 2 = 1 + 0.467 52;
• 10) 0.467 52 × 2 = 0 + 0.935 04;
• 11) 0.935 04 × 2 = 1 + 0.870 08;
• 12) 0.870 08 × 2 = 1 + 0.740 16;
• 13) 0.740 16 × 2 = 1 + 0.480 32;
• 14) 0.480 32 × 2 = 0 + 0.960 64;
• 15) 0.960 64 × 2 = 1 + 0.921 28;
• 16) 0.921 28 × 2 = 1 + 0.842 56;
• 17) 0.842 56 × 2 = 1 + 0.685 12;
• 18) 0.685 12 × 2 = 1 + 0.370 24;
• 19) 0.370 24 × 2 = 0 + 0.740 48;
• 20) 0.740 48 × 2 = 1 + 0.480 96;
• 21) 0.480 96 × 2 = 0 + 0.961 92;
• 22) 0.961 92 × 2 = 1 + 0.923 84;
• 23) 0.923 84 × 2 = 1 + 0.847 68;
• 24) 0.847 68 × 2 = 1 + 0.695 36;
• 25) 0.695 36 × 2 = 1 + 0.390 72;
• 26) 0.390 72 × 2 = 0 + 0.781 44;
• 27) 0.781 44 × 2 = 1 + 0.562 88;
• 28) 0.562 88 × 2 = 1 + 0.125 76;
• 29) 0.125 76 × 2 = 0 + 0.251 52;
• 30) 0.251 52 × 2 = 0 + 0.503 04;
• 31) 0.503 04 × 2 = 1 + 0.006 08;
• 32) 0.006 08 × 2 = 0 + 0.012 16;
• 33) 0.012 16 × 2 = 0 + 0.024 32;
• 34) 0.024 32 × 2 = 0 + 0.048 64;
• 35) 0.048 64 × 2 = 0 + 0.097 28;
• 36) 0.097 28 × 2 = 0 + 0.194 56;
• 37) 0.194 56 × 2 = 0 + 0.389 12;
• 38) 0.389 12 × 2 = 0 + 0.778 24;
• 39) 0.778 24 × 2 = 1 + 0.556 48;
• 40) 0.556 48 × 2 = 1 + 0.112 96;
• 41) 0.112 96 × 2 = 0 + 0.225 92;
• 42) 0.225 92 × 2 = 0 + 0.451 84;
• 43) 0.451 84 × 2 = 0 + 0.903 68;
• 44) 0.903 68 × 2 = 1 + 0.807 36;
• 45) 0.807 36 × 2 = 1 + 0.614 72;
• 46) 0.614 72 × 2 = 1 + 0.229 44;
• 47) 0.229 44 × 2 = 0 + 0.458 88;
• 48) 0.458 88 × 2 = 0 + 0.917 76;
• 49) 0.917 76 × 2 = 1 + 0.835 52;
• 50) 0.835 52 × 2 = 1 + 0.671 04;
• 51) 0.671 04 × 2 = 1 + 0.342 08;
• 52) 0.342 08 × 2 = 0 + 0.684 16;
• 53) 0.684 16 × 2 = 1 + 0.368 32;
• 54) 0.368 32 × 2 = 0 + 0.736 64;
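
The multiplication steps above can be reproduced with the following sketch. It assumes exact rational arithmetic through `fractions.Fraction`, so that the decimal input is not rounded before the conversion starts. The function name `fractional_bits` and the 54-step cap are illustrative choices that match the list above, not something prescribed by the standard.

```python
from fractions import Fraction


def fractional_bits(value: Fraction, max_steps: int) -> str:
    """Multiply the fractional part by 2 repeatedly and collect integer parts.

    Stops early if the fractional part becomes zero, otherwise after
    max_steps multiplications (hypothetical cap chosen by the caller).
    """
    bits = []
    for _ in range(max_steps):
        if value == 0:
            break
        value *= 2
        integer_part = int(value)  # 0 or 1, because 0 <= value < 2 here
        bits.append(str(integer_part))
        value -= integer_part      # keep only the fractional part
    return "".join(bits)


# 0.358 335 written exactly as a fraction, then 54 multiplication steps.
bits = fractional_bits(Fraction(358335, 1000000), 54)
print(bits)
print(len(bits))
```

Each character of `bits` corresponds to the integer part of one numbered step, read from step 1 to step 54.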

### 9. Convert the adjusted exponent from decimal (base 10) to 11-bit binary.

#### Use the same technique of repeatedly dividing by 2, here applied to the adjusted exponent, 1 021:

• division = quotient + remainder;
• 1 021 ÷ 2 = 510 + 1;
• 510 ÷ 2 = 255 + 0;
• 255 ÷ 2 = 127 + 1;
• 127 ÷ 2 = 63 + 1;
• 63 ÷ 2 = 31 + 1;
• 31 ÷ 2 = 15 + 1;
• 15 ÷ 2 = 7 + 1;
• 7 ÷ 2 = 3 + 1;
• 3 ÷ 2 = 1 + 1;
• 1 ÷ 2 = 0 + 1;
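
As a cross-check on the divisions above, the sketch below converts 1 021 to binary, pads the result with leading zeros to 11 bits, and compares it with the exponent field that Python actually stores for the double `0.358335`. The helper names `exponent_bits` and `stored_exponent_field` are hypothetical. The standard-library `struct` module is used here to read the raw 64-bit pattern.

```python
import struct


def exponent_bits(adjusted_exponent: int, width: int = 11) -> str:
    """Divide by 2 repeatedly, read remainders in reverse, pad to `width` bits."""
    remainders = []
    n = adjusted_exponent
    while True:
        n, remainder = divmod(n, 2)
        remainders.append(str(remainder))
        if n == 0:
            break
    return "".join(reversed(remainders)).rjust(width, "0")


def stored_exponent_field(x: float) -> int:
    """Extract the 11-bit biased exponent from the 64-bit pattern of x."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    return (raw >> 52) & 0x7FF  # drop the 52 mantissa bits, mask off the sign


print(exponent_bits(1021))              # 01111111101
print(stored_exponent_field(0.358335))  # 1021
```

The ten divisions yield ten remainders, so one leading zero is needed to fill the 11-bit exponent field.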