[PLUG] integer-decode-float equivalent in C (gcc bug at -O2 and higher?)

Russell Senior seniorr at aracnet.com
Fri Feb 22 09:36:16 UTC 2008


In Common Lisp and Scheme you can usually rely on print/read
consistency for floating point values.  That is, any float can be
printed to a string and that string read back to give a float equal
to the original.

   Printing Floating-Point Numbers Quickly and Accurately
   http://www.cs.indiana.edu/~burger/FP-Printing-PLDI96.pdf

   How to Read Floating Point Numbers Accurately
   ftp://ftp.ccs.neu.edu/pub/people/will/howtoread.ps
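
For what it's worth, C can do the same round trip if you print enough
digits.  A quick sketch (mine, not from either paper), assuming IEEE
754 doubles and a correctly rounding strtod:

   #include <stdio.h>
   #include <stdlib.h>

   int
   main(void)
   {
     double x = 1.0 / 3.0;
     char buf[64];
     double y;

     /* 17 significant digits are always enough to round-trip an
        IEEE 754 double */
     sprintf(buf, "%.17g", x);
     y = strtod(buf, NULL);

     printf("%s -> %s\n", buf, (x == y) ? "same double" : "different");
     return 0;
   }

The printers described in the papers above are smarter, emitting the
*shortest* string that reads back exactly, but 17 significant digits
always suffice for a double.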

In Common Lisp there is a function that "decodes" a float into its
integer parts (sign, mantissa and exponent) in such a way that you
can reconstruct the float from those components, like so:

   [1]> (integer-decode-float 1.0)
   4503599627370496 ;
   -52 ;
   1

   [2]> (multiple-value-bind (m e s) 
          (integer-decode-float 1.0) 
            (* s m (expt 2.0 e)))
   1.0

Note that the second command binds the multiple return values to
(m)antissa, (e)xponent and (s)ign and then recomposes them by
multiplication, sign * mantissa * 2^exponent, returning the same
value that was fed to integer-decode-float (1.0 in this case, but it
works for any floating point number).
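
Going the other way in C is easy with ldexp(3).  A small sketch of
the recomposition (link with -lm), using the decode of 1.0 above:

   #include <math.h>    /* ldexp */
   #include <stdio.h>

   int
   main(void)
   {
     unsigned long long m = 4503599627370496ULL;  /* mantissa */
     int e = -52;                                 /* exponent */
     int s = 1;                                   /* sign, +1 or -1 */
     double v;

     /* sign * mantissa * 2^exponent; exact, because m fits within
        the 53-bit significand of a double */
     v = s * ldexp((double) m, e);

     printf("%g\n", v);   /* prints 1 */
     return 0;
   }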

I have an application in which a C program wants to pass floating
point values (doubles, as it happens) accurately over a network socket
to a lisp program without a lot of messy float printing and parsing.
Integers are so much nicer.

So, I tried to write an equivalent to integer-decode-float in C.  I
did.  It worked.  For a while.  Maybe my C is rusty.  Here it is in
testable isolation:

   #include <stdio.h>
   #include <stdlib.h>

   struct decoded_float {
     unsigned sign;
     signed exponent;
     unsigned long long mantissa;
   };

   struct decoded_float*
   decode_float(double value) 
   {
     static struct decoded_float d;  /* static: we return its address */

     unsigned long long *p,v;

     /* reinterpret the double's bits as a 64-bit unsigned integer */
     p = (unsigned long long *) &value;
     v = *p;

     /* IEEE 754 double: bit 63 is the sign, bits 62-52 the biased
        exponent, bits 51-0 the mantissa */
     d.sign = v >> 63;
     d.exponent = v << 1 >> 53;
     d.mantissa = v << 12 >> 12;

     printf("s e m = %u %d %llu\n",d.sign,d.exponent,d.mantissa);

     /* normalized doubles carry an implicit leading 1 bit above the
        52 stored mantissa bits (biased exponent 0 means zero/subnormal) */
     if(d.exponent)
       d.mantissa |= (unsigned long long) 1 << 52;

     printf("s e m = %u %d %llu\n",d.sign,d.exponent,d.mantissa);

     /* remove the exponent bias (1023) and step past the 52 mantissa
        bits, so that +-mantissa * 2^exponent reconstructs the double */
     d.exponent -= 1023 + 52;

     printf("s e m = %u %d %llu\n",d.sign,d.exponent,d.mantissa);

     return &d;
   }

   int 
   main(int argc, char** argv) 
   {
     double v = atof(argv[1]);
     struct decoded_float* f = decode_float(v);  

     printf("s e m = %u %d %llu\n",f->sign,f->exponent,f->mantissa);

     return 0;
   }        

[Note: s here is 0 or 1 rather than -1 or +1 in the lisp equivalent
... that's okay, I can deal with that.]
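
Converting between the two conventions is a one-liner, e.g.:

   int s = f->sign ? -1 : +1;   /* map the 0/1 sign bit to +-1 */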

Again, this seemed to be working fine.  Then, after about 4 years, I
wanted to run this stuff again and noticed some errors, particularly
when I ramped up the optimization.

   $ for i in $(seq 0 6) ; do echo Using -O$i ; gcc -O$i decoded-float.c -o decoded-float ; ./decoded-float 1.0 ; done
   Using -O0
   s e m = 0 1023 0
   s e m = 0 1023 4503599627370496
   s e m = 0 -52 4503599627370496
   s e m = 0 -52 4503599627370496
   Using -O1
   s e m = 0 1023 0
   s e m = 0 1023 4503599627370496
   s e m = 0 -52 4503599627370496
   s e m = 0 -52 4503599627370496
   Using -O2
   s e m = 1 894 1618429576478720
   s e m = 1 894 6122029203849216
   s e m = 1 -181 6122029203849216
   s e m = 1 -181 6122029203849216
   Using -O3
   s e m = 0 128 1294403149217721
   s e m = 0 128 5798002776588217
   s e m = 0 -947 5798002776588217
   s e m = 0 -947 5798002776588217
   Using -O4
   s e m = 0 128 1294403149725625
   s e m = 0 128 5798002777096121
   s e m = 0 -947 5798002777096121
   s e m = 0 -947 5798002777096121
   Using -O5
   s e m = 0 128 1294403149512633
   s e m = 0 128 5798002776883129
   s e m = 0 -947 5798002776883129
   s e m = 0 -947 5798002776883129
   Using -O6
   s e m = 0 128 1294403149254585
   s e m = 0 128 5798002776625081
   s e m = 0 -947 5798002776625081
   s e m = 0 -947 5798002776625081

Note that with -O0 and -O1 I get the same decode as the Lisp
integer-decode-float.  At -O2 it goes wrong, and it goes wrong
differently again at -O3 and above.

   $ gcc --version
   gcc (GCC) 4.2.3 (Debian 4.2.3-1)

Any clues what I am doing wrong?
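
One guess I haven't ruled out: the cast from double * to unsigned
long long * may break C's strict aliasing rules, which gcc starts
exploiting at -O2 via -fstrict-aliasing.  If that is the problem,
copying the bits with memcpy (or compiling with -fno-strict-aliasing)
should give the same answer at every optimization level.  Here is a
variant of the extraction that I still need to test:

   #include <stdio.h>
   #include <string.h>

   int
   main(void)
   {
     double value = 1.0;
     unsigned long long v;

     /* memcpy copies the bytes without reading a double through an
        unsigned long long pointer */
     memcpy(&v, &value, sizeof v);

     printf("s e m = %llu %llu %llu\n",
            v >> 63, v << 1 >> 53, v << 12 >> 12);
     return 0;
   }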


-- 
Russell Senior         ``I have nine fingers; you have ten.''
seniorr at aracnet.com


