[PATCH] Short-term workaround for hidden x86 float precision in calculateSpillWeightAndHint()

Mon Apr 21 11:24:27 PDT 2014

----- Original Message -----
> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Commit Messages and Patches for LLVM" <llvm-commits at cs.uiuc.edu>, "Andrew Trick" <atrick at apple.com>, "Chandler
> Carruth" <chandlerc at google.com>
> Sent: Monday, April 21, 2014 1:12:54 PM
> Subject: Re: [PATCH] Short-term workaround for hidden x86 float precision in calculateSpillWeightAndHint()
> 
> 
> On 2014-Apr-21, at 10:19, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com>
> >> To: "Andrew Trick" <atrick at apple.com>, "Chandler Carruth"
> >> <chandlerc at google.com>
> >> Cc: "Commit Messages and Patches for LLVM"
> >> <llvm-commits at cs.uiuc.edu>
> >> Sent: Monday, April 21, 2014 11:39:48 AM
> >> Subject: [PATCH] Short-term workaround for hidden x86 float
> >> precision in calculateSpillWeightAndHint()
> >> 
> >> 
> >> 
> >> The attached patch is a workaround that prevents hidden precision
> >> from changing results in x86 floats in
> >> calculateSpillWeightAndHint().
> >> 
> >> Applying my BlockFrequencyInfo patch causes two test failures on
> >> i386 (see, e.g., r206704). My patch changes the output of the
> >> pass, and somehow exposes this problem in
> >> calculateSpillWeightAndHint(). In the affected code:
> >> 
> >> float hweight = Hint[hint] += weight;
> >> if (TargetRegisterInfo::isPhysicalRegister(hint)) {
> >>   if (hweight > bestPhys && mri.isAllocatable(hint))
> >>     bestPhys = hweight, hintPhys = hint;
> >> }
> >> 
> >> I converted `hweight` and `bestPhys` to APFloat and used
> >> APFloat::toString() to dump them. In the problematic code path,
> >> they are both exactly equal to 1. However, the comparison
> >> `hweight > bestPhys` gives `true` due to hidden precision in x86.
> >> Marking `hweight` as `volatile` forces it to be a stack variable
> >> (preventing hidden precision), so this comparison correctly gives
> >> `false` even on i386 (hack suggested by chapuni!).
> > 
> > To be honest, I don't understand why you're seeing this problem. I
> > know this was a problem pre-SSE, but I think that once SSE became
> > available, the cost of forcing IEEE strict math became reasonable.
> > IEEE arithmetic (although not the transcendental functions) should
> > be strictly deterministic across platforms, and I don't see any
> > reason we should support building LLVM in any other mode.
> > 
> > -Hal
> 
> Frankly, neither do I, but I find floats *really* scary and hide
> under the bed when I see them.
> 
> I've committed the workaround for now (r206765).  If you want me to
> investigate further (and have something you want me to test), let me
> know.  I was kindly given access to the i386-freebsd [1] buildslave
> to debug this and haven't given it up yet.

Perhaps this explains it: http://stackoverflow.com/questions/16888621/why-does-returning-a-floating-point-value-change-its-value

In short, from the gcc man page:
  -mfpmath=unit
    387 ... This is the default choice for i386 compiler.
    sse ... This is the default choice for the x86-64 compiler.

As suggested on that SO page, building with -msse2 -mfpmath=sse on i386 should "fix" this issue (except that, obviously, we would then require an SSE-enabled chip). However, that would break the ABI, and that's probably a no-go for anyone who needs to embed LLVM as a library in an existing codebase on i386.
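
To make the effect concrete, here is a minimal sketch of the volatile workaround in isolation. It is not the LLVM code: `beatsBest` and `selfCheck` are hypothetical names, and the inputs are chosen only to illustrate the rounding. The assumption is that, with x87 math, the sum may be held in an 80-bit register where 1 + 2^-30 is distinguishable from 1; once the value is stored as a 32-bit float, it rounds back to exactly 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the comparison in calculateSpillWeightAndHint();
// the name and signature are illustrative, not LLVM's.
bool beatsBest(float accumulated, float weight, float bestPhys) {
  // Storing into a volatile float forces the sum through a 32-bit memory
  // slot, discarding any extra bits an x87 register might be carrying.
  volatile float hweight = accumulated + weight;
  return hweight > bestPhys;
}

// Small self-check: 2^-30 is below half an ulp of 1.0f, so the float sum
// rounds back to exactly 1.0f and must not compare greater than 1.0f.
void selfCheck() {
  const float tiny = std::ldexp(1.0f, -30);
  assert(!beatsBest(1.0f, tiny, 1.0f));
  assert(beatsBest(1.0f, 0.5f, 1.0f));
}
```

Without the `volatile`, a 387-mode build is free to compare the unrounded register value, which is how two values that both print as 1 can still compare as "greater than".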

 -Hal

> 
> [1]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory