[LLVMbugs] [Bug 21385] optimize reciprocals with fast-math (x86)

bugzilla-daemon at llvm.org
Thu Nov 27 21:40:22 PST 2014


http://llvm.org/bugs/show_bug.cgi?id=21385

Steven Noonan <steven at uplinklabs.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |steven at uplinklabs.net
         Resolution|FIXED                       |---

--- Comment #23 from Steven Noonan <steven at uplinklabs.net> ---
I'd like to reopen this issue to ask why FeatureUseSqrtEst and
FeatureUseRecipEst were only enabled for Jaguar. They have demonstrable benefits
on many if not all of the other x86 microarchitectures. Is there a reason they
cannot be enabled more broadly?

I'm not sure what GCC's criteria are for enabling it, but I've seen the
reciprocal square root estimate used on every x86 -march= I know of whenever
-ffast-math is specified and SSE is available. For example, even as far back as
-march=pentium3, rsqrtss is used:

#include <math.h>

float rsqrtf(float f)
{
        return 1.0f / sqrtf(f);
}

$ gcc -m32 -O3 -ffast-math -mfpmath=sse -march=pentium3 -S -o - rsqrt.c
[...]
rsqrtf:

        subl    $4, %esp
        rsqrtss 8(%esp), %xmm1
        movss   8(%esp), %xmm0
        mulss   %xmm1, %xmm0
        mulss   %xmm1, %xmm0
        mulss   .LC1, %xmm1
        addss   .LC0, %xmm0
        mulss   %xmm1, %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        popl    %eax
        ret
.LC0:
        .long   3225419776
.LC1:
        .long   3204448256
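
For anyone reading the listing above: the two .long constants decode to -3.0f
(0xC0400000) and -0.5f (0xBF000000), so the mulss/addss sequence is one
Newton-Raphson refinement of the rsqrtss estimate, i.e.
0.5 * est * (3 - x * est * est). Below is a minimal portable sketch of that
arithmetic. The names bits_to_float and refine_rsqrt are mine, not GCC's, and
the est parameter is a stand-in for whatever rsqrtss would return (roughly 12
bits of precision):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern (as printed by .long) as an IEEE-754 float. */
static float bits_to_float(uint32_t bits)
{
        float f;
        memcpy(&f, &bits, sizeof f);
        return f;
}

/*
 * One Newton-Raphson step for 1/sqrt(x), in the same operation order as the
 * GCC output. `est` is a hypothetical stand-in for the rsqrtss result.
 */
static float refine_rsqrt(float x, float est)
{
        const float lc0 = bits_to_float(3225419776u); /* .LC0 = -3.0f */
        const float lc1 = bits_to_float(3204448256u); /* .LC1 = -0.5f */
        float t = x * est;      /* mulss %xmm1, %xmm0 */
        t = t * est;            /* mulss %xmm1, %xmm0 */
        est = est * lc1;        /* mulss .LC1, %xmm1  */
        t = t + lc0;            /* addss .LC0, %xmm0  */
        return t * est;         /* mulss %xmm1, %xmm0 */
}
```

Calling refine_rsqrt(4.0f, 0.48f) moves the rough guess 0.48 toward the true
value 0.5, and an exact input estimate is left unchanged. So the cost of the
estimate path is one rsqrtss plus four multiplies and an add, in place of a
sqrtss followed by a divss.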


GCC does not, however, emit reciprocal estimates, even with -march=haswell.
It's possible that GCC does not implement any selection of RCPSS:

float recipf(float f)
{
        return 1.0f / f;
}

$ gcc -O3 -ffast-math -mfpmath=sse -march=haswell -S -o - recip.c
[...]
        vmovss  .LC0(%rip), %xmm1
        vdivss  %xmm0, %xmm1, %xmm0
        ret
.LC0:
        .long   1065353216
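
Here .LC0 is just 1.0f (0x3F800000), so this is a plain divide. For comparison,
a reciprocal-estimate sequence would typically take an rcpss result and apply
one Newton-Raphson step, x1 = x0 * (2 - a * x0). The sketch below shows only
that arithmetic and is not output from any compiler. The names bits_to_float
and refine_recip are mine, and est is a stand-in for the rcpss result:

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern (as printed by .long) as an IEEE-754 float. */
static float bits_to_float(uint32_t bits)
{
        float f;
        memcpy(&f, &bits, sizeof f);
        return f;
}

/*
 * One Newton-Raphson step for 1/a: x1 = x0 * (2 - a * x0).
 * `est` is a hypothetical stand-in for the rcpss result.
 */
static float refine_recip(float a, float est)
{
        return est * (2.0f - a * est);
}
```

For example, refine_recip(4.0f, 0.24f) moves the rough guess 0.24 toward 0.25,
and an exact estimate is left unchanged.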

At the very least I'd like to see reciprocal square root estimates added by
default on x86 for -ffast-math. How can we get such a change implemented? Is it
a matter of building confidence in the safety and benefit of such a change?
