[LLVMbugs] [Bug 21231] New: use fsel to avoid branch and compare

bugzilla-daemon at llvm.org
Thu Oct 9 15:10:14 PDT 2014


http://llvm.org/bugs/show_bug.cgi?id=21231

            Bug ID: 21231
           Summary: use fsel to avoid branch and compare
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: PowerPC
          Assignee: unassignedbugs at nondot.org
          Reporter: spatel+llvm at rotateright.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

This testcase is derived from llvm/test/CodeGen/PowerPC/recipest.ll. 

Using -ffast-math, we can convert a sqrt intrinsic into a reciprocal square
root estimate multiplied by its argument ( X * X ** -0.5 = X ** 0.5 )...with one
problem: we can't let a '0.0f' input turn into a 'NaN' output, which is what
0.0 * rsqrt(0.0) = 0.0 * Inf would produce.

The current PPC scalar codegen compares and branches around that:

$ cat sqrtf.ll
declare float @llvm.sqrt.f32(float)

define float @goo3(float %a) nounwind {
  %r = call float @llvm.sqrt.f32(float %a)
  ret float %r
}

$ ./llc -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr7 -enable-unsafe-fp-math sqrtf.ll -o -
...
.L.goo3:
# BB#0:
    addis 3, 2, .LCPI0_1@toc@ha
    lfs 0, .LCPI0_1@toc@l(3)
    fcmpu 0, 1, 0
    beq 0, .LBB0_2
# BB#1:
    frsqrtes 0, 1
    addis 3, 2, .LCPI0_0@toc@ha
    lfs 2, .LCPI0_0@toc@l(3)
    fnmsubs 3, 1, 2, 1
    fmuls 4, 0, 0
    fmadds 2, 3, 4, 2
    fmuls 0, 0, 2
    fmuls 0, 1, 0
.LBB0_2:
    fmr 1, 0
    blr

------------------------------------------------------

Using 'fsel' to select between 0.0 and the computed result would probably
perform better here than the compare and branch. For the vector PPC case, we
already generate vcmpeqfp/vandc.
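
As a rough illustration of how the zero check could be expressed with fsel,
here is a C model of the instruction's semantics (the result is the first
source when the tested operand is >= 0.0, otherwise the second, including
for NaN). The helper names are hypothetical, and feeding -|x| to the select
is just one possible formulation, not necessarily what the backend should
emit.

```c
/* Model of PowerPC fsel: yields c when a >= 0.0, otherwise b.
   A NaN in a fails the comparison and therefore yields b. */
static float fsel_model(float a, float c, float b) {
    return (a >= 0.0f) ? c : b;
}

/* Zero guard without a compare-and-branch on the result path:
   -|x| >= 0.0 holds only when x is +0.0 or -0.0, so the select
   returns 0.0 for a zero input and the refined value otherwise. */
static float select_zero_guard(float x, float refined) {
    float neg_abs = (x < 0.0f) ? x : -x;   /* stands in for fnabs */
    return fsel_model(neg_abs, 0.0f, refined);
}
```

The ternary operators here only model the instructions; on hardware both
would be single branch-free operations.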

X86 scalar code (when enabled) will use a similar pattern to do the select:
    vrsqrtss    %xmm0, %xmm0, %xmm1
    vmulss    LCPI0_0(%rip), %xmm1, %xmm2
    vmulss    %xmm1, %xmm1, %xmm1
    vmulss    %xmm0, %xmm1, %xmm1
    vaddss    LCPI0_1(%rip), %xmm1, %xmm1
    vmulss    %xmm2, %xmm1, %xmm1
    vxorps    %xmm2, %xmm2, %xmm2
    vmulss    %xmm1, %xmm0, %xmm1
    vcmpeqss    %xmm2, %xmm0, %xmm0
    vandnps    %xmm1, %xmm0, %xmm0
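
The last two instructions implement the select as a bit mask. A small C
sketch of that idea, with hypothetical helper names: the compare produces an
all-ones mask when the input equals zero, and the and-not clears every bit
of the computed value in that case, leaving +0.0.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back. */
static uint32_t bits_of(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float float_of(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Compare-equal gives all-ones when x == 0.0, else all-zeros;
   and-not then computes ~mask & value, so a zero input forces
   the result to +0.0 even if value is NaN. */
static float mask_select_zero(float x, float value) {
    uint32_t mask = (x == 0.0f) ? 0xFFFFFFFFu : 0u;
    return float_of(~mask & bits_of(value));
}
```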
