[LLVMbugs] [Bug 21231] New: use fsel to avoid branch and compare
bugzilla-daemon at llvm.org
Thu Oct 9 15:10:14 PDT 2014
http://llvm.org/bugs/show_bug.cgi?id=21231
Bug ID: 21231
Summary: use fsel to avoid branch and compare
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: PowerPC
Assignee: unassignedbugs at nondot.org
Reporter: spatel+llvm at rotateright.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
This testcase is derived from llvm/test/CodeGen/PowerPC/recipest.ll.
With -ffast-math, we can convert a sqrt intrinsic into a reciprocal square
root estimate multiplied by its argument ( X * X ** -0.5 = X ** 0.5 ), with one
problem: we can't let a '0.0f' input turn into a 'NaN' output (the reciprocal
square root of 0.0 is infinity, and 0.0 * infinity is NaN).
The current PPC scalar codegen compares and branches around that:
$ cat sqrtf.ll
declare float @llvm.sqrt.f32(float)
define float @goo3(float %a) nounwind {
%r = call float @llvm.sqrt.f32(float %a)
ret float %r
}
$ ./llc -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr7 -enable-unsafe-fp-math sqrtf.ll -o -
...
.L.goo3:
# BB#0:
addis 3, 2, .LCPI0_1@toc@ha
lfs 0, .LCPI0_1@toc@l(3)
fcmpu 0, 1, 0
beq 0, .LBB0_2
# BB#1:
frsqrtes 0, 1
addis 3, 2, .LCPI0_0@toc@ha
lfs 2, .LCPI0_0@toc@l(3)
fnmsubs 3, 1, 2, 1
fmuls 4, 0, 0
fmadds 2, 3, 4, 2
fmuls 0, 0, 2
fmuls 0, 1, 0
.LBB0_2:
fmr 1, 0
blr
------------------------------------------------------
An 'fsel' would probably be a better choice here for performance. For the
vector PPC case, we do generate vcmpeqfp/vandc.
X86 scalar code (when enabled) will use a similar pattern to do the select:
vrsqrtss %xmm0, %xmm0, %xmm1
vmulss LCPI0_0(%rip), %xmm1, %xmm2
vmulss %xmm1, %xmm1, %xmm1
vmulss %xmm0, %xmm1, %xmm1
vaddss LCPI0_1(%rip), %xmm1, %xmm1
vmulss %xmm2, %xmm1, %xmm1
vxorps %xmm2, %xmm2, %xmm2
vmulss %xmm1, %xmm0, %xmm1
vcmpeqss %xmm2, %xmm0, %xmm0
vandnps %xmm1, %xmm0, %xmm0
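The last two instructions above implement the select as a bit mask: the
compare produces all-ones when the input equals zero, and the and-not clears
the result in that case. A minimal C sketch of the same idea follows, with
hypothetical helper names (float_bits, bits_float, zero_guard_mask):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Mirrors vcmpeqss + vandnps: build an all-ones mask when x == 0.0, then
   compute (~mask & product) so a zero input forces a +0.0 result. */
static float zero_guard_mask(float x, float product) {
    uint32_t mask = (x == 0.0f) ? 0xFFFFFFFFu : 0u;
    return bits_float(~mask & float_bits(product));
}
```

Here x is the original sqrt argument and product is the value computed by the
multiply sequence, which may be NaN when x is zero.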