[llvm-bugs] [Bug 43611] New: [X86] repeated memory references for infinity checks
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Oct 8 14:20:14 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=43611
Bug ID: 43611
Summary: [X86] repeated memory references for infinity checks
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: andres at anarazel.de
CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, spatel+llvm at rotateright.com
Hi,
On x86-64, LLVM emits repeated memory references when checking the results of
floating-point math for infinity. Clang generates IR like:
%add = fadd double %a, %b
%0 = tail call double @llvm.fabs.f64(double %add) #3
%cmpinf = fcmp oeq double %0, 0x7FF0000000000000
br i1 %cmpinf, label %if.then, label %if.end, !prof !2
...
if.end: ; preds = %entry
%add2 = fadd double %add, %c
%1 = tail call double @llvm.fabs.f64(double %add2) #3
%cmpinf3 = fcmp oeq double %1, 0x7FF0000000000000
br i1 %cmpinf3, label %if.then9, label %if.end10, !prof !2
...
for source like:
...
sum += b;
if (unlikely(__builtin_isinf(sum)))
    overflowerror();
sum += c;
if (unlikely(__builtin_isinf(sum)))
    overflowerror();
...
which ends up as instructions that reference the NaN/+Inf constants multiple
times:
vaddsd %xmm1, %xmm0, %xmm0
vandpd .LCPI0_0(%rip), %xmm0, %xmm1
vucomisd .LCPI0_1(%rip), %xmm1
jae .LBB0_5
vaddsd %xmm2, %xmm0, %xmm0
vandpd .LCPI0_0(%rip), %xmm0, %xmm1
vucomisd .LCPI0_1(%rip), %xmm1
jae .LBB0_5
...
rather than moving them into a register once.
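As a rough C rendering of what each vandpd/vucomisd pair appears to compute, assuming .LCPI0_0 holds the fabs sign mask (whose bit pattern reads as a NaN when interpreted as a double) and .LCPI0_1 holds +Inf (0x7FF0000000000000, as in the IR above). The name isinf_bits is illustrative only and does not come from the report; the generated code does a floating-point compare, while this sketch compares the bits directly:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sketch: clear the sign bit, then compare against +Inf. */
static int isinf_bits(double x)
{
    /* Assumed content of .LCPI0_0: every bit set except the sign bit. */
    const uint64_t abs_mask = UINT64_C(0x7FFFFFFFFFFFFFFF);
    /* +Inf, the constant compared against in the IR. */
    const uint64_t pos_inf = UINT64_C(0x7FF0000000000000);
    uint64_t bits;

    memcpy(&bits, &x, sizeof bits);      /* well-defined type pun */
    return (bits & abs_mask) == pos_inf; /* true for +Inf and -Inf only */
}
```

Both constants are loop-invariant, so in principle each could be loaded into a register once and reused by every check.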
Here's a godbolt link showing the issue:
https://godbolt.org/z/MnGGMU
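For readers who cannot open the link, a self-contained approximation of the fragment above might look like the following. This is a sketch, not the code behind the godbolt link: the function name add3, the unlikely macro definition, and the counting overflowerror stub are all assumptions made so the example compiles on its own (in the original, overflowerror is presumably an error path defined elsewhere).

```c
/* Assumed definition; the report uses unlikely() without showing it. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the reporter's overflowerror(). */
static int overflow_count = 0;
static void overflowerror(void)
{
    overflow_count++;
}

/* Hypothetical wrapper around the fragment from the report. */
double add3(double a, double b, double c)
{
    double sum = a;

    sum += b;
    if (unlikely(__builtin_isinf(sum)))
        overflowerror();
    sum += c;
    if (unlikely(__builtin_isinf(sum)))
        overflowerror();
    return sum;
}
```

Compiling something like this with clang at -O2 and inspecting the assembly is one way to check whether the mask and +Inf operands are still loaded from memory at each check; the exact output depends on the compiler version and flags.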
Interestingly, LLVM generates slightly better code for single-precision floats:
it moves the NaN constant into a register, but continues to reference +Inf
from memory.
Thanks to LebedevRI on #llvm, here are links to llvm-mca analyses for
double and single precision, comparing against GCC-generated code (although
part of the cost difference is due to GCC using subq/addq $8, %rsp where clang
uses pushq/popq, and I'm not sure that is costed accurately). Interestingly,
GCC generates code similar to clang's for single precision.
https://godbolt.org/z/_2q9cc
https://godbolt.org/z/nytVTG
What I originally wished for was a pass that recognizes that the isinf checks
are redundant and removes all but the last, but that's a separate issue. In my
case the repeated additions only become apparent after IPO.
- Andres