[llvm-bugs] [Bug 25146] New: Suboptimal across lane float min/max reduction
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Oct 12 06:03:39 PDT 2015
https://llvm.org/bugs/show_bug.cgi?id=25146
Bug ID: 25146
Summary: Suboptimal across lane float min/max reduction
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Backend: AArch64
Assignee: unassignedbugs at nondot.org
Reporter: charlesturner7c5 at gmail.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
Created attachment 15056
--> https://llvm.org/bugs/attachment.cgi?id=15056&action=edit
LLVM function showing missed opportunity.
For the attached test case, I'm seeing codegen with some room for improvement.
oversized_f_fmaxvec: // @oversized_f_fmaxvec
.cfi_startproc
// BB#0:
ldp q0, q1, [x0, #32]
ldp q3, q2, [x0]
fmaxnm v4.4s, v0.4s, v0.4s
fmaxnm v5.4s, v1.4s, v0.4s
fmaxnm v1.4s, v2.4s, v1.4s
fmaxnm v0.4s, v3.4s, v0.4s
fmaxnm v0.4s, v0.4s, v1.4s
fmaxnm v1.4s, v1.4s, v0.4s
fmaxnm v2.4s, v5.4s, v0.4s
fmaxnm v3.4s, v4.4s, v0.4s
ext v4.16b, v0.16b, v0.16b, #8
fmaxnm v3.4s, v3.4s, v0.4s
fmaxnm v2.4s, v2.4s, v0.4s
fmaxnm v1.4s, v1.4s, v0.4s
fmaxnm v0.4s, v0.4s, v4.4s
fcmge v1.4s, v1.4s, v0.4s
fcmge v2.4s, v2.4s, v0.4s
fcmge v3.4s, v3.4s, v0.4s
dup v4.4s, v0.s[1]
xtn v1.4h, v1.4s
xtn v2.4h, v2.4s
xtn v3.4h, v3.4s
fcmge v4.4s, v0.4s, v4.4s
uzp1 v2.8b, v3.8b, v2.8b
xtn v3.4h, v4.4s
shl v2.8b, v2.8b, #7
uzp1 v1.8b, v3.8b, v1.8b
sshr v2.8b, v2.8b, #7
shl v1.8b, v1.8b, #7
sshr v1.8b, v1.8b, #7
ins v1.d[1], v2.d[0]
shl v1.16b, v1.16b, #7
sshr v1.16b, v1.16b, #7
umov w8, v1.b[0]
mov s1, v0.s[1]
tst w8, #0x1
fcsel s0, s0, s1, ne
ret
I would expect the across-lane FMAXNMV instruction to be used for the final
reduction here instead. I think r249834 was moving in this direction.