[llvm-bugs] [Bug 40905] New: [x86] load/splat scalar constants for use with scalar FP logic ops?
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Feb 28 11:36:50 PST 2019
https://bugs.llvm.org/show_bug.cgi?id=40905
Bug ID: 40905
Summary: [x86] load/splat scalar constants for use with scalar FP logic ops?
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: spatel+llvm at rotateright.com
CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, spatel+llvm at rotateright.com
Seen in:
https://reviews.llvm.org/D58282
declare <4 x double> @llvm.copysign.v4f64(<4 x double>, <4 x double>)
define double @copysign_v4f64(<4 x double> %x, <4 x double> %y) nounwind {
%v = call <4 x double> @llvm.copysign.v4f64(<4 x double> %x, <4 x double> %y)
%r = extractelement <4 x double> %v, i32 0
ret double %r
}
We convert this to a scalar op, but the x86 FP logic ops only allow load
folding with vector constants, so each mask is still emitted as a full
16-byte vector constant:
$ ./llc -o - -mattr=avx copysign.ll
LCPI0_0:
.quad -9223372036854775808 ## double -0
.quad -9223372036854775808 ## double -0
LCPI0_1:
.quad 9223372036854775807 ## double NaN
.quad 9223372036854775807 ## double NaN
.section __TEXT,__text,regular,pure_instructions
.globl _copysign_v4f64
.p2align 4, 0x90
_copysign_v4f64:
vandps LCPI0_0(%rip), %xmm1, %xmm1
vandps LCPI0_1(%rip), %xmm0, %xmm0
vorps %xmm1, %xmm0, %xmm0
vzeroupper
retq
----------------------------------------------------------------------
Should we broadcast scalar constants instead? The answer may depend on whether
we are optimizing for size and/or on the subtarget.