[llvm-bugs] [Bug 42022] New: Failure to merge 2 * <2 x float> load + fadd and then split the result back to 2 * <2 x float>
via llvm-bugs
llvm-bugs at lists.llvm.org
Sun May 26 05:12:34 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=42022
Bug ID: 42022
Summary: Failure to merge 2 * <2 x float> load + fadd and then
split the result back to 2 * <2 x float>
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: llvm-dev at redking.me.uk
CC: a.bataev at hotmail.com, anton.a.afanasyev at gmail.com,
craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, spatel+llvm at rotateright.com
Another missed vectorization opportunity, this time for a struct of four floats:
struct Vector4 {
float x, y, z, w;
};
https://godbolt.org/z/lbAN92
// Bad: Failed to combine loads + split result
Vector4 Add(const Vector4 &a, const Vector4 &b) {
Vector4 r;
r.x = a.x + b.x;
r.y = a.y + b.y;
r.z = a.z + b.z;
r.w = a.w + b.w;
return r;
}
_Z3AddRK7Vector4S1_: # @_Z3AddRK7Vector4S1_
vmovsd (%rdi), %xmm0 # xmm0 = mem[0],zero
vmovsd (%rsi), %xmm2 # xmm2 = mem[0],zero
vmovsd 8(%rdi), %xmm1 # xmm1 = mem[0],zero
vaddps %xmm2, %xmm0, %xmm0
vmovsd 8(%rsi), %xmm2 # xmm2 = mem[0],zero
vaddps %xmm2, %xmm1, %xmm1
retq
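Not sure whether this is best handled in the SLP vectorizer or in the DAG
combiner, but we should be able to merge each pair of <2 x float> loads into a
single 128-bit load, do one vaddps, and split the result afterwards, something
like: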
_Z3AddRK7Vector4S1_: # @_Z3AddRK7Vector4S1_
vmovups (%rdi), %xmm0
vaddps (%rsi), %xmm0, %xmm0
vunpckhpd %xmm0, %xmm0, %xmm1 # xmm1 = xmm0[2,3,2,3]
retq
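
For comparison, here is a minimal hand-written sketch of the same computation using SSE intrinsics. It is only an illustration of the load/add shape the ideal code above has, not a proposed fix: AddPacked is a hypothetical name, the code assumes an x86/x86-64 target, and it assumes Vector4 is laid out as four contiguous floats (checked by the static_assert). Whether a given compiler version then emits exactly the three-instruction sequence above is not guaranteed.

```cpp
#include <xmmintrin.h>  // SSE intrinsics; assumes an x86/x86-64 target

struct Vector4 {
  float x, y, z, w;
};

// The packed load below relies on the four floats being contiguous.
static_assert(sizeof(Vector4) == 4 * sizeof(float),
              "Vector4 must be four packed floats");

// Hypothetical hand-vectorized equivalent of Add(): one unaligned 128-bit
// load per operand and a single packed add, instead of two 64-bit loads
// and two adds per operand.
Vector4 AddPacked(const Vector4 &a, const Vector4 &b) {
  __m128 va = _mm_loadu_ps(reinterpret_cast<const float *>(&a));
  __m128 vb = _mm_loadu_ps(reinterpret_cast<const float *>(&b));
  Vector4 r;
  _mm_storeu_ps(reinterpret_cast<float *>(&r), _mm_add_ps(va, vb));
  return r;
}
```

On the x86-64 System V calling convention a 16-byte struct of four floats is returned in two SSE registers (x, y in xmm0 and z, w in xmm1), which is why the ideal sequence still needs one shuffle to move the upper half of the sum into xmm1.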