[llvm-bugs] [Bug 48429] New: Generated scatter instructions are slower than scalar version
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Dec 7 10:23:09 PST 2020
https://bugs.llvm.org/show_bug.cgi?id=48429
Bug ID: 48429
Summary: Generated scatter instructions are slower than scalar version
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: carrot at google.com
CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, pengfei.wang at intel.com,
spatel+llvm at rotateright.com
Compile the following code with this command line:

    clang '--target=x86_64-grtev4-linux-gnu' -maes -m64 -mcx16 -msse4.2 -mpclmul \
      '-mprefer-vector-width=128' -fexperimental-new-pass-manager \
      -fsized-deallocation -O3 '-std=gnu++17' -c scatter.cc -save-temps

    __attribute((target("avx,avx2,fma,avx512f,avx512dq,avx512bw"))) void
    foo(int d, const float* ptr, float* dest)
    {
      const float* ptr_end = ptr + d;
      for (; ptr != ptr_end; ++ptr, dest += 16) {
        dest[0] = ptr[-1 * d];
        dest[1] = ptr[0 * d];
        dest[2] = ptr[1 * d];
        dest[3] = ptr[2 * d];
      }
    }

LLVM generates 4-element scatters, which are more than 50% slower than the
scalar version on my Skylake desktop.

The problem is in the function X86TTIImpl::getGatherScatterOpCost(). It has
already determined that a scatter is not profitable when avx512vl is not
enabled, so the operation should be scalarized, and it returns a scalarized
cost. But its caller, LoopVectorize, does not know that this is a scalarized
cost. It treats the value as a scatter cost and compares it with a different
scalar cost computed by getMemInstScalarizationCost. Unfortunately, the scalar
cost computed by the X86 backend is smaller than the scalar cost computed by
LoopVectorize, so LoopVectorize concludes that a scatter is cheaper than
scalarizing and generates the slow scatter version.
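
The mismatch described above can be illustrated with a small toy model. This
is only a sketch and not LLVM code: the names toyGatherScatterCost and
toyChoose, the ToyCosts struct, and all cost numbers are hypothetical, and the
strict "less than" comparison is an assumption about how the vectorizer picks
between the two options.

```cpp
// Toy model of the cost mix-up. Nothing here is LLVM API; every name and
// number is hypothetical and chosen only to illustrate the report.
enum class WideningDecision { Scalarize, GatherScatter };

struct ToyCosts {
  int backendScalarizedCost;    // backend's own estimate of scalarized stores
  int vectorizerScalarizedCost; // vectorizer's independent estimate
};

// Stand-in for the target hook: when a real scatter is judged unprofitable,
// it returns its scalarized estimate, but the caller cannot tell the
// difference from a genuine scatter cost.
int toyGatherScatterCost(const ToyCosts &c, bool scatterProfitable,
                         int nativeScatterCost) {
  return scatterProfitable ? nativeScatterCost : c.backendScalarizedCost;
}

// Stand-in for the vectorizer: it compares the value it believes to be a
// scatter cost against its own scalarization cost (assumed strict '<').
WideningDecision toyChoose(int gatherScatterCost, int scalarizationCost) {
  return gatherScatterCost < scalarizationCost
             ? WideningDecision::GatherScatter
             : WideningDecision::Scalarize;
}
```

With hypothetical costs of 10 (backend) and 14 (vectorizer), the value
returned by toyGatherScatterCost is really a scalarized cost, yet toyChoose
still selects GatherScatter because 10 < 14. If both sides used the same
scalarized estimate, the model would fall back to Scalarize.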