[llvm-bugs] [Bug 37882] New: Compiler generating inefficient code for some horizontal add/sub patterns after r334958
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Jun 20 14:51:01 PDT 2018
https://bugs.llvm.org/show_bug.cgi?id=37882
Bug ID: 37882
Summary: Compiler generating inefficient code for some horizontal add/sub patterns after r334958
Product: new-bugs
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: douglas_yung at playstation.sony.com
CC: llvm-bugs at lists.llvm.org
After the upstream change r334958, the compiler no longer optimizes a certain
horizontal add/sub pattern efficiently. Consider the following code:
/* test.cpp */
#include <x86intrin.h>
__attribute__((noinline))
__m256d add_pd_001(__m256d a, __m256d b) {
  return (__m256d){ a[0] + a[1], b[0] + b[1], a[2] + a[3], b[2] + b[3] };
}
If you build the above code using "-S -O2 -march=bdver2 test.cpp" with a
compiler prior to r334958, the function add_pd_001() compiles to a single
horizontal add instruction:
        vhaddpd %ymm1, %ymm0, %ymm0
The same function when built with r334958 or later now produces the following
assembly:
        vhaddpd %xmm1, %xmm0, %xmm2
        vextractf128 $1, %ymm1, %xmm1
        vextractf128 $1, %ymm0, %xmm0
        vhaddpd %xmm1, %xmm0, %xmm0
        vinsertf128 $1, %xmm0, %ymm2, %ymm0
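
To illustrate why the two sequences are interchangeable, here is a rough scalar model in portable C++. It is only a sketch: the names V4, V2, hadd256, hadd128 and hadd_split are hypothetical helpers made up for this illustration, and std::array merely stands in for the ymm/xmm registers. No real intrinsics are used.

```cpp
#include <array>

using V4 = std::array<double, 4>;  // stand-in for a 256-bit ymm register
using V2 = std::array<double, 2>;  // stand-in for a 128-bit xmm register

// Model of the single 256-bit vhaddpd: adjacent pairs are summed
// independently within each 128-bit lane.
V4 hadd256(const V4& a, const V4& b) {
    return {a[0] + a[1], b[0] + b[1], a[2] + a[3], b[2] + b[3]};
}

// Model of the 128-bit vhaddpd.
V2 hadd128(const V2& a, const V2& b) {
    return {a[0] + a[1], b[0] + b[1]};
}

// Model of the five-instruction sequence emitted after r334958.
V4 hadd_split(const V4& a, const V4& b) {
    V2 a_lo = {a[0], a[1]};
    V2 b_lo = {b[0], b[1]};
    V2 lo = hadd128(a_lo, b_lo);          // vhaddpd %xmm1, %xmm0, %xmm2
    V2 b_hi = {b[2], b[3]};               // vextractf128 $1, %ymm1, %xmm1
    V2 a_hi = {a[2], a[3]};               // vextractf128 $1, %ymm0, %xmm0
    V2 hi = hadd128(a_hi, b_hi);          // vhaddpd %xmm1, %xmm0, %xmm0
    return {lo[0], lo[1], hi[0], hi[1]};  // vinsertf128 $1, %xmm0, %ymm2, %ymm0
}
```

Under this model, hadd256(a, b) and hadd_split(a, b) return the same four elements for any inputs, so the longer sequence does the same work with four extra instructions.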
Horizontal subtraction is similarly affected. If you replace the '+' with '-'
in the original example, you will see a similar change in the codegen after
r334958.
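
For reference, the subtraction variant described above might look like the sketch below. The function name sub_pd_001 and the v4d typedef are assumptions made for this illustration. v4d uses the GCC/Clang vector extension as a stand-in for __m256d, so the snippet does not depend on <x86intrin.h>. Presumably the desired codegen is a single vhsubpd, by analogy with the add case.

```cpp
// Stand-in for __m256d (assumption): four doubles in a 32-byte vector,
// using the GCC/Clang vector_size extension.
typedef double v4d __attribute__((vector_size(32)));

// Same shape as add_pd_001 from the report, with '+' replaced by '-'.
__attribute__((noinline))
v4d sub_pd_001(v4d a, v4d b) {
    v4d r = { a[0] - a[1], b[0] - b[1], a[2] - a[3], b[2] - b[3] };
    return r;
}
```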