[llvm-bugs] [Bug 37882] New: Compiler generating inefficient code for some horizontal add/sub patterns after r334958

via llvm-bugs llvm-bugs at lists.llvm.org
Wed Jun 20 14:51:01 PDT 2018


https://bugs.llvm.org/show_bug.cgi?id=37882

            Bug ID: 37882
           Summary: Compiler generating inefficient code for some
                    horizontal add/sub patterns after r334958
           Product: new-bugs
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: douglas_yung at playstation.sony.com
                CC: llvm-bugs at lists.llvm.org

After the upstream change r334958, the compiler no longer folds a certain
horizontal add/sub pattern into a single instruction. Consider the following code:

/* test.cpp */
#include <x86intrin.h>

__attribute__((noinline))
__m256d add_pd_001(__m256d a, __m256d b) {
  return (__m256d){ a[0] + a[1], b[0] + b[1], a[2] + a[3], b[2] + b[3] };
}

If you build the above code using "-S -O2 -march=bdver2 test.cpp" with a
compiler prior to r334958, the compiler generates just one horizontal add
instruction for the function add_pd_001():

vhaddpd %ymm1, %ymm0, %ymm0

When built with r334958 or later, the same function instead produces the
following five-instruction sequence:

vhaddpd %xmm1, %xmm0, %xmm2
vextractf128    $1, %ymm1, %xmm1
vextractf128    $1, %ymm0, %xmm0
vhaddpd %xmm1, %xmm0, %xmm0
vinsertf128     $1, %xmm0, %ymm2, %ymm0
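
For reference, both sequences compute the same result; the regression is in
instruction count, not correctness. The following is a minimal portable sketch
of that equivalence, modelling the registers as plain arrays rather than using
intrinsics. The helper names (hadd128, hadd256, hadd256_split, and so on) are
hypothetical and not part of LLVM or <x86intrin.h>.

```cpp
#include <array>

// Scalar stand-ins for the registers (illustrative types, not real intrinsics).
using V2 = std::array<double, 2>;  // models an xmm register holding 2 doubles
using V4 = std::array<double, 4>;  // models a ymm register holding 4 doubles

// Model of the 128-bit vhaddpd: one pairwise sum from each source.
V2 hadd128(const V2& a, const V2& b) {
  return {a[0] + a[1], b[0] + b[1]};
}

// Model of the 256-bit vhaddpd: the 128-bit operation applied to each
// 128-bit half independently. This matches add_pd_001() in the report.
V4 hadd256(const V4& a, const V4& b) {
  return {a[0] + a[1], b[0] + b[1], a[2] + a[3], b[2] + b[3]};
}

// Models of the half extraction and insertion steps.
V2 low_half(const V4& v) { return {v[0], v[1]}; }
V2 high_half(const V4& v) { return {v[2], v[3]}; }  // like vextractf128 $1
V4 concat(const V2& lo, const V2& hi) {             // like vinsertf128 $1
  return {lo[0], lo[1], hi[0], hi[1]};
}

// Model of the sequence emitted after r334958: two 128-bit horizontal adds
// on the extracted halves, then a recombination.
V4 hadd256_split(const V4& a, const V4& b) {
  V2 lo = hadd128(low_half(a), low_half(b));
  V2 hi = hadd128(high_half(a), high_half(b));
  return concat(lo, hi);
}
```

For any inputs a and b, hadd256(a, b) and hadd256_split(a, b) return the same
four values, which is why a single 256-bit vhaddpd is sufficient here.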

Horizontal subtraction is also affected similarly. If you replace the '+' with
'-' in the original example, you will see a similar change in the codegen after
r334958.
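
Written out, the subtraction variant might look like the sketch below. The
function name sub_pd_001 and the v4d typedef are assumptions made for
illustration: v4d uses the GCC/Clang vector extension as a stand-in for
__m256d so the snippet does not depend on <x86intrin.h>. With the same build
options, the expected single instruction would presumably be vhsubpd.

```cpp
// Generic 4 x double vector type (GCC/Clang vector extension). This is an
// assumed stand-in for __m256d from the original test case.
typedef double v4d __attribute__((vector_size(32)));

// Same lane pattern as add_pd_001(), with '+' replaced by '-'.
__attribute__((noinline))
v4d sub_pd_001(v4d a, v4d b) {
  v4d r = { a[0] - a[1], b[0] - b[1], a[2] - a[3], b[2] - b[3] };
  return r;
}
```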
