[llvm-bugs] [Bug 35284] New: Vectorization improvement opportunity for max/min

Mon Nov 13 00:11:25 PST 2017

https://bugs.llvm.org/show_bug.cgi?id=35284

            Bug ID: 35284
           Summary: Vectorization improvement opportunity for max/min
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: serguei.katkov at azul.com
                CC: llvm-bugs at lists.llvm.org

This is about a potentially better vectorized code sequence for float/double max/min reductions.

Consider the following C++ program (https://godbolt.org/g/yWxTCB):
float test(float a[], int N) {
  float max = -1;
  for (int i = 0; i < N; i++)
    max = max < a[i] ? a[i] : max;
  return max;
}

clang trunk currently generates a main loop with the following scalar pattern:
movss + maxss, repeated 4 or 8 times (the unroll factor, not the parameter N above).

icc 18 is able to use the packed maxps instruction.
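
To make the packed form concrete, here is a minimal hand-written sketch of the same reduction using SSE intrinsics. It is not the code icc emits; the function name test_packed, the 4-wide layout, and the store-based horizontal reduction are assumptions made for illustration. Because the reduction order changes, the result may differ from the scalar loop in the sign of a zero, which is presumably part of why a compiler needs relaxed floating-point assumptions to do this automatically.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 only)

// Hypothetical hand-vectorized variant of test(); illustrative only.
float test_packed(const float a[], int N) {
  __m128 vmax = _mm_set1_ps(-1.0f);   // four running maxima, one per lane
  int i = 0;
  for (; i + 4 <= N; i += 4) {
    __m128 v = _mm_loadu_ps(a + i);   // unaligned load of a[i..i+3]
    vmax = _mm_max_ps(v, vmax);       // maxps: per lane, v > vmax ? v : vmax
  }

  // Horizontal reduction: spill the four lanes and combine them.
  float lanes[4];
  _mm_storeu_ps(lanes, vmax);
  float max = -1.0f;
  for (int k = 0; k < 4; k++)
    max = max < lanes[k] ? lanes[k] : max;

  // Scalar tail for the remaining N % 4 elements.
  for (; i < N; i++)
    max = max < a[i] ? a[i] : max;
  return max;
}
```

The operand order in _mm_max_ps(v, vmax) is chosen so that a NaN element leaves the running maximum unchanged, matching the behaviour of `max < a[i] ? a[i] : max` within each lane.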

Even code like the following:
movss 16 times (load values into the xmm0-xmm15 registers)
ucomiss + ja (16 times)

with a movaps on the slow path when the ja is taken

seems like it might be better.

I do not know which code sequence would be best, but there is potentially an
opportunity for improvement here.
