[llvm-bugs] [Bug 42411] New: Suboptimal code after vectorization of not unrolled loop with 8 iterations

Wed Jun 26 11:39:58 PDT 2019

https://bugs.llvm.org/show_bug.cgi?id=42411

            Bug ID: 42411
           Summary: Suboptimal code after vectorization of not unrolled
                    loop with 8 iterations
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Loop Optimizer
          Assignee: unassignedbugs at nondot.org
          Reporter: david.bolvansky at gmail.com
                CC: llvm-bugs at lists.llvm.org

Found in PR42410.

void foo (int *__restrict arr1, int *__restrict arr2)
{
  for (int i = 0; i < 8; i++)
    arr1[i] += arr2[i];
}

Clang with -O3 -march=skylake -fno-unroll-loops produces:

foo(int*, int*):                             # @foo(int*, int*)
        xor     eax, eax
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        vmovdqu ymm0, ymmword ptr [rdi + 4*rax]
        vpaddd  ymm0, ymm0, ymmword ptr [rsi + 4*rax]
        vmovdqu ymmword ptr [rdi + 4*rax], ymm0
        add     rax, 8
        cmp     rax, 8
        jne     .LBB0_1
        vzeroupper
        ret

GCC and ICC produce the following with the same flags:
foo(int*, int*):
        vmovdqu   ymm0, YMMWORD PTR [rdi]                       #6.5
        vpaddd    ymm1, ymm0, YMMWORD PTR [rsi]                 #6.5
        vmovdqu   YMMWORD PTR [rdi], ymm1                       #6.5
        vzeroupper                                              #7.1
        ret
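
The difference between the two listings can be sketched in C. In Clang's output the index advances by 8 and is compared against 8, so the vector body runs exactly once and the counter, compare and branch do no useful work. The sketch below is only an illustrative model, not the compiler's actual transformation: the names TRIP_COUNT, VF, vector_body, foo_with_loop, foo_straight_line and vector_iterations are hypothetical, and the inner lane loop stands in for a single 256-bit vpaddd.

```c
#include <stddef.h>

/* Hypothetical constants: 8 scalar iterations, 8 x 32-bit lanes per ymm register. */
enum { TRIP_COUNT = 8, VF = 8 };

/* The scalar loop from the report. */
void foo(int *__restrict arr1, int *__restrict arr2)
{
  for (int i = 0; i < TRIP_COUNT; i++)
    arr1[i] += arr2[i];
}

/* Stand-in for one vector load / vpaddd / vector store over VF lanes. */
static void vector_body(int *__restrict dst, const int *__restrict src)
{
  for (int lane = 0; lane < VF; lane++)
    dst[lane] += src[lane];
}

/* Model of the Clang listing: the vector body is kept inside a loop that
   steps by VF and exits once the index equals TRIP_COUNT. */
void foo_with_loop(int *__restrict arr1, int *__restrict arr2)
{
  size_t i = 0;
  do {
    vector_body(arr1 + i, arr2 + i);
    i += VF;
  } while (i != TRIP_COUNT);
}

/* Model of the GCC / ICC listing: the same body with no loop around it. */
void foo_straight_line(int *__restrict arr1, int *__restrict arr2)
{
  vector_body(arr1, arr2);
}

/* Number of times the vector body runs in foo_with_loop. */
int vector_iterations(void)
{
  return TRIP_COUNT / VF;
}
```

Since TRIP_COUNT / VF is 1, all three functions compute the same result, and the loop structure in foo_with_loop is pure overhead.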
