[llvm-bugs] [Bug 50598] New: suboptimal vectorization on function call with >10 parameters with -march=znver2

via llvm-bugs llvm-bugs at lists.llvm.org
Sun Jun 6 12:49:10 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=50598

            Bug ID: 50598
           Summary: suboptimal vectorization on function call with >10
                    parameters with -march=znver2
           Product: new-bugs
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: ehb5643 at gmail.com
                CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org

Example code can be found on [GitHub](https://github.com/Apache-HB/bench).

When benchmarking this code on a 2700X (znver2), the code LLVM generates at the
highest optimization level is on average 10% slower than GCC-generated code.

LLVM places the arguments on the stack with
```
vbroadcastsd r256, m64
vmovups m256, r256
```

which is slower on znver2 than the less vectorized sequence
```
push r64
push r64
push r64
push r64
```

With vectorization disabled, LLVM currently generates
```
mov m64, imm64
mov m64, imm64
mov m64, imm64
mov m64, imm64
```
which is 20% slower than GCC and 10% slower than the vectorized equivalent on
znver2.
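
For illustration, here is a minimal sketch of the kind of call site described above. It is not taken from the linked benchmark: the names `sum12` and `call_sum12` are hypothetical, and the parameter type and count are assumptions chosen only to exceed ten arguments. Under the Windows x64 calling convention the first four integer arguments travel in registers and the remaining ones are written to the stack, which is where a compiler may choose between scalar stores and a broadcast plus a wide store. Whether a given compiler version actually emits either sequence for this code has not been verified here.

```c
#include <stdint.h>

/* Keep the callee out of line so the call and its argument setup survive
   optimization. The attribute is GCC/Clang syntax. */
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Hypothetical callee with 12 parameters (more than 10). */
NOINLINE uint64_t sum12(uint64_t a, uint64_t b, uint64_t c, uint64_t d,
                        uint64_t e, uint64_t f, uint64_t g, uint64_t h,
                        uint64_t i, uint64_t j, uint64_t k, uint64_t l)
{
    return a + b + c + d + e + f + g + h + i + j + k + l;
}

/* Hypothetical caller: every argument is the same value, so the eight
   stack-passed arguments are candidates for a broadcast and a wide store. */
NOINLINE uint64_t call_sum12(uint64_t x)
{
    return sum12(x, x, x, x, x, x, x, x, x, x, x, x);
}
```

Compiling this with something like `clang -O3 -march=znver2 -S` and inspecting the argument setup in `call_sum12` is one way to compare the emitted sequences against the ones listed in this report.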
