[llvm-bugs] [Bug 51854] New: memset with length 2^N where N=2..7 is vectorized even with -Oz enabled

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Sep 14 08:49:25 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=51854

            Bug ID: 51854
           Summary: memset with length 2^N where N=2..7 is vectorized even
                    with -Oz enabled
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: vdsered at gmail.com
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, pengfei.wang at intel.com,
                    spatel+llvm at rotateright.com

With -Oz and -Os, Clang expands memset inline using vector stores when the
length is equal to 2^N where N=2..7. gcc, for example, does not do this. It is
probably fine to vectorize this code at -O3, but it should not be done at -Oz.

Source:
#include <string.h>

void func(int *P) {
    memset(P, 0, 128);
}
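
To look at the whole range of lengths at once, a reproducer along the following lines might be useful. This is only a sketch: the macro DEFINE_ZERO, the functions zero_4 .. zero_128, and run_checks are hypothetical names that do not appear in the original report, and the buffer type is unsigned char rather than int. Compiling it with something like `clang -Oz -S -masm=intel` should make it possible to compare the code generated for each length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper macro (not from the report): defines one function
   per constant length so each memset can be inspected separately. */
#define DEFINE_ZERO(N) \
    void zero_##N(unsigned char *p) { memset(p, 0, N); }

DEFINE_ZERO(4)
DEFINE_ZERO(8)
DEFINE_ZERO(16)
DEFINE_ZERO(32)
DEFINE_ZERO(64)
DEFINE_ZERO(128)

/* Returns 1 if the first n bytes of p are all zero, 0 otherwise. */
static int is_zeroed(const unsigned char *p, size_t n) {
    for (size_t i = 0; i < n; ++i)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Functional sanity check only; it says nothing about code size.
   Returns the number of lengths that were checked. */
int run_checks(void) {
    unsigned char buf[129];
    void (*fns[])(unsigned char *) = {zero_4,  zero_8,  zero_16,
                                      zero_32, zero_64, zero_128};
    const size_t sizes[] = {4, 8, 16, 32, 64, 128};
    int checked = 0;

    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; ++i) {
        memset(buf, 0xFF, sizeof buf);     /* poison the buffer */
        fns[i](buf);
        assert(is_zeroed(buf, sizes[i]));  /* requested bytes are cleared */
        assert(buf[sizes[i]] == 0xFF);     /* the next byte is untouched */
        ++checked;
    }
    return checked;
}
```

The check in run_checks only confirms that every variant behaves like memset; the size comparison itself has to be done on the assembly output.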


Clang's output with -Oz (trunk, https://godbolt.org/z/a6vjjxKhz):
func(int*, int):                             # @func(int*, int)
        xorps   xmm0, xmm0
        movups  xmmword ptr [rdi + 112], xmm0
        movups  xmmword ptr [rdi + 96], xmm0
        movups  xmmword ptr [rdi + 80], xmm0
        movups  xmmword ptr [rdi + 64], xmm0
        movups  xmmword ptr [rdi + 48], xmm0
        movups  xmmword ptr [rdi + 32], xmm0
        movups  xmmword ptr [rdi + 16], xmm0
        movups  xmmword ptr [rdi], xmm0
        ret


If the length is greater than 128 (256 in this example), Clang with -Oz/-Os
generates a tail call to memset instead:
func(int*, int):                             # @func(int*, int)
        mov     edx, 256
        xor     esi, esi
        jmp     memset@PLT                      # TAILCALL


With gcc at -Os, the output has the same form for any length (see
https://godbolt.org/z/1shqe319r):
func(int*, int):
        mov     ecx, X <-- X is the length in 4-byte words (rep stosd stores dwords)
        xor     eax, eax
        rep stosd
        ret
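
For readers less familiar with `rep stosd`: with the direction flag clear, it stores EAX into ECX consecutive 4-byte words starting at the address in RDI, so the count loaded into ECX is in dwords rather than bytes. A rough C model of the gcc sequence is sketched below. The names rep_stosd_model and zero_bytes_model are hypothetical, and the model assumes a length that is a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of `rep stosd` with the direction flag clear:
   store `eax` into `ecx` consecutive 32-bit words starting at `rdi`. */
static void rep_stosd_model(uint32_t *rdi, uint32_t eax, size_t ecx) {
    while (ecx != 0) {
        *rdi++ = eax;  /* stosd: store one dword, advance the pointer */
        --ecx;         /* rep: repeat until the counter reaches zero */
    }
}

/* Model of the gcc -Os sequence for memset(p, 0, len_bytes).
   Assumption: len_bytes is a multiple of 4. */
void zero_bytes_model(uint32_t *p, size_t len_bytes) {
    assert(len_bytes % 4 == 0);
    rep_stosd_model(p, 0, len_bytes / 4);  /* xor eax, eax; mov ecx, len/4 */
}
```

For the 128-byte case from the report, this corresponds to a count of 32 in ECX.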

So the expectation is that with -Os and -Oz Clang does not vectorize these
memsets and instead generates the same code as in the length > 128 case.
