[llvm-bugs] [Bug 50256] New: vectorize widening instructions

via llvm-bugs llvm-bugs at lists.llvm.org
Fri May 7 01:59:36 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=50256

            Bug ID: 50256
           Summary: vectorize widening instructions
           Product: new-bugs
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: sjoerd.meijer at arm.com
                CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org

GCC11 learned a new trick[1] and is now able to vectorise widening instructions
much better. For completeness, the example[2] is copied here:

  void wide1(char * __restrict a, short *__restrict b, int n) {
    for (int x = 0; x < 16; x++)
      b[x] = a[x] << 8;
  }

GCC11 generates:

        ldr     q0, [x0]
        shll    v1.8h, v0.8b, 8
        shll2   v0.8h, v0.16b, 8
        stp     q1, q0, [x1]
        ret

whereas with trunk we generate:

        ldrb    w8, [x0]
        ldrb    w9, [x0, #1]
        ldrb    w10, [x0, #15]
        lsl     w8, w8, #8
        strh    w8, [x1]
        ldrb    w8, [x0, #2]
        lsl     w9, w9, #8
        strh    w9, [x1, #2]
        ldrb    w9, [x0, #3]
        lsl     w8, w8, #8
        strh    w8, [x1, #4]
        ldrb    w8, [x0, #4]
        lsl     w9, w9, #8
        strh    w9, [x1, #6]
        ldrb    w9, [x0, #5]
        lsl     w8, w8, #8
        strh    w8, [x1, #8]
        ldrb    w8, [x0, #6]
        lsl     w9, w9, #8
        strh    w9, [x1, #10]
        ldrb    w9, [x0, #7]
        lsl     w8, w8, #8
        strh    w8, [x1, #12]
        ldrb    w8, [x0, #8]
        lsl     w9, w9, #8
        strh    w9, [x1, #14]
        ldrb    w9, [x0, #9]
        lsl     w8, w8, #8
        strh    w8, [x1, #16]
        ldrb    w8, [x0, #10]
        lsl     w9, w9, #8
        strh    w9, [x1, #18]
        ldrb    w9, [x0, #11]
        lsl     w8, w8, #8
        strh    w8, [x1, #20]
        ldrb    w8, [x0, #12]
        lsl     w9, w9, #8
        strh    w9, [x1, #22]
        ldrb    w9, [x0, #13]
        lsl     w8, w8, #8
        strh    w8, [x1, #24]
        ldrb    w8, [x0, #14]
        lsl     w9, w9, #8
        strh    w9, [x1, #26]
        lsl     w9, w10, #8
        lsl     w8, w8, #8
        strh    w8, [x1, #28]
        strh    w9, [x1, #30]
        ret

We completely unroll this loop very early, and then fail to loop- or
SLP-vectorise the result (I haven't looked into this yet, so I don't know yet
which one).
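
For readers less familiar with the two instructions in the GCC11 output, here
is a rough portable C model of what that sequence computes. It is only a
sketch, not what GCC or LLVM emit: the names wide1_ref, shll_model,
wide1_model and models_agree are made up for illustration, and the model uses
uint8_t/uint16_t rather than char/short so that the shift is well defined on
any host. Because the shift amount equals the source element width, the low 16
bits of the result are the same whether the byte is zero- or sign-extended.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference: the loop from the report, written with fixed-width
   unsigned types (an assumption made here for portability). */
static void wide1_ref(const uint8_t *restrict a, uint16_t *restrict b) {
    for (size_t x = 0; x < 16; x++)
        b[x] = (uint16_t)((unsigned)a[x] << 8);
}

/* Hypothetical model of one widening shift: 8 byte lanes starting at
   first_lane become 8 halfword lanes, each shifted left by 8. */
static void shll_model(const uint8_t v[16], size_t first_lane,
                       uint16_t out[8]) {
    for (size_t i = 0; i < 8; i++)
        out[i] = (uint16_t)((unsigned)v[first_lane + i] << 8);
}

/* Model of the GCC11 sequence: one 16-byte load, a widening shift of the
   low 8 lanes, a widening shift of the high 8 lanes, then both stores. */
static void wide1_model(const uint8_t *restrict a, uint16_t *restrict b) {
    uint8_t v0[16];
    for (size_t i = 0; i < 16; i++)
        v0[i] = a[i];               /* ldr   q0, [x0]           */
    shll_model(v0, 0, b);           /* shll  v1.8h, v0.8b, 8    */
    shll_model(v0, 8, b + 8);       /* shll2 v0.8h, v0.16b, 8   */
}

/* Returns 1 if the reference loop and the model agree on a sample input. */
static int models_agree(void) {
    uint8_t a[16];
    uint16_t ref[16], got[16];
    for (size_t i = 0; i < 16; i++)
        a[i] = (uint8_t)(i * 17 + 3);
    wide1_ref(a, ref);
    wide1_model(a, got);
    for (size_t i = 0; i < 16; i++)
        if (ref[i] != got[i])
            return 0;
    return 1;
}
```

Calling models_agree() from a small test driver is enough to check that the
two-instruction form and the 16-iteration scalar loop compute the same values.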

[1] https://community.arm.com/developer/tools-software/tools/b/tools-software-ides-blog/posts/performance-improvements-in-gcc-11
[2] https://godbolt.org/z/KPe6xjfed
