[llvm-bugs] [Bug 49577] New: USRA is replaced with USHR+ORR which results in poor codegen

via llvm-bugs llvm-bugs at lists.llvm.org
Sat Mar 13 05:12:19 PST 2021


https://bugs.llvm.org/show_bug.cgi?id=49577

            Bug ID: 49577
           Summary: USRA is replaced with USHR+ORR which results in poor
                    codegen
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: AArch64
          Assignee: unassignedbugs at nondot.org
          Reporter: kutdanila at yandex.ru
                CC: arnaud.degrandmaison at arm.com,
                    llvm-bugs at lists.llvm.org, smithp352 at googlemail.com,
                    Ties.Stuij at arm.com

#include <arm_neon.h>

int MoveMask(uint8x16_t input)
{
    // Reduce every byte to its top bit (0 or 1).
    uint16x8_t high_bits = vreinterpretq_u16_u8(vshrq_n_u8(input, 7));
    // Fold neighbouring lanes together with shift-right-accumulate.
    uint32x4_t paired16 =
        vreinterpretq_u32_u16(vsraq_n_u16(high_bits, high_bits, 7));
    uint64x2_t paired32 =
        vreinterpretq_u64_u32(vsraq_n_u32(paired16, paired16, 14));
    uint8x16_t paired64 =
        vreinterpretq_u8_u64(vsraq_n_u64(paired32, paired32, 28));
    // Byte 0 holds the mask for lanes 0-7, byte 8 the mask for lanes 8-15.
    return vgetq_lane_u8(paired64, 0) |
           ((int) vgetq_lane_u8(paired64, 8) << 8);
}

For vsraq_n_u16 and vsraq_n_u32, LLVM generates USHR+ORR instead of a
single USRA, which is what GCC emits:

https://gcc.godbolt.org/z/MxP63x
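
To make the USHR+ORR form easier to reason about, here is a rough scalar sketch of one 64-bit half of the MoveMask chain in portable C. It is not from the report: the helper names movemask_half_model and movemask_model are hypothetical, and the model assumes the little-endian lane layout used on AArch64. The asserts check that at every step the shifted value shares no set bits with the accumulator, so add and OR give the same result. A plausible reading (an assumption, the report does not say so) is that this property is what lets the compiler turn the add into an OR, after which the shift-and-accumulate pattern no longer matches USRA.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 64-bit half of MoveMask (hypothetical helper). */
static uint8_t movemask_half_model(const uint8_t bytes[8])
{
    uint64_t v = 0;

    /* vshrq_n_u8(input, 7): keep only the top bit of every byte. */
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)(bytes[i] >> 7) << (8 * i);

    /* vsraq_n_u16(x, x, 7): per 16-bit lane, x + (x >> 7).
       The mask drops bits that would cross a lane boundary. */
    uint64_t s16 = (v >> 7) & 0x01FF01FF01FF01FFULL;
    assert((v & s16) == 0);   /* disjoint bits: add == or */
    v += s16;

    /* vsraq_n_u32(x, x, 14): per 32-bit lane, x + (x >> 14). */
    uint64_t s32 = (v >> 14) & 0x0003FFFF0003FFFFULL;
    assert((v & s32) == 0);
    v += s32;

    /* vsraq_n_u64(x, x, 28): single 64-bit lane, x + (x >> 28). */
    uint64_t s64 = v >> 28;
    assert((v & s64) == 0);
    v += s64;

    /* vgetq_lane_u8(..., 0) or (..., 8): lowest byte of the lane. */
    return (uint8_t)v;
}

/* Combine both halves the same way the return statement of MoveMask does. */
static int movemask_model(const uint8_t input[16])
{
    return movemask_half_model(input) |
           ((int)movemask_half_model(input + 8) << 8);
}
```

Bit i of the result is the top bit of input byte i, so the model can be used to sanity-check a rewritten NEON version on a machine without AArch64 hardware.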

Also, in the Match function there are two redundant ANDs with 0xff:

Match(unsigned char):                              // @Match(unsigned char)
        adrp    x8, ctrl
        ldr     q0, [x8, :lo12:ctrl]
        dup     v1.16b, w0
        cmeq    v0.16b, v1.16b, v0.16b
        movi    v1.16b, #1
        and     v0.16b, v0.16b, v1.16b
        ushr    v1.8h, v0.8h, #7
        orr     v0.16b, v1.16b, v0.16b
        ushr    v1.4s, v0.4s, #14
        orr     v0.16b, v1.16b, v0.16b
        usra    v0.2d, v0.2d, #28
        umov    w8, v0.b[0]
        umov    w9, v0.b[8]
        and     x0, x8, #0xff // Not needed?
        and     x8, x9, #0xff // Not needed?
        bfi     x0, x8, #8, #8
        ret
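
As a rough illustration of why the two ANDs look unnecessary, the tail of the listing can be modelled in portable C. This is only a sketch: umov_b, bfi, tail_with_ands and tail_without_ands are hypothetical helpers that mimic the documented behaviour of UMOV (byte lane, zero-extended) and BFI (bitfield insert). Since UMOV already zero-extends the byte, and BFI only reads the low 8 bits of its source here, dropping both ANDs should not change the result.

```c
#include <stdint.h>

/* umov wN, v.b[i]: copy one byte lane, zero-extended into the register. */
static uint64_t umov_b(const uint8_t lanes[16], int i)
{
    return lanes[i];
}

/* bfi xd, xn, #lsb, #width: insert the low `width` bits of xn into xd
   at bit position lsb. This sketch assumes width < 64. */
static uint64_t bfi(uint64_t xd, uint64_t xn, unsigned lsb, unsigned width)
{
    uint64_t mask = ((1ULL << width) - 1) << lsb;
    return (xd & ~mask) | ((xn << lsb) & mask);
}

/* Tail of the listing as emitted, including both ANDs. */
static uint64_t tail_with_ands(const uint8_t lanes[16])
{
    uint64_t x8 = umov_b(lanes, 0);
    uint64_t x9 = umov_b(lanes, 8);
    uint64_t x0 = x8 & 0xff;
    x8 = x9 & 0xff;
    return bfi(x0, x8, 8, 8);
}

/* Same tail with the two ANDs removed. */
static uint64_t tail_without_ands(const uint8_t lanes[16])
{
    uint64_t x0 = umov_b(lanes, 0);
    uint64_t x9 = umov_b(lanes, 8);
    return bfi(x0, x9, 8, 8);
}
```

Both variants return lane 0 in bits 0-7 and lane 8 in bits 8-15 for every possible pair of byte values.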
