[llvm-bugs] [Bug 49577] New: USRA is replaced with USHR+ORR which results in poor codegen
via llvm-bugs
llvm-bugs at lists.llvm.org
Sat Mar 13 05:12:19 PST 2021
https://bugs.llvm.org/show_bug.cgi?id=49577
Bug ID: 49577
Summary: USRA is replaced with USHR+ORR which results in poor codegen
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: AArch64
Assignee: unassignedbugs at nondot.org
Reporter: kutdanila at yandex.ru
CC: arnaud.degrandmaison at arm.com,
llvm-bugs at lists.llvm.org, smithp352 at googlemail.com,
Ties.Stuij at arm.com
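#include <arm_neon.h>

int MoveMask(uint8x16_t input)
{
    // Move the top bit of every byte down to bit 0.
    uint16x8_t high_bits = vreinterpretq_u16_u8(vshrq_n_u8(input, 7));
    // Each shift-right-accumulate packs neighbouring lanes: 2, then 4, then 8 bits.
    uint32x4_t paired16 =
        vreinterpretq_u32_u16(vsraq_n_u16(high_bits, high_bits, 7));
    uint64x2_t paired32 =
        vreinterpretq_u64_u32(vsraq_n_u32(paired16, paired16, 14));
    uint8x16_t paired64 =
        vreinterpretq_u8_u64(vsraq_n_u64(paired32, paired32, 28));
    // The low byte of each 64-bit half now holds 8 mask bits; combine them into 16.
    return vgetq_lane_u8(paired64, 0) | ((int) vgetq_lane_u8(paired64, 8) << 8);
}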
LLVM generates USHR+ORR for vsraq_n_u16 and vsraq_n_u32, whereas GCC emits a
single USRA for each:
https://gcc.godbolt.org/z/MxP63x
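A portable scalar sketch of the three shift-accumulate steps may help show why USHR+ORR computes the same value as USRA here. It is not part of the report: sra_lanes and move_mask_model are hypothetical names, and the model assumes a little-endian lane layout. The assert checks that the accumulator and the shifted value never share a set bit, so the add inside USRA gives the same result as an OR. That could be why the add is turned into an OR before instruction selection, after which USRA is no longer matched; this is an assumption, not something the report states.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for USRA with acc == x on every lane of a 64-bit half:
   lane + (lane >> shift), for lanes of `width` bits (16, 32 or 64). */
static uint64_t sra_lanes(uint64_t d, unsigned width, unsigned shift)
{
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    uint64_t out = 0;
    for (unsigned pos = 0; pos < 64; pos += width) {
        uint64_t lane = (d >> pos) & mask;
        uint64_t shifted = lane >> shift;
        /* No common set bits, so '+' and '|' give the same result. */
        assert((lane & shifted) == 0);
        out |= ((lane + shifted) & mask) << pos;
    }
    return out;
}

/* Hypothetical scalar model of MoveMask; bytes[i] plays the role of lane i. */
int move_mask_model(const uint8_t bytes[16])
{
    int result = 0;
    for (int half = 0; half < 2; ++half) {
        /* vshrq_n_u8(input, 7): keep only the top bit of each byte. */
        uint64_t d = 0;
        for (int i = 0; i < 8; ++i)
            d |= (uint64_t)(bytes[half * 8 + i] >> 7) << (8 * i);

        d = sra_lanes(d, 16, 7);   /* vsraq_n_u16(x, x, 7)  */
        d = sra_lanes(d, 32, 14);  /* vsraq_n_u32(x, x, 14) */
        d = sra_lanes(d, 64, 28);  /* vsraq_n_u64(x, x, 28) */

        /* vgetq_lane_u8(paired64, 0) and vgetq_lane_u8(paired64, 8). */
        result |= (int)(d & 0xFF) << (8 * half);
    }
    return result;
}
```

Calling move_mask_model on 16 bytes returns a 16-bit mask whose bit i is the top bit of byte i, and the assert holds at all three steps for any input.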
Also, in the code generated for the Match function there are two apparently
redundant ANDs with 0xff:
Match(unsigned char):                   // @Match(unsigned char)
        adrp    x8, ctrl
        ldr     q0, [x8, :lo12:ctrl]
        dup     v1.16b, w0
        cmeq    v0.16b, v1.16b, v0.16b
        movi    v1.16b, #1
        and     v0.16b, v0.16b, v1.16b
        ushr    v1.8h, v0.8h, #7
        orr     v0.16b, v1.16b, v0.16b
        ushr    v1.4s, v0.4s, #14
        orr     v0.16b, v1.16b, v0.16b
        usra    v0.2d, v0.2d, #28
        umov    w8, v0.b[0]
        umov    w9, v0.b[8]
        and     x0, x8, #0xff           // Not needed?
        and     x8, x9, #0xff           // Not needed?
        bfi     x0, x8, #8, #8
        ret
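The reasoning behind the "Not needed?" comments can be sketched in scalar C. The helper names umov_b, bfi, tail_with_ands and tail_without_ands are hypothetical, and the model relies on the architectural behaviour that UMOV from a byte lane zero-extends into the destination register and that BFI only reads the low `width` bits of its source. Under those assumptions both ANDs leave the values unchanged.

```c
#include <stdint.h>

/* Model of `umov Wd, Vn.b[lane]`: the byte is zero-extended, so the upper
   bits of the destination register are already zero. */
static uint64_t umov_b(const uint8_t v[16], unsigned lane)
{
    return (uint64_t)v[lane];
}

/* Model of `bfi Xd, Xn, #lsb, #width` (width < 64 assumed): copy the low
   `width` bits of src into dst at bit position lsb, keep the rest of dst. */
static uint64_t bfi(uint64_t dst, uint64_t src, unsigned lsb, unsigned width)
{
    uint64_t field = ((1ULL << width) - 1) << lsb;
    return (dst & ~field) | ((src << lsb) & field);
}

/* Tail of Match as emitted, with both ANDs. */
uint64_t tail_with_ands(const uint8_t v[16])
{
    uint64_t x0 = umov_b(v, 0) & 0xff;
    uint64_t x8 = umov_b(v, 8) & 0xff;
    return bfi(x0, x8, 8, 8);
}

/* Same tail with the ANDs dropped. */
uint64_t tail_without_ands(const uint8_t v[16])
{
    return bfi(umov_b(v, 0), umov_b(v, 8), 8, 8);
}
```

In this model the two functions return the same value for every input, which matches the suspicion in the report that the ANDs could be removed.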