[llvm-bugs] [Bug 43495] New: [AARCH64] Clang should use xtn/shrn for shuffles
via llvm-bugs
llvm-bugs at lists.llvm.org
Sat Sep 28 21:34:28 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=43495
Bug ID: 43495
Summary: [AARCH64] Clang should use xtn/shrn for shuffles
Product: libraries
Version: 9.0
Hardware: Other
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: AArch64
Assignee: unassignedbugs at nondot.org
Reporter: husseydevin at gmail.com
CC: arnaud.degrandmaison at arm.com,
llvm-bugs at lists.llvm.org, peter.smith at linaro.org,
Ties.Stuij at arm.com
On AArch64, shuffles suck. Not only do you need to split them into more
instructions and lean on ext more, they are also much slower than on ARMv7-A.
uint32x2_t get_even_lanes(uint32x4_t x)
{
    return __builtin_shufflevector(x, x, 0, 2);
}

uint32x2_t get_odd_lanes(uint32x4_t x)
{
    return __builtin_shufflevector(x, x, 1, 3);
}

uint32x2x2_t vzip_pairwise(uint32x4_t x)
{
    return vzip_u32(vget_low_u32(x), vget_high_u32(x));
}
Clang-9 emits this:
get_even_lanes:
        ext     v1.16b, v0.16b, v0.16b, #8
        zip1    v0.2s, v0.2s, v1.2s
        ret
get_odd_lanes:
        ext     v1.16b, v0.16b, v0.16b, #8
        zip2    v0.2s, v0.2s, v1.2s
        ret
vzip_pairwise:
        ext     v1.16b, v0.16b, v0.16b, #8
        zip1    v2.2s, v0.2s, v1.2s
        zip2    v1.2s, v0.2s, v1.2s
        mov     v0.16b, v2.16b
        ret
This is garbage. It is significantly better to do this instead:
get_even_lanes:
        xtn     v0.2s, v0.2d
        ret
get_odd_lanes:
        shrn    v0.2s, v0.2d, #32
        ret
vzip_pairwise:
        shrn    v1.2s, v0.2d, #32
        xtn     v0.2s, v0.2d
        ret
Side note: on 32-bit, if we can clobber the source, vshrn+vmovn and
vrevNq+vuzpq (using only one result) both save an in-place vzip, and are the
best 0213 shuffle aside from vld2.