[PATCH] ARM NEON: Handle v16i8 and v8i16 reverse shuffles
Renato Golin
renato.golin at linaro.org
Sat Feb 9 11:31:31 PST 2013
Looks good.
On 9 February 2013 19:02, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> Lower reverse shuffles to a vrev64 and a vext instruction instead of the
> default legalization of storing and loading to the stack. This is important
> because we generate reverse shuffles in the loop vectorizer when we reverse
> store to an array.
>
> uint8_t Arr[N];
> for (i = 0; i < N; ++i)
> Arr[N - i - 1] = …
>
> For v8i16 we now generate something like:
>
> vrev64.16 q9, q9
> vext.16 q9, q9, q9, #4
>
> instead of:
>
> orr r1, r0, #14
> vst1.16 {d16[0]}, [r1, :16]
> orr r1, r0, #12
> vst1.16 {d16[1]}, [r1, :16]
> orr r1, r0, #10
> vst1.16 {d16[2]}, [r1, :16]
> orr r1, r0, #8
> vst1.16 {d16[3]}, [r1, :16]
> orr r1, r0, #6
> vst1.16 {d17[0]}, [r1, :16]
> orr r1, r0, #4
> vst1.16 {d17[1]}, [r1, :16]
> orr r1, r0, #2
> vst1.16 {d17[2]}, [r1, :16]
> vst1.16 {d17[3]}, [r0, :16]
> vld1.64 {d16, d17}, [r0, :128]
>
>
> For v16i8 we now generate something like:
>
> vrev64.8 q8, q8
> vext.8 q8, q8, q8, #8
>
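
To see why the two-instruction sequence quoted above yields a full reverse, here is a small scalar model in C. It is only a sketch, not the patch's lowering code: the function names (model_vrev64, model_vext_same, reverse_via_vrev64_vext) are hypothetical, and each lane is held in a uint16_t regardless of the real element width. vrev64 reverses the lanes within each 64-bit half of the q register, and vext with both source operands equal and an immediate of half the lane count swaps the two halves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of vrev64.<size> on a 128-bit q register:
   reverse the lanes inside each 64-bit half.
   lanes is the total lane count: 8 for v8i16, 16 for v16i8. */
static void model_vrev64(uint16_t *v, int lanes) {
    assert(lanes == 8 || lanes == 16);
    int half = lanes / 2;
    for (int h = 0; h < 2; ++h) {
        uint16_t *p = v + h * half;
        for (int i = 0; i < half / 2; ++i) {
            uint16_t t = p[i];
            p[i] = p[half - 1 - i];
            p[half - 1 - i] = t;
        }
    }
}

/* Hypothetical model of vext.<size> qd, qn, qn, #imm where both sources
   are the same register: result[i] = v[(i + imm) % lanes]. */
static void model_vext_same(uint16_t *v, int lanes, int imm) {
    uint16_t tmp[16];
    assert(lanes == 8 || lanes == 16);
    for (int i = 0; i < lanes; ++i)
        tmp[i] = v[(i + imm) % lanes];
    memcpy(v, tmp, (size_t)lanes * sizeof v[0]);
}

/* vrev64 followed by vext #lanes/2, i.e. #4 for v8i16 and #8 for v16i8. */
static void reverse_via_vrev64_vext(uint16_t *v, int lanes) {
    model_vrev64(v, lanes);
    model_vext_same(v, lanes, lanes / 2);
}
```

For lanes 0..7, the vrev64 step gives 3 2 1 0 7 6 5 4 and the vext #4 step then gives 7 6 5 4 3 2 1 0. The same reasoning applies to the v16i8 case with #8.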