[PATCH] ARM NEON: Handle v16i8 and v8i16 reverse shuffles
Arnold Schwaighofer
aschwaighofer at apple.com
Sat Feb 9 11:02:38 PST 2013
Lower reverse shuffles to a vrev64 followed by a vext instruction instead of
falling back to the default legalization, which stores to and reloads from the
stack. This matters because the loop vectorizer generates reverse shuffles
when a loop stores to an array in reverse order, for example:
    uint8_t Arr[N];
    for (i = 0; i < N; ++i)
      Arr[N - i - 1] = …
For v8i16 we now generate something like:
    vrev64.16 q9, q9
    vext.16   q9, q9, q9, #4
instead of:
    orr     r1, r0, #14
    vst1.16 {d16[0]}, [r1, :16]
    orr     r1, r0, #12
    vst1.16 {d16[1]}, [r1, :16]
    orr     r1, r0, #10
    vst1.16 {d16[2]}, [r1, :16]
    orr     r1, r0, #8
    vst1.16 {d16[3]}, [r1, :16]
    orr     r1, r0, #6
    vst1.16 {d17[0]}, [r1, :16]
    orr     r1, r0, #4
    vst1.16 {d17[1]}, [r1, :16]
    orr     r1, r0, #2
    vst1.16 {d17[2]}, [r1, :16]
    vst1.16 {d17[3]}, [r0, :16]
    vld1.64 {d16, d17}, [r0, :128]
For v16i8 we now generate something like:
    vrev64.8 q8, q8
    vext.8   q8, q8, q8, #8
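
To see why the two-instruction sequence yields a full reversal, here is a
rough scalar model in C. It is only a sketch and is not taken from the patch:
the helper names (model_vrev64, model_vext_same, model_reverse) are
hypothetical, and lanes are held in a plain uint16_t array whatever the real
element width is. vrev64 reverses the lanes inside each 64-bit half of the
q register; vext with both source operands equal and an immediate of half the
lane count then swaps the two halves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of vrev64 on a 128-bit register with `lanes` elements:
   reverse the element order inside each 64-bit half independently. */
static void model_vrev64(uint16_t *v, int lanes) {
    int per_half = lanes / 2; /* elements per 64-bit doubleword */
    for (int h = 0; h < 2; ++h) {
        uint16_t *d = v + h * per_half;
        for (int i = 0; i < per_half / 2; ++i) {
            uint16_t tmp = d[i];
            d[i] = d[per_half - 1 - i];
            d[per_half - 1 - i] = tmp;
        }
    }
}

/* Model of vext qd, qn, qm, #imm for the special case qn == qm:
   result lane i is lane (i + imm) of the register concatenated with
   itself, i.e. a rotation by imm lanes. */
static void model_vext_same(uint16_t *v, int lanes, int imm) {
    uint16_t tmp[16];
    assert(lanes <= 16 && imm >= 0 && imm < lanes);
    for (int i = 0; i < lanes; ++i)
        tmp[i] = v[(i + imm) % lanes];
    memcpy(v, tmp, (size_t)lanes * sizeof tmp[0]);
}

/* The sequence from this patch: vrev64, then vext by half the lanes
   (#4 for v8i16, #8 for v16i8). */
static void model_reverse(uint16_t *v, int lanes) {
    model_vrev64(v, lanes);
    model_vext_same(v, lanes, lanes / 2);
}
```

For v8i16, lanes 0 1 2 3 4 5 6 7 become 3 2 1 0 7 6 5 4 after the vrev64 step
and 7 6 5 4 3 2 1 0 after the vext step. Calling model_reverse(v, 16) models
the v16i8 case in the same way.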
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ARM-NEON-Handle-v16i8-and-v8i16-reverse-shuffles.patch
Type: application/octet-stream
Size: 5237 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130209/7abdeb9c/attachment.obj>