[llvm-dev] [arm, aarch64] Alignment checking in interleaved access pass

Alina Sbirlea via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 19 13:52:59 PDT 2016


Hi,

As a follow up to Patch D23646 <https://reviews.llvm.org/D23646>, I'm
trying to figure out if there should be an alignment check and what the
correct approach is.

Some background:
For stores, the pass turns:
%i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1,
                 <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
store <12 x i32> %i.vec, <12 x i32>* %ptr
Into:
%sub.v0 = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 1, 2, 3>
%sub.v1 = shuffle <8 x i32> %v0, <8 x i32> %v1, <4, 5, 6, 7>
%sub.v2 = shuffle <8 x i32> %v0, <8 x i32> %v1, <8, 9, 10, 11>
call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr)

The purpose of the above patch is to enable more general patterns such as
turning:
%i.vec = shuffle <32 x i32> %v0, <32 x i32> %v1,
                <4, 32, 16, 5, 33, 17, 6, 34, 18, 7, 35, 19>
store <12 x i32> %i.vec, <12 x i32>* %ptr
Into:
%sub.v0 = shuffle <32 x i32> %v0, <32 x i32> %v1, <4, 5, 6, 7>
%sub.v1 = shuffle <32 x i32> %v0, <32 x i32> %v1, <32, 33, 34, 35>
%sub.v2 = shuffle <32 x i32> %v0, <32 x i32> %v1, <16, 17, 18, 19>
call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr)
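
To make the generalized pattern concrete, here is a minimal sketch of how
such a mask could be recognized. It is not the code from D23646; the name
interleaveStarts and its signature are hypothetical, and undef mask elements
are not handled. Given a mask and an interleave factor, it reports the first
source index of each sub-vector (4, 32, 16 for the mask above).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, for illustration only (not the code from D23646).
// Returns true and fills `starts` if `mask` has the form
//   <s0, s1, ..., s0+1, s1+1, ..., s0+(lanes-1), s1+(lanes-1), ...>
// for `factor` interleaved sub-vectors; returns false otherwise.
bool interleaveStarts(const std::vector<int> &mask, unsigned factor,
                      std::vector<int> &starts) {
  assert(factor > 0 && "interleave factor must be positive");
  starts.clear();
  if (mask.empty() || mask.size() % factor != 0)
    return false;

  const unsigned lanes = static_cast<unsigned>(mask.size()) / factor;

  // The first `factor` mask elements are the candidate start indices.
  std::vector<int> candidates(mask.begin(), mask.begin() + factor);
  for (int start : candidates)
    if (start < 0) // Assumption: undef (negative) elements are rejected.
      return false;

  // Each following group of `factor` elements must advance every start by 1.
  for (unsigned lane = 0; lane < lanes; ++lane)
    for (unsigned field = 0; field < factor; ++field)
      if (mask[lane * factor + field] !=
          candidates[field] + static_cast<int>(lane))
        return false;

  starts = candidates;
  return true;
}
```

With factor 3, the original mask <0, 4, 8, 1, 5, 9, ...> gives starts 0, 4, 8,
and the generalized mask gives 4, 32, 16; each start is then the base of one
sub-vector shuffle.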

The question I'm trying to get answered is whether there should have been
an alignment check in the original pass and, similarly, whether there
should be an expanded one for the more general pattern.
In the example above, I was looking to check whether the data at positions
4, 16, 32 is aligned, but I cannot get a clear picture of the impact on
performance (hence the side question below).
Also, some preliminary alignment checks I added break some ARM tests (but
not their AArch64 counterparts). The cause is that
allowsMisalignedMemoryAccesses reports the access as "not fast", which
comes from its hasV7Ops check.
I'd appreciate some guidance on how best to address and analyze this.
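
As an illustration of what a check on "positions 4, 16, 32" could look like,
here is a small sketch under one assumed reading of "aligned": each
sub-vector's start index is a multiple of the sub-vector lane count, so the
sub-vector does not straddle a register-sized chunk of the wide source. The
name startsAreLaneAligned is hypothetical, and whether this is the right
condition is exactly the open question.

```cpp
#include <cassert>
#include <vector>

// Hypothetical check, for illustration only: every sub-vector start index
// must be a non-negative multiple of the number of lanes per sub-vector.
bool startsAreLaneAligned(const std::vector<int> &starts, unsigned lanes) {
  assert(lanes > 0 && "sub-vector lane count must be positive");
  for (int start : starts)
    if (start < 0 || static_cast<unsigned>(start) % lanes != 0)
      return false;
  return true;
}
```

For the example above (starts 4, 32, 16 with 4 lanes per sub-vector) this
check passes; a start such as 33 would fail it.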

Side question for Tim and other ARM folks: could I get a recommendation on
reading material for performance tuning on the different ARM architectures?

Thank you,
Alina