[PATCH] D12635: merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711)
Ahmed Bougacha via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 25 11:44:18 PDT 2015
ab added a subscriber: ab.
ab accepted this revision.
ab added a reviewer: ab.
ab added a comment.
This revision is now accepted and ready to land.
LGTM
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:809-812
@@ +808,6 @@
+
+ // Code that uses clang vector extensions can mark that it
+ // wants unaligned accesses to be treated as fast by
+ // underspecifying alignment to be 1 or 2.
+ Align <= 2 ||
+
----------------
spatel wrote:
> arsenm wrote:
> > Which extensions do you mean? I've been looking for a way to specify alignment of vector loads from C, but nothing I've tried seems to work.
> >
> > However, using the existence of this to justify reporting a different alignment as fast seems suspect.
> I agree that this looks hacky (along with the comment about optimizing for benchmarks), but the comments and code are copied directly from the existing performSTORECombine() (see around line 8476).
>
> I don't want to alter the existing AArch64 logic for this patch (other than to fix the obviously broken allowsMisalignedMemoryAccesses() implementation to allow the vector merging in DAGCombiner).
I guess this refers to something like:
```
typedef int __attribute__((ext_vector_type(4))) v4i32;
typedef v4i32 __attribute__((aligned(2))) v4i32_a2;
v4i32 foo(v4i32 *p) {
  v4i32_a2 *p2 = p;
  return *p2;
}
```
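Interestingly, the variant below, which casts inside the dereference instead of going through a separate pointer variable, generates a naturally aligned load: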
```
typedef int __attribute__((ext_vector_type(4))) v4i32;
typedef v4i32 __attribute__((aligned(2))) v4i32_a2;
v4i32 foo(v4i32 *p) {
  return *(v4i32_a2 *)p;
}
```
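For anyone who wants to try the under-aligned typedef outside the compiler, here is a minimal sketch, not taken from the patch. It assumes the GCC-compatible `vector_size(16)` spelling in place of `ext_vector_type(4)` so that it builds with both clang and GCC, and the helper name `roundtrip_sum_at_offset2` is hypothetical. It stores a vector at an address that is only 2-byte aligned and reads it back through the `aligned(2)` type, so the compiler has to treat the accesses as potentially misaligned:
```c
#include <stdlib.h>

/* Assumption: vector_size(16) is used here instead of the review's
   ext_vector_type(4) so the sketch compiles with GCC as well as clang. */
typedef int v4i32 __attribute__((vector_size(16)));

/* Same vector type, declared with only 2-byte alignment. */
typedef v4i32 v4i32_a2 __attribute__((aligned(2)));

/* Hypothetical helper: writes {a, b, c, d} to a 2-byte-aligned address
   through the under-aligned type, loads it back the same way, and
   returns the sum of the four lanes (or -1 if allocation fails). */
int roundtrip_sum_at_offset2(int a, int b, int c, int d) {
    unsigned char *raw = malloc(sizeof(v4i32) + 2);
    if (!raw)
        return -1;

    /* malloc returns suitably aligned storage, so raw + 2 is 2-byte
       aligned but not naturally aligned for a 16-byte vector. */
    v4i32_a2 *p = (v4i32_a2 *)(raw + 2);

    v4i32 in = {a, b, c, d};
    *p = in;        /* store through the under-aligned type */
    v4i32 out = *p; /* load through the under-aligned type */

    int sum = out[0] + out[1] + out[2] + out[3];
    free(raw);
    return sum;
}
```
Calling `roundtrip_sum_at_offset2(1, 2, 3, 4)` exercises both the store and the load. Comparing the generated assembly for this helper against a version that uses plain `v4i32 *` is one way to check whether the alignment hint actually reaches the backend.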
http://reviews.llvm.org/D12635