[PATCH] D12635: merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711)

Fri Sep 25 16:54:25 PDT 2015

ab added inline comments.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:809-812
@@ +808,6 @@
+
+            // Code that uses clang vector extensions can mark that it
+            // wants unaligned accesses to be treated as fast by
+            // underspecifying alignment to be 1 or 2.
+            Align <= 2 ||
+
----------------
scanon wrote:
> arsenm wrote:
> > ab wrote:
> > > spatel wrote:
> > > > arsenm wrote:
> > > > > Which extensions do you mean? I've been looking for a way to specify alignment of vector loads from C, but nothing I've tried seems to work.
> > > > > 
> > > > > However, using the existence of this to justify reporting a different alignment as fast seems suspect.
> > > > I agree that this looks hacky (along with the comment about optimizing for benchmarks), but the comments and code are copied directly from the existing performSTORECombine() (see around line 8476). 
> > > > 
> > > > I don't want to alter the existing Aarch logic for this patch (other than to fix the obviously broken allowsMisalignedMemoryAccesses() implementation to allow the vector merging in DAGCombiner).
> > > I guess this refers to something like:
> > > 
> > > ```
> > > typedef int __attribute__((ext_vector_type(4))) v4i32;
> > > typedef v4i32 __attribute__((aligned(2))) v4i32_a2;
> > > 
> > > v4i32 foo(v4i32 *p) {
> > >   v4i32_a2 *p2 = p;
> > >   return *p2;
> > > }
> > > ```
> > > 
> > > Interestingly, this generates a naturally aligned load:
> > > 
> > > ```
> > > typedef int __attribute__((ext_vector_type(4))) v4i32;
> > > typedef v4i32 __attribute__((aligned(2))) v4i32_a2;
> > > 
> > > v4i32 foo(v4i32 *p) {
> > >   return *(v4i32_a2 *)p;
> > > }
> > > ```
> > Looks like a bug to me
> That's expected, because p has 16B alignment; that alignment doesn't go away just because you cast the pointer to a less-aligned type.
> 
> To get an unaligned load, you do something like:
> 
> ```
> typedef int __attribute__((ext_vector_type(4))) v4i32;
> typedef int __attribute__((ext_vector_type(4), aligned(4))) v4i32_a4;
> v4i32 foo(const int *p) {
>   return *(const v4i32_a4 *)p;
> }
> ```
> That's expected, because p has 16B alignment; that alignment doesn't go away just because you cast the pointer to a less-aligned type.

You're right, but can't we argue the opposite? If I want to make the alignment go away, isn't attribute((aligned)) the best (or at least a very obvious) tool at my disposal?

In any case, I guess we should most importantly match gcc's behavior, and from what I see this changed since ~r246705, possibly as unintentional fallout of the recent alignment tracking improvements: filed PR24944 to look into this.

Repository:
  rL LLVM

http://reviews.llvm.org/D12635