[PATCH] D10905: move DAGCombiner's allowableAlignment() helper function into the TLI

Tue Jul 7 13:24:15 PDT 2015

================
Comment at: lib/CodeGen/TargetLoweringBase.cpp:1547
@@ +1546,3 @@
+  // aligned access of this type is allowed too.
+  if (allowsMisalignedMemoryAccesses(VT, AddrSpace, Alignment, Fast))
+    return true;
----------------
spatel wrote:
> jyknight wrote:
> > This won't return the right result in Fast.
> > 
> > E.g. allowsMisalignedMemoryAccesses might return with `{ *Fast = false; return true; }`, which would then cause this function to claim that a properly aligned memory access is not fast.
> > 
> > That's obviously wrong, but I'm not really sure what Right is.
> > 
> > What's the actual contract for the ABI alignment and Preferred alignment information in DataLayout?
> > 
> > I think there's three things that LLVM might want to know about alignment and memory accesses:
> > 
> > 1) If I can choose any alignment for this data, what should it be? (getPrefTypeAlignment should return that, I think)
> > 
> > 2) Is this memory access guaranteed to work *AT ALL*, no matter the speed. I think that's a combination of getABITypeAlignment, and then allowsMisalignedMemoryAccesses without checking "Fast".
> > 
> >  Although: is it always guaranteed that the ABI alignment for a type is always okay to load/store from? It'd be sane for it to be, but it's not really totally crazy to have a target where that's not true: e.g. ABI alignment of 32bits for an i64, but requires 64bit alignment for an i64 load/store?) I guess that's not the case of any existing target, since LLVM doesn't support it.
> > 
> > 3) Is it worthwhile to combine some memory accesses into a larger memory access, given alignment of X? I'm not sure this is actually adequately expressed in LLVM right now. Could be it's intended to be getPrefTypeAlignment and allowsMisalignedMemoryAccesses (with testing Fast==true) is supposed to be this.
> > 
> >  But, is "preferred alignment" and "fast to access" supposed to always be the same thing? Because, e.g. MIPS says i8:8:32-i16:16:32-i64:64 -- so both i8 and i16 have abi alignment of their size, but preferred alignment of 32 bits. I don't know a lot about MIPS, so I don't know why it claims a preferred alignment of 32bits, but I think it has a perfectly good 1/2-byte load instructions that work at their natural alignment.
> > 
> >  So on MIPS (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access allowed), it seems like it should always be an improvement to merge 8-bit loads which are 16-bit aligned into a 16-bit load. Yet, going by getPrefTypeAlignment would say that's not ok.
> Hi James -
> 
> Thanks for the close review - much appreciated!
> 
> I can certainly fix the Fast bug that you noted, but I also don't know what the right answer is. 
> 
> My guess is that the MIPS DataLayout is wrong. Let's see if we can find out...
> 
> The 32-bit preferred alignment for MIPS i8/i16 was introduced way back here:
> http://reviews.llvm.org/rL53585 
> 
> Adding Bruno (author) and Daniel (currently listed as owner of the MIPS backend).
> 
I'm afraid I'm not sure why we prefer 32-bit alignment for i8 and i16. My best guess is that it pays off in library calls by allowing better-optimised memcpy/strcpy/strcmp/etc.

> I don't know a lot about MIPS, so I don't know why it claims a preferred alignment of 32bits, 
> but I think it has a perfectly good 1/2-byte load instructions that work at their natural alignment. 

That's right. As far as I know they're usually (possibly always) the same latency too.

> ... (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access 
> allowed) ...

That's right. 32r6/64r6 dropped the unaligned load/store left/right instruction pairs in favour a promise that something in the system will make unaligned load/store work without special instructions. Depending on the MIPS implementation, the 'something' could be full hardware support, or a software exception handler, or a hybrid such as hardware-support for unaligned accesses inside a cache-line and software otherwise.

> So on MIPS (even pre-MIPS32r6/MIPS64r6 which apparently made unaligned access 
> allowed), it seems like it should always be an improvement to merge 8-bit loads which are 
> 16-bit aligned into a 16-bit load. Yet, going by getPrefTypeAlignment would say that's not ok.

As far as the load is concerned, I agree it's always an improvement to merge like this. The risk is that manipulating the data might cost more than the saving.

It's worth mentioning that some MIPS implementations will do this optimization in hardware without the risk of needing additional data manipulation.

http://reviews.llvm.org/D10905