[PATCH] D94964: [LangRef] Describe memory layout for vectors types

Wed Mar 17 09:53:27 PDT 2021

bjope added inline comments.

================
Comment at: llvm/docs/LangRef.rst:3226
+
+When ``<N*M>`` isn't evenly divisible by the byte size the memory layout is
+unspecified.
----------------
dmgreen wrote:
> bjope wrote:
> > dmgreen wrote:
> > > Does this apply to a v4i1? I thought that worked the same way as any other i1 type. The defined bits end up in the MSBs.
> > 
> > I did not know about such rules for i1 (or other non-byte-sized first class types). Is that really specified somewhere?
> > 
> > The description for `store`, https://llvm.org/docs/LangRef.html#store-instruction , says that "When writing a value of a type like i20 with a size that is not an integral number of bytes, it is unspecified what happens to the extra bits that do not belong to the type, but they will typically be overwritten.". That is not really saying anything about where the padding bits are placed either. I've assumed that the placement is unspecified as well (as I've never seen any definition).
> I may be wrong about the MSB. It will already be used in certain parts of LLVM though: if we have a <X x i1> masked load that is scalarized, it will bitcast the predicate to an iX.
> 
> Alive defines it like this:
> https://alive2.llvm.org/ce/z/w5BhQa
> 
> Which llc seems to agree with, from the mov r0, #4:
> https://godbolt.org/z/7Tx6aM
> 
> But it was the masked load scalarization that was being fixed in D94765, so it may be worth pinning down the meaning.
I don't think the result from a backend (or even Alive in this case) really says whether it is defined in the IR. I believe a target is likely to define where the padding goes when loading/storing non-byte-sized types, but LLVM does not know about it. Transformations on LLVM IR should therefore be extra careful when handling types whose "type size" and "type store size" differ (several passes, for example, use `DataLayout::typeSizeEqualsStoreSize` to avoid certain transformations).
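
To make the "type size" versus "type store size" distinction concrete, here is a minimal sketch of the arithmetic, assuming 8-bit bytes. The helper names are hypothetical and are not the LLVM API; they only mirror the comparison that `DataLayout::typeSizeEqualsStoreSize` performs.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM's DataLayout): BitWidth is the number of
// bits the type itself occupies, e.g. 4 for <4 x i1> or 20 for i20.
constexpr std::uint64_t typeSizeInBits(std::uint64_t BitWidth) {
  return BitWidth;
}

// The store size rounds up to a whole number of bytes (assumed 8 bits each).
constexpr std::uint64_t typeStoreSizeInBits(std::uint64_t BitWidth) {
  return (BitWidth + 7) / 8 * 8;
}

// True when a store of the type writes no padding bits.
constexpr bool sizeEqualsStoreSize(std::uint64_t BitWidth) {
  return typeSizeInBits(BitWidth) == typeStoreSizeInBits(BitWidth);
}

static_assert(!sizeEqualsStoreSize(4), "<4 x i1>: 4 bits, 8 bits stored");
static_assert(!sizeEqualsStoreSize(20), "i20: 20 bits, 24 bits stored");
static_assert(sizeEqualsStoreSize(32), "i32: no padding bits");
```

The sketch only says how many padding bits exist; it deliberately says nothing about where they end up in memory, which is the part being discussed here.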

Here is an example using "opt -O3" that shows differences between little/big endian, and also that opt isn't able to simplify your example "src2" with `<4 x i1>`:
https://godbolt.org/z/rvMxd1

Another way to see it is that you may bitcast <4 x i1> to i4 (from one first-class non-aggregate type to another of the same size), but you can't bitcast i4 to i8 (and bitcast is basically defined as a store using the source type followed by a load using the destination type).
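
As an illustration of the endianness dependence, the following sketch models the i4 value produced by `bitcast <4 x i1> %v to i4`. It assumes element 0 maps to the least significant bit on little endian and to the most significant bit on big endian. The function name is hypothetical, and the model covers only the IR-level value, not where padding bits land when that i4 is stored.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model: returns the i4 result in the low four bits of a byte.
constexpr std::uint8_t bitcastV4I1ToI4(const std::array<bool, 4> &Lanes,
                                       bool BigEndian) {
  std::uint8_t Result = 0;
  for (unsigned I = 0; I < 4; ++I) {
    // Assumed mapping: lane 0 is the LSB on little endian, the MSB on big endian.
    unsigned Bit = BigEndian ? 3 - I : I;
    if (Lanes[I])
      Result |= static_cast<std::uint8_t>(1u << Bit);
  }
  return Result;
}

// Same vector <i1 1, i1 0, i1 0, i1 0>, different i4 value per endianness.
static_assert(bitcastV4I1ToI4(std::array<bool, 4>{{true, false, false, false}},
                              /*BigEndian=*/false) == 0x1, "little endian");
static_assert(bitcastV4I1ToI4(std::array<bool, 4>{{true, false, false, false}},
                              /*BigEndian=*/true) == 0x8, "big endian");
```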

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94964/new/

https://reviews.llvm.org/D94964