[PATCH] D98114: [Loads] Forward constant vector store to load of first element

Sat Apr 3 09:23:27 PDT 2021

nikic added inline comments.

================
Comment at: llvm/test/Transforms/InstCombine/load-store-forward.ll:33

+define i32 @vec_store_load_first_constexpr(i32* %p) {
+; CHECK-LABEL: @vec_store_load_first_constexpr(
----------------
bjope wrote:
> nikic wrote:
> > bjope wrote:
> > > Maybe we should have some test cases with types when DataLayout::typeSizeEqualsStoreSize returns false. I actually think the result would be wrong here if for example using i4 (or i13) instead of i32, because we don't really know which bits that are loaded.
> > > 
> > > So there might be a bug in ConstantFoldLoadThroughBitcast, that it does not bail out if typeSizeEqualsStoreSize is false for the DestTy, that would be exposed by adding such test cases here.
> > > 
> > > (Maybe even typeAllocSize is of importance and not only typeStoreSize, or can we ignore alignment for the scalar data type?)
> > I've added a test in https://github.com/llvm/llvm-project/commit/1470f94d71c544327f76b85c55cb6f7cb43a6cbb, does that match what you had in mind?
> > 
> > After reading through https://llvm.org/docs/LangRef.html#vector-type I //think// the result is fine, but the description there is certainly confusing. For little-endian it should certainly be fine, and for big endian the result is inverted twice (once per the described element order and then again through endianness), so should also come out to the same result.
> As someone has decided that the vectors are bitpacked, there is no individual alignment/padding for the elements in the datalayout, then I figure <2 x i17> must have the same alignment/padding as an i34. This because we allow bitcasting between such types, and bitcast is defined as storing using one type followed by a load with the other type..
> 
> However, it is not well-defined exactly where the padding goes when storing an type such as i34. Looking at https://llvm.org/docs/LangRef.html#load-instruction it says "When loading a value of a type like i20 with a size that is not an integral number of bytes, the result is undefined if the value was not originally written using a store of the same type. ".  I assume that rule applies to storing a vector and loading a difference sized and non byte-sized scalar as well.
> 
> I don't think the optimization should trigger for the test case added in https://github.com/llvm/llvm-project/commit/1470f94d71c544327f76b85c55cb6f7cb43a6cbb
Well, if you want to interpret that load wording strictly, the `load i17` would simply be UB in this case, because the value was not stored with `store i17`, and it's okay to optimize it arbitrarily. The load spec also renders doing perfectly sensible things like a load after a memset UB. I'm pretty sure that was not the intention. Probably this should be changed to specify the actual behavior of whatever SDAG legalization does.

I don't plan to pursue this issue myself though. If there is an issue here (which I'm not convinced of), then it is a pre-existing one, as I'm just reusing a generic helper.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98114/new/

https://reviews.llvm.org/D98114