[PATCH] D108885: [Delinerization] Keep array element byte offset.

Tue Aug 31 21:29:42 PDT 2021

Meinersbur added a comment.

In D108885#2975622 <https://reviews.llvm.org/D108885#2975622>, @bmahjour wrote:

> The Size array includes the element size, so it already makes up for the lack of a value for the outermost dimension.

That's the issue. It is contained in the Size array, but its subscript (which ideally would be between 0 <= subscript < EltSize), unlike for any other size, is dropped.

> Instead of treating the element offset as a subscript, could we make `computeAccessFunctions` return the offset (if non-zero) in a separate output parameter?

This would make handling of the element offset more complicated and error prone, while there is no inherent difference between a array dimension of constant size and the byte offset. As illustrated by this bug and the fact that DependenceAnalysis's subscript checking also just works for the array element offset.

> Or maybe we should consider delinearization unsuccessful when there is a non-zero remainder after the last real subscript is already computed.

Would be possibility, and was already done by computeAccessFunctions if the outermost operation was a SCEVAddRecExpr justified by being "too complicated". How can it be "too complicated" if it is just forgotten about?
It would also make make it impossible do use delinearization for array-of-structs. I would leave that decision what to support to the caller. At least for Polly I was considering supporting array-of-structs.

================
Comment at: llvm/lib/Analysis/DependenceAnalysis.cpp:3454
   // Fail when there is only a subscript: that's a linearized access function.
-  if (SrcSubscripts.size() < 2 || DstSubscripts.size() < 2 ||
+  if (SrcSubscripts.size() <= 2 || DstSubscripts.size() <= 2 ||
       SrcSubscripts.size() != DstSubscripts.size())
----------------
bmahjour wrote:
> I'm worried that having an extra subscript could confuse the analysis. Each separable subscript typically corresponds to a distinct loop level (unless the subscript is ZIV), but if we include the element offset as a "subscript" it would normally not have any corresponding loop levels.
Changed this to pop the last element before continuing unless `isZero()` is false and then bail out, like already done for Polly and LoopCacheAnalysis. We can check later how DA had to be changed to support this. At least the DelinearizationChecks below would just work and I would assume that DA works find as long as the accessed memory falls completely into the elements byte change, which is verified by the Delinearization check.

================
Comment at: llvm/test/Analysis/Delinearization/byte_offset.ll:10
+; CHECK: ArrayRef[0][{(6 + (4 * (sext i16 %i401 to i64))<nsw>)<nsw>,+,1}<%for.body.i4567>][((sext i32 %mul7.i4534 to i64) + {(sext i32 %i1 to i64),+,((sext i32 (1 + ((1 + %shl.i.i) * (1 + %shl.i.i)) + %shl.i.i) to i64) * (sext i32 %i1 to i64))}<%for.body703>)]
+
+%struct.S1 = type { %struct.S1*, i8*, i16, i16, i16, i16 }
----------------
bmahjour wrote:
> could we have the psudo-c for this IR added here as a comment?
Not sure how useful you find these manually-decompiled IR that come from llvm-reduce, but here it is.

You can also find the origin here: https://github.com/blender/blender/blob/765b842f9520843183bf0a3cdcd071f152bbbf9e/source/blender/blenkernel/intern/CCGSubSurf.c#L2131

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108885/new/

https://reviews.llvm.org/D108885