[PATCH] D108885: [Delinerization] Keep array element byte offset.

Wed Sep 1 10:13:39 PDT 2021

bmahjour added a comment.

In D108885#2976089 <https://reviews.llvm.org/D108885#2976089>, @Meinersbur wrote:

> In D108885#2975622 <https://reviews.llvm.org/D108885#2975622>, @bmahjour wrote:
>
>> Instead of treating the element offset as a subscript, could we make `computeAccessFunctions` return the offset (if non-zero) in a separate output parameter?
>
> This would make handling of the element offset more complicated and error prone, while there is no inherent difference between a array dimension of constant size and the byte offset. As illustrated by this bug and the fact that DependenceAnalysis's subscript checking also just works for the array element offset.

The way "element offset" is being handled in this patch is to isolate it (from the rest of the subscripts) and ignore it everywhere except for `ScopDetection.cpp` where it is still isolated but used to decide if the access is affine. If it needs to be isolated to be handled, then I don't see how returning it as a separate parameter makes it more complicated. The original goal of the delinearization algorithm is to recover source level subscripts. While the "element offset" can be thought of as a byte index into an imaginary inner-most dimension, it is not something that corresponds to a source level array subscript.

>> Or maybe we should consider delinearization unsuccessful when there is a non-zero remainder after the last real subscript is already computed.
>
> Would be possibility, and was already done by computeAccessFunctions if the outermost operation was a SCEVAddRecExpr justified by being "too complicated". How can it be "too complicated" if it is just forgotten about?
> It would also make make it impossible do use delinearization for array-of-structs. I would leave that decision what to support to the caller. At least for Polly I was considering supporting array-of-structs.

Yeah, I'm not sure why it was considered "too complicated" only when the remainder was an AddRec. The paper <https://dl.acm.org/doi/10.1145/2751205.2751248> says that the original polynomial expression (representing the linearized access function) is first divided by the element size, but they don't say what it means if there is a remainder and what to do with it. I would assume that having an access function that doesn't evenly divide the element size cannot be safely delinearized.

I tried a simple example and it seems that the "element offset" is zero regardless of which member of a structure is being accessed, so not sure if it has any bearing on the ability to delinearize arrays of structs:

  > cat delin_struct.c
  struct MyS {
    float a, b;
  };

  void foo(int n, int m, struct MyS f1[][n][m]) {
    for (int i = 0; i < 1024; i++)
      for (int j = 0; j < n; j++)
        for (int k = 0; k < m; k++)
          f1[i][j][k].a = f1[i][j][k].b;
  }

  opt -passes='print<delinearization>' -disable-output delin_struct.simp.ll 2>&1 | grep ArrayRef
  ArrayRef[{0,+,2}<%for.body>][{0,+,2}<%for.body4>][{1,+,2}<%for.body8>][0]
  ArrayRef[{0,+,2}<%for.body>][{2,+,2}<%for.body4>][-1][0]
  ArrayRef[0][{(-1 + (2 * (zext i32 %n to i64) * (zext i32 %m to i64))),+,(2 * (zext i32 %n to i64) * (zext i32 %m to i64))}<%for.body>][0]
  ArrayRef[{0,+,2}<%for.body>][{0,+,2}<%for.body4>][{0,+,2}<%for.body8>][0]
  ArrayRef[{0,+,2}<%for.body>][{2,+,2}<%for.body4>][-2][0]
  ArrayRef[0][{(-2 + (2 * (zext i32 %n to i64) * (zext i32 %m to i64))),+,(2 * (zext i32 %n to i64) * (zext i32 %m to i64))}<%for.body>][0]

Note that the subscript in the last dimension (assumed byte array) is 0! IR attached: F18788796: delin_struct.simp.ll <https://reviews.llvm.org/F18788796>

================
Comment at: llvm/lib/Analysis/DependenceAnalysis.cpp:3453

+  // computeAccessFunctions also returns the by offset into the array element,
+  // which DependenceAnalysis does not handle. Pop that last subscript and bail
----------------
[nit] the by offset -> the byte offset

================
Comment at: llvm/test/Analysis/Delinearization/byte_offset.ll:10
+; CHECK: ArrayRef[0][{(6 + (4 * (sext i16 %i401 to i64))<nsw>)<nsw>,+,1}<%for.body.i4567>][((sext i32 %mul7.i4534 to i64) + {(sext i32 %i1 to i64),+,((sext i32 (1 + ((1 + %shl.i.i) * (1 + %shl.i.i)) + %shl.i.i) to i64) * (sext i32 %i1 to i64))}<%for.body703>)]
+
+%struct.S1 = type { %struct.S1*, i8*, i16, i16, i16, i16 }
----------------
Meinersbur wrote:
> bmahjour wrote:
> > could we have the psudo-c for this IR added here as a comment?
> Not sure how useful you find these manually-decompiled IR that come from llvm-reduce, but here it is.
> 
> You can also find the origin here: https://github.com/blender/blender/blob/765b842f9520843183bf0a3cdcd071f152bbbf9e/source/blender/blenkernel/intern/CCGSubSurf.c#L2131
> 
I haven't looked at the original example yet, but this looks like a case where delinearization should be expected to fail. The recovered subscripts above don't look anything like what the source level subscripts are.

================
Comment at: polly/lib/Analysis/ScopDetection.cpp:1029
+          const SCEV *AccOffset = Acc->DelinearizedSubscripts.pop_back_val();
+          const SCEV *ArrSize = Shape->DelinearizedSizes.back();
+          const SCEV *AccSize = Context.ElementSize[BasePointer];
----------------
[nit] `ElementSize` may be a better name than `ArrSize`?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108885/new/

https://reviews.llvm.org/D108885