[PATCH] D50644: [WIP] [LAA] Allow runtime checks when strides different but address space does not wrap around

Anna Thomas via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 17 10:26:04 PDT 2018


anna added a comment.

In https://reviews.llvm.org/D50644#1204214, @sbaranga wrote:

> Hi Anna,
>
> If the distance between the source and the sink is loop invariant, how can these have different strides (unless they are also loop invariant)? Could you give an example?
>
> Thanks,
> Silviu


Hi Silviu,
I could not get a test case for this specific patch, but let me start with the original motivation.

I will reduce the test case. The debug output shows something like this:

  Pointer access with non-constant stride
  LAA: Bad stride - Not an AddRecExpr pointer   %tmp97 = getelementptr inbounds i8, i8 addrspace(1)* %tmp89, i64 62 SCEV: (62 + %tmp89)<nsw>
  LAA: Bad stride - Not an AddRecExpr pointer   %tmp98 = getelementptr inbounds i8, i8 addrspace(1)* %tmp89, i64 66 SCEV: (66 + %tmp89)<nsw>
  LAA: Src Scev: (62 + %tmp89)<nsw>Sink Scev: (66 + %tmp89)<nsw>(Induction step: 0)
  LAA: Distance for   store i8 1, i8 addrspace(1)* %tmp97, align 2, !tbaa !85, !azul-base-pointer-java-type !33 to   store i8 0, i8 addrspace(1)* %tmp98, align 1, !tbaa !86, !azul-base-pointer-java-type !33: 4

%tmp89 is a header phi which is loop-varying.
Neither pointer is an AddRecExpr, so no stride could be computed for either access, but the distance between the source and sink SCEVs is constant (4 bytes).
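
To make the pattern concrete, here is a minimal C++ sketch of a loop with this general shape. It is only a hypothetical illustration and not the actual (still unreduced) test case: the `Obj` layout, the `next` field, and the names `markAll` and `storeDistance` are assumptions made for the example. The base pointer is a loop-carried value that changes unpredictably on each iteration, so neither store address advances by a fixed stride, yet the two stores are always exactly 4 bytes apart.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical object layout, assumed for illustration only.
struct Obj {
  Obj *next = nullptr;
  std::uint8_t bytes[80] = {};
};

// `p` plays the role of the loop-varying header phi (%tmp89 above):
// it is updated by pointer chasing, so it has no constant stride.
void markAll(Obj *head) {
  for (Obj *p = head; p != nullptr; p = p->next) {
    p->bytes[62] = 1; // source store, at offset 62 from the varying base
    p->bytes[66] = 0; // sink store, at offset 66 from the same base
  }
}

// Distance between the sink and source addresses for one object.
// It does not depend on where the object lives in memory.
std::ptrdiff_t storeDistance(const Obj &o) {
  return &o.bytes[66] - &o.bytes[62];
}
```

Whatever order the objects are linked in, `storeDistance` yields 4, which is the loop-invariant distance reported in the debug output.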

I initially had a patch like this which allowed vectorizing the loop in question with runtime checks:

  const SCEVConstant *C = dyn_cast<SCEVConstant>(Dist);
  if (!StrideAPtr || !StrideBPtr || StrideAPtr != StrideBPtr) {
    if (C || PSE.getSE()->isLoopInvariant(Dist, InnermostLoop))
      ShouldRetryWithRuntimeCheck = true;
    return Dependence::Unknown;
  }

However, that was failing a test case in pointer-with-unknown-bounds.ll:

  ; for (i = 0; i < 20; ++i)
  ;   A[i*i] *= 2;

With that patch, we were saying the loop is legal with runtime checks and generating a vector loop. I can see this being a vector gather and scatter once we prove that i*i doesn't overflow.
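
For reference, the reason A[i*i] has no constant stride is visible from the index steps between consecutive iterations: they are 2*i + 1, which grows with i. A small C++ sketch of that arithmetic follows; the helper name `indexSteps` is made up for illustration and is not part of LAA or the test.

```cpp
#include <cassert>
#include <vector>

// Returns the step between the indices i*i and (i+1)*(i+1) for each pair
// of consecutive iterations of a loop running i = 0 .. n-1.
// Hypothetical helper, for illustration only.
std::vector<long> indexSteps(long n) {
  assert(n >= 1 && "need at least one iteration");
  std::vector<long> steps;
  for (long i = 0; i + 1 < n; ++i)
    steps.push_back((i + 1) * (i + 1) - i * i); // equals 2*i + 1
  return steps;
}
```

For n = 20 the steps run 1, 3, 5, and so on up to 37, so no single constant stride describes the access.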

I don't understand why we should care about the pointer being an AddRecExpr for the vectorizer, though, because the vectorizer has its own SCEV overflow checks before deciding on vectorizing (and generating a wide load). What am I missing here?

Thanks,
Anna


Repository:
  rL LLVM

https://reviews.llvm.org/D50644




