[PATCH] D50644: [WIP] [LAA] Allow runtime checks when strides different but address space does not wrap around
Anna Thomas via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 17 10:26:04 PDT 2018
anna added a comment.
In https://reviews.llvm.org/D50644#1204214, @sbaranga wrote:
> Hi Anna,
>
> If the distance between the source and the sink is loop invariant, how can these have different strides (unless they are also loop invariant)? Could you give an example?
>
> Thanks,
> Silviu
Hi Silviu,
I could not come up with a test case for this specific patch, but let me start with the original motivation.
I will reduce the original test case. The debug output shows something like this:
Pointer access with non-constant stride
LAA: Bad stride - Not an AddRecExpr pointer %tmp97 = getelementptr inbounds i8, i8 addrspace(1)* %tmp89, i64 62 SCEV: (62 + %tmp89)<nsw>
LAA: Bad stride - Not an AddRecExpr pointer %tmp98 = getelementptr inbounds i8, i8 addrspace(1)* %tmp89, i64 66 SCEV: (66 + %tmp89)<nsw>
LAA: Src Scev: (62 + %tmp89)<nsw>Sink Scev: (66 + %tmp89)<nsw>(Induction step: 0)
LAA: Distance for store i8 1, i8 addrspace(1)* %tmp97, align 2, !tbaa !85, !azul-base-pointer-java-type !33 to store i8 0, i8 addrspace(1)* %tmp98, align 1, !tbaa !86, !azul-base-pointer-java-type !33: 4
%tmp89 is a header phi, so it is loop-varying.
Neither pointer is an AddRecExpr, so no stride can be computed for either access, but the distance between the source and sink SCEVs is constant (4 in the output above).
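To make that situation concrete, here is a minimal standalone sketch in plain C++ with no LLVM APIs. The names `AccessPair`, `collectAccesses`, `distanceIsConstant` and `sourceHasConstantStride` are hypothetical and exist only for illustration. The `bases` vector stands in for the successive values of %tmp89, which I assume do not follow an affine recurrence.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the two stores in the loop body: both are addressed
// off the same loop-varying base (the role played by %tmp89).
struct AccessPair {
  std::ptrdiff_t src;  // offset written by the first store:  base + 62
  std::ptrdiff_t sink; // offset written by the second store: base + 66
};

// Record the source and sink offsets for every value the base takes.
std::vector<AccessPair> collectAccesses(const std::vector<std::ptrdiff_t> &bases) {
  std::vector<AccessPair> out;
  for (std::ptrdiff_t base : bases)
    out.push_back({base + 62, base + 66});
  return out;
}

// True if sink - src equals `expected` in every iteration.
bool distanceIsConstant(const std::vector<AccessPair> &accesses,
                        std::ptrdiff_t expected) {
  for (const AccessPair &a : accesses)
    if (a.sink - a.src != expected)
      return false;
  return true;
}

// True if the source offset advances by the same amount every iteration,
// i.e. it behaves like an affine recurrence with a constant step.
bool sourceHasConstantStride(const std::vector<AccessPair> &accesses) {
  if (accesses.size() < 3)
    return true;
  const std::ptrdiff_t step = accesses[1].src - accesses[0].src;
  for (std::size_t k = 2; k < accesses.size(); ++k)
    if (accesses[k].src - accesses[k - 1].src != step)
      return false;
  return true;
}
```

With an irregular sequence of bases such as 0, 100, 130, 500, the source has no constant stride, yet the source-to-sink distance is 4 in every iteration. That is the shape of the case in the debug output above.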
I initially had a patch like this which allowed vectorizing the loop in question with runtime checks:
  const SCEVConstant *C = dyn_cast<SCEVConstant>(Dist);
  if (!StrideAPtr || !StrideBPtr || StrideAPtr != StrideBPtr) {
    if (C || PSE.getSE()->isLoopInvariant(Dist, InnermostLoop))
      ShouldRetryWithRuntimeCheck = true;
    return Dependence::Unknown;
  }
However, that was failing a test case in pointer-with-unknown-bounds.ll:
; for (i = 0; i < 20; ++i)
; A[i*i] *= 2;
(With that patch we reported the loop as legal with runtime checks and generated a vector loop. I can see this becoming a vector gather and scatter once we prove that i*i doesn't overflow.)
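For reference, here is a small standalone C++ sketch of why that loop does not have a constant stride. The helper names `squaredIndices`, `consecutiveStrides` and `doubleSquaredElements` are hypothetical and are not taken from the test file. The sketch only models the index arithmetic of `A[i*i]`, not what LAA computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Indices touched by `A[i*i]` for i in [0, tripCount).
std::vector<std::size_t> squaredIndices(std::size_t tripCount) {
  std::vector<std::size_t> idx;
  for (std::size_t i = 0; i < tripCount; ++i)
    idx.push_back(i * i);
  return idx;
}

// Difference between the indices of consecutive iterations. For i*i this is
// 2*i + 1, which grows every iteration, so the access is not a unit-stride
// (or any constant-stride) pattern.
std::vector<std::size_t> consecutiveStrides(const std::vector<std::size_t> &idx) {
  std::vector<std::size_t> strides;
  for (std::size_t k = 1; k < idx.size(); ++k)
    strides.push_back(idx[k] - idx[k - 1]);
  return strides;
}

// Scalar form of the loop from the test case comment.
void doubleSquaredElements(std::vector<int> &A, std::size_t tripCount) {
  for (std::size_t i = 0; i < tripCount; ++i) {
    assert(i * i < A.size() && "index must stay in bounds");
    A[i * i] *= 2;
  }
}
```

Because the step between consecutive indices changes every iteration, a vectorized form of this loop cannot use one contiguous wide load and store. It would need per-lane addresses, which is the gather/scatter shape mentioned above.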
I don't understand why we should care about the pointer being an AddRecExpr for the vectorizer, though, because the vectorizer has its own SCEV overflow checks before deciding to vectorize (and generate a wide load). What am I missing here?
Thanks,
Anna
Repository:
rL LLVM
https://reviews.llvm.org/D50644