[PATCH] D98263: [LoopInterchange] fix tightlyNested() in LoopInterchange legality
Congzhe Cao via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 10 22:51:38 PST 2021
congzhe added a comment.
In D98263#2618403 <https://reviews.llvm.org/D98263#2618403>, @TaWeiTu wrote:
> Hi @congzhe, thanks for the fix!
>
> I think it'd probably be better to keep the `LoopNest::checkLoopsStructure` and add the additional checks of branch instructions for now.
> By doing so, future changes to `LoopNest` will also apply in `LoopInterchange`, which is what we are trying to achieve I think (by making the definition of perfectly-nested loop in sync).
> The check can be removed in the future, when `LoopNest` is eventually taught to correctly identify guard branches (`singleSucc` is currently considered to be the guard branch of the inner loop, but the loops should only be considered perfectly-nested if the guard branch corresponds to the one added by `LoopRotate` IMO, which is not true in this case).
>
> What do you think?
Thanks for the reply! I see your point.
I am just not entirely sure whether we should keep `LoopNest::checkLoopsStructure()` in `tightlyNested()` in loop interchange. As I described above, `LoopNest::checkLoopsStructure()` does comprehensive checks and some of the scenarios that it considered may not be applicable to loop interchange, because the transformation phase in loop interchange is not comprehensive enough to handle all those scenarios. The scenario with the inner loop guard issue as I added in this patch is only one of those, I can think of other scenarios that can pass `LoopNest::checkLoopsStructure()` but fails loop interchange transformation phase. I have listed two of them below.
Therefore even we handle the inner loop guard appropriately, there's other scenarios that would fail if we do keep `LoopNest::checkLoopsStructure()`. Essentially I think from practical point of view, it does make sense that `tightlyNest()` is not entirely the same as `LoopNest::checkLoopsStructure()`, since `tightlyNest()` is specifically tailored for loop interchange hence has its own logic in checking the loop structure.
**Scenario 1**:
@A = common global [100 x [100 x i64]] zeroinitializer
@B = common global [100 x i64] zeroinitializer
define void @crash(i64 %k, i64 %N, i64 signext %ny) {
entry:
br label %for1.header
for1.header:
%j23 = phi i64 [ 0, %entry ], [ %j.next24, %for1.inc10 ]
%cmp21 = icmp slt i64 0, %ny
br i1 %cmp21, label %preheader.j, label %for.j.end
preheader.j:
br label %for2
for2:
%j = phi i64 [ %j.next, %for.j.inc ], [ 0, %preheader.j ]
%arrayidx5 = getelementptr inbounds [100 x [100 x i64]], [100 x [100 x i64]]* @A, i64 0, i64 %j, i64 %j23
%lv = load i64, i64* %arrayidx5
%add = add nsw i64 %lv, %k
store i64 %add, i64* %arrayidx5
br label %for.j.inc
for.j.inc:
%j.next = add nuw nsw i64 %j, 1
%exitcond = icmp eq i64 %j, 99
br i1 %exitcond, label %for2, label %for.j.end_crit_edge
for.j.end_crit_edge:
%split = phi i64 [ %add, %for.j.inc ]
br label %for.j.end
for.j.end:
;%res.1.lcssa = phi i64 [ %split, %for.j.end_crit_edge ], [ %k, %for1.header ]
br label %for1.inc10
for1.inc10:
%j.next24 = add nuw nsw i64 %j23, 1
%exitcond26 = icmp eq i64 %j23, 99
br i1 %exitcond26, label %for.end12, label %for1.header
for.end12:
ret void
}
**Scenario 2**
@A = common global [100 x [100 x i64]] zeroinitializer
@B = common global [100 x i64] zeroinitializer
define void @crash(i64 %k, i64 %N, i64 signext %ny) {
entry:
br label %for1.header
for1.header:
%j23 = phi i64 [ 0, %entry ], [ %j.next24, %for1.inc10 ]
br label %singleSucc
singleSucc:
br label %preheader.j
preheader.j:
br label %preheader.j2
preheader.j2:
br label %preheader.j3
preheader.j3:
br label %for2
for2:
%j = phi i64 [ %j.next, %for2 ], [ 0, %preheader.j3 ]
%arrayidx5 = getelementptr inbounds [100 x [100 x i64]], [100 x [100 x i64]]* @A, i64 0, i64 %j, i64 %j23
%lv = load i64, i64* %arrayidx5
%add = add nsw i64 %lv, %k
store i64 %add, i64* %arrayidx5
%j.next = add nuw nsw i64 %j, 1
%exitcond = icmp eq i64 %j, 99
br i1 %exitcond, label %for1.inc10, label %for2
for1.inc10:
%j.next24 = add nuw nsw i64 %j23, 1
%exitcond26 = icmp eq i64 %j23, 99
br i1 %exitcond26, label %for.end12, label %for1.header
for.end12:
ret void
}
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D98263/new/
https://reviews.llvm.org/D98263
More information about the llvm-commits
mailing list