[PATCH] D98263: [LoopInterchange] fix tightlyNested() in LoopInterchange legality

Congzhe Cao via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 10 22:51:38 PST 2021


congzhe added a comment.

In D98263#2618403 <https://reviews.llvm.org/D98263#2618403>, @TaWeiTu wrote:

> Hi @congzhe, thanks for the fix!
>
> I think it'd probably be better to keep the `LoopNest::checkLoopsStructure` and add the additional checks of branch instructions for now.
> By doing so, future changes to `LoopNest` will also apply in `LoopInterchange`, which is what we are trying to achieve I think (by making the definition of perfectly-nested loop in sync).
> The check can be removed in the future, when `LoopNest` is eventually taught to correctly identify guard branches (`singleSucc` is currently considered to be the guard branch of the inner loop, but the loops should only be considered perfectly-nested if the guard branch corresponds to the one added by `LoopRotate` IMO, which is not true in this case).
>
> What do you think?

Thanks for the reply! I see your point.

I am just not entirely sure whether we should keep `LoopNest::checkLoopsStructure()`  in `tightlyNested()` in loop interchange. As I described above, `LoopNest::checkLoopsStructure()` does comprehensive checks and some of the scenarios that it considered may not be applicable to loop interchange, because the transformation phase in loop interchange is not comprehensive enough to handle all those scenarios. The scenario with the inner loop guard issue as I added in this patch is only one of those, I can think of other scenarios that can pass `LoopNest::checkLoopsStructure()` but fails loop interchange transformation phase. I have listed two of them below.

Therefore even we handle the inner loop guard appropriately, there's other scenarios that would fail if we do keep `LoopNest::checkLoopsStructure()`. Essentially I think from practical point of view, it does make sense that `tightlyNest()` is not entirely the same as `LoopNest::checkLoopsStructure()`, since `tightlyNest()` is specifically tailored for loop interchange hence has its own logic in checking the loop structure.

**Scenario 1**:

  @A = common global [100 x [100 x i64]] zeroinitializer
  @B = common global [100 x i64] zeroinitializer
  
  define void @crash(i64 %k, i64 %N, i64 signext %ny) {
  entry:
    br label %for1.header
  
  for1.header:
    %j23 = phi i64 [ 0, %entry ], [ %j.next24, %for1.inc10 ]
    %cmp21 = icmp slt i64 0, %ny
    br i1 %cmp21, label %preheader.j, label %for.j.end
  
  preheader.j:  
    br label %for2
  
  for2:
    %j = phi i64 [ %j.next, %for.j.inc ], [ 0, %preheader.j ]
    %arrayidx5 = getelementptr inbounds [100 x [100 x i64]], [100 x [100 x i64]]* @A, i64 0, i64 %j, i64 %j23
    %lv = load i64, i64* %arrayidx5
    %add = add nsw i64 %lv, %k
    store i64 %add, i64* %arrayidx5
    br label %for.j.inc
  
  for.j.inc:                                       
    %j.next = add nuw nsw i64 %j, 1
    %exitcond = icmp eq i64 %j, 99
    br i1 %exitcond, label %for2, label %for.j.end_crit_edge
  
  for.j.end_crit_edge:                              
    %split = phi i64 [ %add, %for.j.inc ]
    br label %for.j.end
  
  for.j.end:                                        
    ;%res.1.lcssa = phi i64 [ %split, %for.j.end_crit_edge ], [ %k, %for1.header ]
    br label %for1.inc10
  
  for1.inc10:
    %j.next24 = add nuw nsw i64 %j23, 1
    %exitcond26 = icmp eq i64 %j23, 99
    br i1 %exitcond26, label %for.end12, label %for1.header
  
  for.end12:
    ret void
  }

**Scenario 2**

  @A = common global [100 x [100 x i64]] zeroinitializer
  @B = common global [100 x i64] zeroinitializer
  
  define void @crash(i64 %k, i64 %N, i64 signext %ny) {
  entry:
    br label %for1.header
  
  for1.header:
    %j23 = phi i64 [ 0, %entry ], [ %j.next24, %for1.inc10 ]    
    br label %singleSucc
  
  singleSucc:
    br label %preheader.j
  
  preheader.j:  
    br label %preheader.j2
  
  preheader.j2:  
    br label %preheader.j3
  
  preheader.j3:  
    br label %for2
  
  for2:
    %j = phi i64 [ %j.next, %for2 ], [ 0, %preheader.j3 ]
    %arrayidx5 = getelementptr inbounds [100 x [100 x i64]], [100 x [100 x i64]]* @A, i64 0, i64 %j, i64 %j23
    %lv = load i64, i64* %arrayidx5
    %add = add nsw i64 %lv, %k
    store i64 %add, i64* %arrayidx5
    %j.next = add nuw nsw i64 %j, 1
    %exitcond = icmp eq i64 %j, 99
    br i1 %exitcond, label %for1.inc10, label %for2
  
  for1.inc10:
    %j.next24 = add nuw nsw i64 %j23, 1
    %exitcond26 = icmp eq i64 %j23, 99
    br i1 %exitcond26, label %for.end12, label %for1.header
  
  for.end12:
    ret void
  }


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98263/new/

https://reviews.llvm.org/D98263



More information about the llvm-commits mailing list