[PATCH] D94717: [LoopNest] Consider loop nest with inner loop guard using outer loop induction variable to be perfect

Bardia Mahjour via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 3 15:09:03 PDT 2021


bmahjour added a comment.

In D94717#2730948 <https://reviews.llvm.org/D94717#2730948>, @Whitney wrote:

> In D94717#2730582 <https://reviews.llvm.org/D94717#2730582>, @bmahjour wrote:
>
>> I think this changes the semantics for the cases that we used to handle before. For example wouldn't this require the exit successors to be empty? If so I think lcssa phis can prevent us from detecting guards. We need to test for those cases as well.
>
> LCSSA phis should be in the loop exit block, not exit successors.

Consider the following function:

  int foo(char * restrict aa, int N) {
    int sum = 0;
    for (int i = 0; i < N; i++) {
      sum += aa[i];
    }
    return sum;
  }

The corresponding IR would be:

  define dso_local signext i32 @foo(i8* noalias %aa, i32 signext %N) #0 {
  entry:
    %cmp1 = icmp sgt i32 %N, 0
    br i1 %cmp1, label %for.body.preheader, label %for.end
  
  for.body.preheader:                               ; preds = %entry
    %wide.trip.count = zext i32 %N to i64
    br label %for.body
  
  for.body:                                         ; preds = %for.body.preheader, %for.body
    %indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
    %sum.02 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
    %arrayidx = getelementptr inbounds i8, i8* %aa, i64 %indvars.iv
    %0 = load i8, i8* %arrayidx, align 1, !tbaa !2
    %conv = zext i8 %0 to i32
    %add = add nuw nsw i32 %sum.02, %conv
    %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
    %exitcond = icmp ne i64 %indvars.iv.next, %wide.trip.count
    br i1 %exitcond, label %for.body, label %for.end.loopexit, !llvm.loop !5
  
  for.end.loopexit:                                 ; preds = %for.body
    %add.lcssa = phi i32 [ %add, %for.body ]
    br label %for.end
  
  for.end:                                          ; preds = %for.end.loopexit, %entry
    %sum.0.lcssa = phi i32 [ 0, %entry ], [ %add.lcssa, %for.end.loopexit ]
    ret i32 %sum.0.lcssa
  }

As you can see, `for.end` (the successor of the exit block `for.end.loopexit`) is not empty because it contains the phi `%sum.0.lcssa`, so this loop's guard won't be detected!
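
To make the concern concrete, here is a minimal toy sketch of the two kinds of emptiness test. It does not use LLVM's API: `ToyBlock`, `isEmptyStrict`, `isEmptyIgnoringPhis`, `makeForEnd` and `checkForEnd` are hypothetical names invented for illustration, and the relaxed test is only one possible way to tolerate phis, not necessarily what the patch does or should do.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a basic block; this is NOT llvm::BasicBlock.
struct ToyBlock {
  std::string name;
  std::vector<std::string> insts; // phis first, terminator last
};

// Hypothetical strict test: the block holds only its terminator.
bool isEmptyStrict(const ToyBlock &bb) { return bb.insts.size() == 1; }

// Hypothetical relaxed test: phi nodes are ignored when counting.
bool isEmptyIgnoringPhis(const ToyBlock &bb) {
  std::size_t nonPhi = 0;
  for (const std::string &inst : bb.insts)
    if (inst.find("= phi ") == std::string::npos)
      ++nonPhi;
  return nonPhi == 1;
}

// Builds the `for.end` block from the IR above.
ToyBlock makeForEnd() {
  ToyBlock bb;
  bb.name = "for.end";
  bb.insts.push_back(
      "%sum.0.lcssa = phi i32 [ 0, %entry ], [ %add.lcssa, %for.end.loopexit ]");
  bb.insts.push_back("ret i32 %sum.0.lcssa");
  return bb;
}

// Runs both predicates on `for.end`.
void checkForEnd() {
  const ToyBlock forEnd = makeForEnd();
  assert(!isEmptyStrict(forEnd));      // phi + ret: treated as non-empty
  assert(isEmptyIgnoringPhis(forEnd)); // only ret is counted: treated as empty
}
```

In this toy model the strict test rejects `for.end` purely because of `%sum.0.lcssa`, which is the situation described above, while a test that skips phis would still accept it.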


https://reviews.llvm.org/D94717