[PATCH] D118102: [LoopInterchange] Prevent interchange with unsafe control-flow divergence inside inner loops (PR48057)

Congzhe Cao via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 8 09:40:58 PST 2022


congzhe added a comment.

In D118102#3285218 <https://reviews.llvm.org/D118102#3285218>, @bmahjour wrote:

> Couldn't the same problem happen in theory without control flow divergence? For example consider a loop like this:
>
>   for (c = 0; c <= 7; c++) {
>     for (d = 4; d; d--)
>       e = ((b[d+2][c]) ? b[d][0] : e);
>   }
>
> where the ternary operator turns into a `select` instruction in LLVM IR.

Thanks for the comment! IIUC the source code you wrote would likely result in control-flow divergence as well, even though it is written in "select" form (a ternary) in the source. The inner loop would look like the following, where `e` is represented by the phi node `%cond`:

  for.body2:                                        ; preds = %for.cond1.preheader, %cond.end
    %indvars.iv = phi i64 [ 4, %for.cond1.preheader ], [ %indvars.iv.next, %cond.end ]
    %cond12 = phi i16 [ %cond1.lcssa57, %for.cond1.preheader ], [ %cond, %cond.end ]
    %4 = add nuw nsw i64 %indvars.iv, 2
    %arrayidx4 = getelementptr inbounds [8 x [8 x i8]], [8 x [8 x i8]]* @b, i64 0, i64 %4, i64 %indvars.iv9
    %5 = load i8, i8* %arrayidx4, align 1, !tbaa !9
    %tobool5.not = icmp eq i8 %5, 0
    br i1 %tobool5.not, label %cond.end, label %cond.true
  
  cond.true:                                        ; preds = %for.body2
    %arrayidx8 = getelementptr inbounds [8 x [8 x i8]], [8 x [8 x i8]]* @b, i64 0, i64 %indvars.iv, i64 0
    %6 = load i8, i8* %arrayidx8, align 8, !tbaa !9
    %conv9 = sext i8 %6 to i16
    br label %cond.end
  
  cond.end:                                         ; preds = %for.body2, %cond.true
    %cond = phi i16 [ %conv9, %cond.true ], [ %cond12, %for.body2 ]
    %indvars.iv.next = add nsw i64 %indvars.iv, -1
    %tobool.not = icmp eq i64 %indvars.iv.next, 0
    br i1 %tobool.not, label %for.inc12, label %for.body2, !llvm.loop !10

In addition, I manually changed the IR to use a `select` instruction instead of the branch, so that there is no divergence (essentially doing if-conversion by hand). In that case loop interchange bails out with "unable to recognize reduction or induction phis": `e` is a phi node and the operation on `e` looks like a reduction, but the "reduction operator" is the `select` instruction, so it is not a real reduction and loop interchange gives up. Hence I think this case is already under control.
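
For reference, the two forms of the inner loop can be sketched at the source level as below. This is only an illustration under assumptions: the contents of `b` and the function names `inner_branchy` and `inner_select` are made up for the example and do not come from the patch or the PR, and a C compiler is free to lower either function to branches or to a `select`. The point is only that the if-converted form performs the load of `b[d][0]` unconditionally and then picks between the loaded value and the old `e`.

```c
#include <assert.h>

/* Hypothetical contents; the review does not specify what b holds. */
static const signed char b[8][8] = {
    [1] = {10},
    [3] = {1},
    [4] = {40},
    [6] = {[7] = 1},
};

/* Branchy form: the load of b[d][0] only happens when the condition holds
   (this corresponds to the cond.true block in the IR above). */
short inner_branchy(int c, short e) {
    for (int d = 4; d; d--) {
        if (b[d + 2][c])
            e = b[d][0];
    }
    return e;
}

/* If-converted form: the load is unconditional and a select picks the
   result. Speculating the load is safe here because d stays in 1..4. */
short inner_select(int c, short e) {
    for (int d = 4; d; d--) {
        short loaded = b[d][0];
        e = b[d + 2][c] ? loaded : e;
    }
    return e;
}
```

Both functions compute the same value of `e` for every column `c`; they differ only in whether the inner loop body contains control flow.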

To summarize, I think we need to handle output dependences under control-flow dependence anyway. I'm looking forward to your thoughts.
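
To make the concern concrete, here is a small sketch of why a conditional write to `e` is order-sensitive. It is not the PR48057 reproducer: the array contents and the function names `run_original` and `run_interchanged` are assumptions chosen for illustration. The final value of `e` is whatever the last iteration with a true condition wrote, so swapping the loop order changes which iteration is "last".

```c
#include <assert.h>

/* Hypothetical contents, chosen so that the two loop orders disagree. */
static const signed char b[8][8] = {
    [1] = {10},
    [3] = {1},
    [4] = {40},
    [6] = {[7] = 1},
};

/* Original nest: c is the outer loop, d the inner loop. */
short run_original(short e) {
    for (int c = 0; c <= 7; c++)
        for (int d = 4; d; d--)
            e = b[d + 2][c] ? b[d][0] : e;
    return e;
}

/* Interchanged nest: d is the outer loop, c the inner loop. */
short run_interchanged(short e) {
    for (int d = 4; d; d--)
        for (int c = 0; c <= 7; c++)
            e = b[d + 2][c] ? b[d][0] : e;
    return e;
}
```

With this data, the last write in the original order comes from (c=7, d=4) and yields 40, while in the interchanged order it comes from (d=1, c=0) and yields 10.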


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118102/new/

https://reviews.llvm.org/D118102
