[PATCH] D134930: [LoopInterchange] Do not interchange when a reduction phi in all subloops of the outer loop is not recognizable

Tue Nov 1 13:31:07 PDT 2022

bmahjour added a comment.

The current output from loop interchange is wrong, but it could be corrected by moving the `vector.reduce` and the `store` to the `for.inc19.i` block, i.e.:

  middle.block:                                     ; preds = %for.inc19.i
    %inc17.i = add nuw nsw i16 %j.010.i, 1
    %exitcond12.not.i = icmp eq i16 %inc17.i, 4
    br i1 %exitcond12.not.i, label %test.exit, label %for.j

  for.inc19.i:                                      ; preds = %vector.body
    %.lcssa = phi <4 x i32> [ %16, %vector.body ]
    %18 = tail call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %.lcssa)
    store i32 %18, ptr %arrayidx14.i, align 1
    %inc20.i = add nuw nsw i16 %i.011.i, 1
    %exitcond13.not.i = icmp eq i16 %inc20.i, 2
    br i1 %exitcond13.not.i, label %middle.block, label %for.i

In the input IR, `middle.block` serves two purposes at once: (1) it is the j-loop latch, and (2) it is the epilogue of the `vector.body` loop (containing the `vector.reduce` and the `store`). LoopInterchange seems to want to keep `middle.block` as the j-loop latch in the interchanged IR, but it fails to separate the epilogue portion from the latch computation. Perhaps a better fix would be to migrate any intervening code that is not used for computing the latch condition from `middle.block` to the i-loop latch block.
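
To make the suggested split concrete, here is a rough sketch in Python rather than LLVM C++. It is a toy model, not LoopInterchange code. The `Inst` type and `split_latch_block` are hypothetical names, and the contents of `middle_block` are my reconstruction from the corrected IR above, not copied from the test input. It takes the backward slice of the block terminator as the latch computation and treats everything else as movable. A real implementation would also have to check side effects, memory dependences and dominance before moving anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    name: str        # value defined, or a label for stores/branches
    operands: tuple  # names of the values this instruction uses


def split_latch_block(block):
    """Split a block into (latch_slice, movable), preserving order.

    The last instruction is assumed to be the terminator. The latch
    slice is the terminator plus everything in this block that it
    transitively depends on; the rest is candidate code to migrate.
    """
    defined_here = {inst.name: inst for inst in block}
    needed = set()
    worklist = [block[-1]]
    while worklist:
        inst = worklist.pop()
        if inst.name in needed:
            continue
        needed.add(inst.name)
        for op in inst.operands:
            # Operands defined outside the block are ignored.
            if op in defined_here:
                worklist.append(defined_here[op])
    latch = [i for i in block if i.name in needed]
    movable = [i for i in block if i.name not in needed]
    return latch, movable


# Assumed shape of middle.block before interchange (reconstruction).
middle_block = [
    Inst("%.lcssa", ("%16",)),
    Inst("%18", ("%.lcssa",)),
    Inst("store", ("%18", "%arrayidx14.i")),
    Inst("%inc17.i", ("%j.010.i",)),
    Inst("%exitcond12.not.i", ("%inc17.i",)),
    Inst("br", ("%exitcond12.not.i",)),
]

latch, movable = split_latch_block(middle_block)
print("stay in middle.block:", [i.name for i in latch])
print("move to i-loop latch:", [i.name for i in movable])
```

In this model the increment, compare and branch stay in `middle.block`, while the LCSSA phi, the reduction and the store are the candidates to move to `for.inc19.i`, which matches the hand-corrected IR above.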

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134930/new/

https://reviews.llvm.org/D134930