[PATCH] D117580: [LoopVectorize] Support conditional in-loop vector reductions
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 4 06:38:52 PST 2022
david-arm added a comment.
Hi @kmclaughlin, this looks a lot better, thanks for taking the time to address my previous comments and add the exhaustive set of tests! I just had some minor comments, mostly about the tests.
================
Comment at: llvm/lib/Analysis/IVDescriptors.cpp:1064
ExpectedUses = 2;
+ unsigned ExtraPhiUses = 0;
----------------
nit: If you agree, it might be worth moving this variable down to the place it's used? i.e. just above
if (auto ExitPhi = dyn_cast<PHINode>(LoopExitInstr)) {
================
Comment at: llvm/lib/Analysis/IVDescriptors.cpp:1105
+ Instruction *Chain = nullptr;
+ if (Inc0 && Inc0 == Phi)
+ Chain = Inc1;
----------------
nit: I think you can probably simplify this a little with something like this:
if (Inc0 == Phi)
Chain = Inc1;
else if (Inc1 == Phi)
Chain = Inc0;
else
return {};
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-reduction-inloop-cond.ll:26
+; CHECK-NEXT: [[TMP8:%.*]] = bitcast float* [[TMP7]] to <vscale x 4 x float>*
+; CHECK-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <vscale x 4 x float> @llvm.masked.load.nxv4f32.p0nxv4f32(<vscale x 4 x float>* [[TMP8]], i32 4, <vscale x 4 x i1> [[TMP6]], <vscale x 4 x float> zeroinitializer)
+; CHECK-NEXT: [[TMP9]] = call fast float @llvm.vector.reduce.fadd.nxv4f32(float [[VEC_PHI]], <vscale x 4 x float> [[WIDE_MASKED_LOAD]])
----------------
I think this code looks right because the inactive lanes will be zero, which matches the identity value for an fadd. However, it might be clearer if you remove the `-dce -instcombine` flags so we can see the select? I assume the select has been folded away.
================
Comment at: llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll:234
+
+define i32 @multiple_cond_ands(i32* noalias %A, i32* noalias %B, i32 %cond, i64 noundef %N) #0 {
+; CHECK-LABEL: @multiple_cond_ands(
----------------
Hi @kmclaughlin, it doesn't look there are multiple conditional and instructions in the loop?
================
Comment at: llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll:518
+;
+define i64 @cond_and(i64* noalias nocapture readonly %a, i64* noalias nocapture readonly %b, i64* noalias nocapture readonly %cond, i64 %N){
+; CHECK-LABEL: @cond_and(
----------------
nit: Maybe a better name for this is something like `@nested_cond_and` to distinguish from the other cond_and test?
================
Comment at: llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll:526
+; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
+; CHECK: vector.body:
+; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_LOAD_CONTINUE14:%.*]] ]
----------------
Maybe for this negative test you don't need the CHECK lines here and perhaps something like this is sufficient?
; CHECK: vector.body
; CHECK-NOT: llvm.vector.reduce.and.v4i64
; CHECK: middle.block
; CHECK: llvm.vector.reduce.and.v4i64
; CHECK: scalar.ph
================
Comment at: llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll:689
+; Chain of conditional & unconditional reductions. We do not vectorise as the Phi (%rdx1) has more than
+; the expected number of uses.
+;
----------------
Perhaps worth adding a bit more here, i.e.
; Chain of conditional & unconditional reductions. We currently only support conditional reductions
; if they are the last in the chain, i.e. the loop exit instruction is a PHI node. Therefore, we reject the
; PHI (%rdx1) as it has more than one use.
Do you think that makes it a bit clearer?
================
Comment at: llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll:696
+; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
+; CHECK: vector.ph:
+; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[N]], -4
----------------
SImilar to the comment in `@cond_and` I don't think you need all the CHECK lines here.
================
Comment at: llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll:826
+; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
+; CHECK: vector.ph:
+; CHECK-NEXT: [[N_VEC:%.*]] = and i64 [[N]], -4
----------------
Same comment as `@cond_and` about the CHECK lines. :)
================
Comment at: llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll:1007
+; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
+; CHECK: vector.body:
+; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_LOAD_CONTINUE7:%.*]] ]
----------------
Same comment as `@cond_and` for the CHECK lines.
================
Comment at: llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll:1137
+; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
+; CHECK: vector.body:
+; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_LOAD_CONTINUE7:%.*]] ]
----------------
Same comment as `@cond_and` for the CHECK lines.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117580/new/
https://reviews.llvm.org/D117580
More information about the llvm-commits
mailing list