[PATCH] D84451: [LV] Tail folded inloop reductions.
Sjoerd Meijer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 7 02:25:12 PDT 2020
SjoerdMeijer added a comment.
Just some minor questions inline.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3996
// a Select choosing between the vectorized LoopExitInst and vectorized Phi,
// instead of the former.
+ if (Cost->foldTailByMasking() && !IsInLoopReductionPhi) {
----------------
nit: perhaps add a comment explaining why in-loop reductions are excluded here.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:157
+ (It->getVPRecipeID() == VPRecipeBase::VPWidenPHISC ||
+ It->getVPRecipeID() == VPRecipeBase::VPWidenIntOrFpInductionSC ||
+ It->getVPRecipeID() == VPRecipeBase::VPPredInstPHISC ||
----------------
Is that a Phi?
================
Comment at: llvm/test/Transforms/LoopVectorize/ARM/mve-gather-scatter-tailpred.ll:19
; CHECK-NEXT: [[TMP0:%.*]] = add i32 [[INDEX]], 0
+; CHECK-NEXT: [[ACTIVE_LANE_MASK:%.*]] = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 [[TMP0]], i32 [[N]])
; CHECK-NEXT: [[TMP1:%.*]] = mul nuw nsw i32 [[TMP0]], 1
----------------
Not important, but just out of curiosity, why is this moved up?
================
Comment at: llvm/test/Transforms/LoopVectorize/ARM/reduction-inloop-pred.ll:438
; CHECK-NEXT: [[INDUCTION:%.*]] = or <4 x i64> [[BROADCAST_SPLAT]], <i64 0, i64 1, i64 2, i64 3>
-; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDEX]]
-; CHECK-NEXT: [[TMP1:%.*]] = icmp ult <4 x i64> [[INDUCTION]], <i64 257, i64 257, i64 257, i64 257>
-; CHECK-NEXT: [[TMP2:%.*]] = bitcast i32* [[TMP0]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP2]], i32 4, <4 x i1> [[TMP1]], <4 x i32> undef)
+; CHECK-NEXT: [[TMP0:%.*]] = icmp ult <4 x i64> [[INDUCTION]], <i64 257, i64 257, i64 257, i64 257>
+; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDEX]]
----------------
Maybe a bit off topic for this patch, but are we tail-predicating this loop for MVE? Could we do that? The reason I am asking is that I am looking at this icmp of the induction against the BTC, for which we could emit get.active.lane.mask instead, and that would give us the tail-predication.
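To illustrate what I mean, here is a minimal scalar model in C++ of the two ways of computing the predicate. This is only a sketch: the helper names (activeLaneMask, icmpUltInduction) are made up for illustration and are not LLVM APIs, and it assumes VF = 4 and a vector index that is a multiple of 4, as in this test. The first helper follows the documented semantics of llvm.get.active.lane.mask (lane i is active iff base + i < n, with a non-wrapping add); the second models the icmp ult on the widened induction from the CHECK lines above.

```cpp
#include <array>
#include <cstdint>

// Assumed vectorization factor, matching the <4 x ...> types in the test.
constexpr unsigned VF = 4;
using Mask = std::array<bool, VF>;

// Hypothetical model of llvm.get.active.lane.mask(base, n):
// lane i is active iff base + i < n, computed without wrapping.
Mask activeLaneMask(std::uint64_t base, std::uint64_t n) {
  Mask m{};
  for (unsigned i = 0; i < VF; ++i)
    m[i] = base < n && i < n - base; // same as base + i < n, no overflow
  return m;
}

// Hypothetical model of the pattern in the test:
//   induction = splat(index) | <0, 1, 2, 3>
//   mask      = icmp ult induction, splat(bound)
Mask icmpUltInduction(std::uint64_t index, std::uint64_t bound) {
  Mask m{};
  for (unsigned i = 0; i < VF; ++i)
    m[i] = (index | i) < bound; // equals index + i when index % 4 == 0
  return m;
}
```

For the last vector iteration of this loop (index 256, bound 257) both helpers enable only lane 0, so under these assumptions the compare and the intrinsic describe the same predicate.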
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84451/new/
https://reviews.llvm.org/D84451