[PATCH] D147720: [LV] Use the known trip count when costing non-tail folded VFs
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 24 07:35:13 PDT 2023
sdesmalen added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5410-5414
+ auto RTCostB =
+ foldTailByMasking()
+ ? (CostB * divideCeil(MaxTripCount, B.Width.getFixedValue()))
+ : (CostB * (MaxTripCount / B.Width.getFixedValue()) +
+ B.ScalarCost * (MaxTripCount % B.Width.getFixedValue()));
----------------
nit: Is it worth using a lambda for this, e.g.
  auto GetCostForTC = [MaxTripCount, this](unsigned VF, InstructionCost VectorCost,
                                           InstructionCost ScalarCost) {
    return foldTailByMasking()
               ? VectorCost * divideCeil(MaxTripCount, VF)
               : VectorCost * (MaxTripCount / VF) + ScalarCost * (MaxTripCount % VF);
  };
  auto RTCostA = GetCostForTC(A.Width.getFixedValue(), CostA, A.ScalarCost);
  auto RTCostB = GetCostForTC(B.Width.getFixedValue(), CostB, B.ScalarCost);
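
To make the arithmetic concrete, here is a minimal standalone sketch of the same formula. It assumes plain `uint64_t` costs instead of `InstructionCost`, and `divideCeilSketch` and `costForTripCount` are hypothetical names used only for illustration, not functions from the patch:

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::divideCeil: integer division, rounding up.
constexpr uint64_t divideCeilSketch(uint64_t Numerator, uint64_t Denominator) {
  return (Numerator + Denominator - 1) / Denominator;
}

// Estimated cost of the whole loop for a known trip count TC at a given VF.
// With tail folding, every iteration (including the partial last one) runs
// as a masked vector iteration. Without it, the remainder runs in the scalar
// epilogue at ScalarCost per iteration.
constexpr uint64_t costForTripCount(uint64_t TC, uint64_t VF,
                                    uint64_t VectorCost, uint64_t ScalarCost,
                                    bool FoldTailByMasking) {
  return FoldTailByMasking
             ? VectorCost * divideCeilSketch(TC, VF)
             : VectorCost * (TC / VF) + ScalarCost * (TC % VF);
}

// TC = 10, VF = 4, vector cost 6, scalar cost 2 (made-up numbers):
// tail folded: 3 vector iterations; otherwise 2 vector + 2 scalar iterations.
static_assert(costForTripCount(10, 4, 6, 2, true) == 18, "tail-folded cost");
static_assert(costForTripCount(10, 4, 6, 2, false) == 16, "epilogue cost");
```

With these made-up numbers, the non-tail-folded estimate comes out cheaper because only two iterations fall into the scalar remainder.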
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll:98
%inc = add nuw nsw i8 %i.08, 1
- %exitcond.not = icmp eq i8 %inc, 12345
+ %exitcond.not = icmp eq i8 %inc, 241
br i1 %exitcond.not, label %for.end, label %for.body
----------------
I'm curious why this test needed changing. What VF does it pick with 12345?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D147720/new/
https://reviews.llvm.org/D147720