[llvm] [TTI][RISCV]Improve costs for whole vector reg extract/insert. (PR #80164)
Craig Topper via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 28 10:58:53 PST 2024
================
@@ -326,6 +326,18 @@ InstructionCost RISCVTTIImpl::getShuffleCost(TTI::ShuffleKind Kind,
   switch (Kind) {
   default:
     break;
+  case TTI::SK_InsertSubvector: {
+    auto *FSubTy = cast<FixedVectorType>(SubTp);
+    unsigned TpRegs = getRegUsageForType(Tp);
+    unsigned SubTpRegs = getRegUsageForType(SubTp);
+    unsigned NextSubTpRegs = getRegUsageForType(FixedVectorType::get(
+        Tp->getElementType(), FSubTy->getNumElements() + 1));
+    // Whole vector insert - just the vector itself.
+    if (Index == 0 && SubTpRegs != 0 && SubTpRegs != NextSubTpRegs &&
----------------
topperc wrote:
I don't think this works. getRegUsageForType returns the maximum number of registers needed, which assumes the minimum VLEN. If the runtime VLEN is larger, the number of registers actually used could be smaller.
The backend must always use a vslideup for a fixed vector insert unless we know that the minimum and maximum VLEN are the same. I think you have to check `ST.getRealVLen()`.
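
To make the concern concrete, here is a minimal standalone sketch, not the actual patch or LLVM code. The names `VLenBounds`, `regsForFixedVector`, `exactVLen`, and `isKnownWholeRegInsert` are hypothetical, and `exactVLen` only imitates the idea of `ST.getRealVLen()` by yielding a value when the minimum and maximum VLEN agree. Restricting the check to `Index == 0` follows the condition in the hunk above and is a simplification.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the subtarget's VLEN bounds, in bits.
struct VLenBounds {
  unsigned MinVLen;
  unsigned MaxVLen;
};

// Number of vector registers a fixed vector of VecBits bits occupies
// when each register is VLen bits wide (rounded up).
unsigned regsForFixedVector(unsigned VecBits, unsigned VLen) {
  assert(VLen != 0 && "VLEN must be nonzero");
  return (VecBits + VLen - 1) / VLen;
}

// Assumption: models the idea behind ST.getRealVLen(). The exact VLEN is
// only known when the minimum and maximum bounds are equal.
std::optional<unsigned> exactVLen(const VLenBounds &B) {
  if (B.MinVLen == B.MaxVLen)
    return B.MinVLen;
  return std::nullopt;
}

// A subvector insert at index 0 can be costed as a whole-register insert
// only if VLEN is known exactly and the subvector fills whole registers.
bool isKnownWholeRegInsert(const VLenBounds &B, unsigned SubVecBits,
                           unsigned Index) {
  std::optional<unsigned> VLen = exactVLen(B);
  return VLen && Index == 0 && SubVecBits != 0 && SubVecBits % *VLen == 0;
}
```

With a minimum VLEN of 128, a 256-bit fixed vector is counted as two registers, but at a runtime VLEN of 256 it occupies only one, so a register count derived from the minimum VLEN alone cannot prove that an insert covers whole registers.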
https://github.com/llvm/llvm-project/pull/80164