[llvm] [TTI][RISCV]Improve costs for whole vector reg extract/insert. (PR #80164)

Craig Topper via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 28 10:58:53 PST 2024


================
@@ -326,6 +326,18 @@ InstructionCost RISCVTTIImpl::getShuffleCost(TTI::ShuffleKind Kind,
     switch (Kind) {
     default:
       break;
+    case TTI::SK_InsertSubvector: {
+      auto *FSubTy = cast<FixedVectorType>(SubTp);
+      unsigned TpRegs = getRegUsageForType(Tp);
+      unsigned SubTpRegs = getRegUsageForType(SubTp);
+      unsigned NextSubTpRegs = getRegUsageForType(FixedVectorType::get(
+          Tp->getElementType(), FSubTy->getNumElements() + 1));
+      // Whole vector insert - just the vector itself.
+      if (Index == 0 && SubTpRegs != 0 && SubTpRegs != NextSubTpRegs &&
----------------
topperc wrote:

I don't think this works. `getRegUsageForType` returns the maximum number of registers needed, which is computed for the minimum VLEN. If the runtime VLEN is larger, the type could occupy fewer registers.

The backend must always use a vslideup for a fixed vector insert unless we know the minimum and maximum VLEN are the same. I think you have to check `ST.getRealVLen()`.
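
To illustrate the concern, here is a minimal standalone sketch, not the actual LLVM code. The helpers `regsForBits`, `exactVLen`, and `isWholeRegInsert` are hypothetical names made up for this example. They model a register count that depends on VLEN, and an exact-VLEN query that only yields a value when the minimum and maximum agree.

```cpp
#include <optional>

// Hypothetical helper: number of vector registers needed to hold a fixed
// vector of VecBits bits when each register is VLen bits wide (rounded up).
unsigned regsForBits(unsigned VecBits, unsigned VLen) {
  return (VecBits + VLen - 1) / VLen;
}

// Hypothetical stand-in for an exact-VLEN query: yields a value only when
// the minimum and maximum VLEN are the same, otherwise nothing.
std::optional<unsigned> exactVLen(unsigned MinVLen, unsigned MaxVLen) {
  if (MinVLen != MaxVLen)
    return std::nullopt;
  return MinVLen;
}

// Sketch of the suggested check: an insert at index 0 is treated as a
// whole-register insert only if VLEN is known exactly and the subvector
// fills a whole number of registers at that VLEN.
bool isWholeRegInsert(unsigned SubVecBits, unsigned Index, unsigned MinVLen,
                      unsigned MaxVLen) {
  std::optional<unsigned> VLen = exactVLen(MinVLen, MaxVLen);
  return VLen.has_value() && Index == 0 && SubVecBits != 0 &&
         SubVecBits % *VLen == 0;
}
```

Under these assumptions, a 256-bit fixed vector needs two registers at VLEN=128 but only one at VLEN=256, so a 128-bit subvector that looks like a whole register at the minimum VLEN is only half a register at the larger one. That is why the sketch refuses to report a whole-register insert unless the two bounds match.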

https://github.com/llvm/llvm-project/pull/80164

