[llvm] [PowerPC] adjust cost for vector insert/extract with non const index (PR #79092)

via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 25 14:46:57 PST 2024


================
@@ -697,34 +697,44 @@ InstructionCost PPCTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val,
 
     return Cost;
 
-  } else if (Val->getScalarType()->isIntegerTy() && Index != -1U) {
+  } else if (Val->getScalarType()->isIntegerTy()) {
     unsigned EltSize = Val->getScalarSizeInBits();
     // Computing on 1 bit values requires extra mask or compare operations.
     unsigned MaskCost = VecMaskCost && EltSize == 1 ? 1 : 0;
     if (ST->hasP9Altivec()) {
-      if (ISD == ISD::INSERT_VECTOR_ELT)
-        // A move-to VSR and a permute/insert.  Assume vector operation cost
-        // for both (cost will be 2x on P9).
-        return 2 * CostFactor;
-
-      // It's an extract.  Maybe we can do a cheap move-from VSR.
-      unsigned EltSize = Val->getScalarSizeInBits();
-      if (EltSize == 64) {
-        unsigned MfvsrdIndex = ST->isLittleEndian() ? 1 : 0;
-        if (Index == MfvsrdIndex)
-          return 1;
-      } else if (EltSize == 32) {
-        unsigned MfvsrwzIndex = ST->isLittleEndian() ? 2 : 1;
-        if (Index == MfvsrwzIndex)
-          return 1;
-      }
-
-      // We need a vector extract (or mfvsrld).  Assume vector operation cost.
-      // The cost of the load constant for a vector extract is disregarded
-      // (invariant, easily schedulable).
-      return CostFactor + MaskCost;
+      // P10 has vxform insert which can handle non const index. The MaskCost is
+      // for masking the index.
+      // P9 has insert for const index. A move-to VSR and a permute/insert.
+      // Assume vector operation cost for both (cost will be 2x on P9).
+      if (ISD == ISD::INSERT_VECTOR_ELT) {
+        if (ST->isISA3_1())
+          return CostFactor + MaskCost;
----------------
RolandF77 wrote:

The basic idea of the change makes sense.  The issue is specifically the use of MaskCost to account for indexing overhead.  MaskCost is 0 unless the vector's element type is i1, which does not look like what is intended here.  It was added to deal with the overhead of sub-byte values extracted from i1 vectors.  I think you should probably use a different computation, one that is 0 unless the index is -1.
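
To make the distinction concrete, here is a minimal standalone sketch of the two computations. It is not the actual PPCTTIImpl code: `maskCost` mirrors the `MaskCost` expression from the hunk above, while `indexCost`, `p10InsertCost`, `UnknownIndex`, and the overhead value of 1 are hypothetical names and numbers chosen only for illustration.

```cpp
// Sentinel the cost hook receives when the index is not a known constant.
constexpr unsigned UnknownIndex = -1U;

// Mirrors the MaskCost expression in the hunk: nonzero only for i1 elements,
// regardless of whether the index is constant.
constexpr unsigned maskCost(bool VecMaskCost, unsigned EltSize) {
  return (VecMaskCost && EltSize == 1) ? 1 : 0;
}

// Hypothetical alternative: 0 unless the index is -1 (non-constant).
// The overhead of 1 is an assumed placeholder, not a measured cost.
constexpr unsigned indexCost(unsigned Index) {
  return Index == UnknownIndex ? 1 : 0;
}

// Hypothetical P10 insert cost that charges for the index, not the mask.
constexpr unsigned p10InsertCost(unsigned CostFactor, unsigned Index) {
  return CostFactor + indexCost(Index);
}

// For an i32 vector with a non-constant index, MaskCost contributes nothing,
// while the index-based term does.
static_assert(maskCost(true, 32) == 0, "MaskCost ignores the index");
static_assert(indexCost(UnknownIndex) == 1, "non-constant index is charged");
static_assert(indexCost(3) == 0, "constant index adds no overhead");
```

With this shape, a constant-index insert costs just `CostFactor`, and only the non-constant case pays the extra term, independent of the element type.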

https://github.com/llvm/llvm-project/pull/79092

