[llvm] [PowerPC] adjust cost for vector insert/extract with non const index (PR #79092)

Chen Zheng via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 25 05:27:59 PST 2024


================
@@ -697,34 +697,44 @@ InstructionCost PPCTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val,
 
     return Cost;
 
-  } else if (Val->getScalarType()->isIntegerTy() && Index != -1U) {
+  } else if (Val->getScalarType()->isIntegerTy()) {
     unsigned EltSize = Val->getScalarSizeInBits();
     // Computing on 1 bit values requires extra mask or compare operations.
     unsigned MaskCost = VecMaskCost && EltSize == 1 ? 1 : 0;
     if (ST->hasP9Altivec()) {
-      if (ISD == ISD::INSERT_VECTOR_ELT)
-        // A move-to VSR and a permute/insert.  Assume vector operation cost
-        // for both (cost will be 2x on P9).
-        return 2 * CostFactor;
-
-      // It's an extract.  Maybe we can do a cheap move-from VSR.
-      unsigned EltSize = Val->getScalarSizeInBits();
-      if (EltSize == 64) {
-        unsigned MfvsrdIndex = ST->isLittleEndian() ? 1 : 0;
-        if (Index == MfvsrdIndex)
-          return 1;
-      } else if (EltSize == 32) {
-        unsigned MfvsrwzIndex = ST->isLittleEndian() ? 2 : 1;
-        if (Index == MfvsrwzIndex)
-          return 1;
-      }
-
-      // We need a vector extract (or mfvsrld).  Assume vector operation cost.
-      // The cost of the load constant for a vector extract is disregarded
-      // (invariant, easily schedulable).
-      return CostFactor + MaskCost;
+      // P10 has vxform insert which can handle non const index. The MaskCost is
+      // for masking the index.
+      // P9 has insert for const index. A move-to VSR and a permute/insert.
+      // Assume vector operation cost for both (cost will be 2x on P9).
+      if (ISD == ISD::INSERT_VECTOR_ELT) {
+        if (ST->isISA3_1())
+          return CostFactor + MaskCost;
----------------
chenzheng1030 wrote:

Thanks for the review, @RolandF77.

For the "legal" types for insertelement, such as i8/i16/i32/i64, on P10 (see https://godbolt.org/z/jcor7KEWr), each of them can be lowered to one vector instruction plus a few arithmetic instructions.
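For readers who want to reproduce this kind of test from C++ rather than IR, here is a minimal sketch of an insert and an extract with a non-constant index. It assumes the GCC/Clang `vector_size` extension; the names `v4i32`, `insertAt` and `extractAt` are made up for illustration and are not taken from the godbolt link. The index is masked so that it always stays in range for a 4-element vector.

```cpp
#include <cstdint>

// GCC/Clang vector extension: four 32-bit integer lanes in 16 bytes.
// (Illustrative type name, not from the PR.)
typedef int32_t v4i32 __attribute__((vector_size(16)));

// Insert Value at a lane that is only known at run time.
// Index & 3 keeps the lane number in the range [0, 3].
v4i32 insertAt(v4i32 Vec, int32_t Value, unsigned Index) {
  Vec[Index & 3] = Value;
  return Vec;
}

// Extract from a lane that is only known at run time.
int32_t extractAt(v4i32 Vec, unsigned Index) {
  return Vec[Index & 3];
}
```

With optimization enabled, Clang typically represents these subscripts as `insertelement` / `extractelement` with a variable index operand, which is the case this cost change targets. The same pattern with `int8_t`, `int16_t` or `int64_t` lanes should cover the other element sizes listed above.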

Are there other types I should test? 

The issue https://github.com/llvm/llvm-project/issues/50249 is not about P10; it is about P9.

Before the change, on P9, an extractelement with a non-constant index had its cost calculated at line 750. That cost is based on loading the required element from the memory where the whole vector is stored, which overestimates the cost of such an extractelement.
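To make the pre-change behaviour concrete, here is a standalone C++17 sketch of the removed integer branch from the hunk above. It is only a model: `SubtargetModel`, `VecOp`, `NonConstIndex` and `oldIntegerBranchCost` are hypothetical names, the element type is assumed to be an integer, and `std::nullopt` stands for "this branch does not apply" (the non-P9 handling is not shown in the hunk, so it is modelled the same way). It does not model the later memory-based path at all.

```cpp
#include <optional>

// Hypothetical stand-in for the subtarget queries used in the hunk.
struct SubtargetModel {
  bool HasP9Altivec;
  bool IsLittleEndian;
};

enum class VecOp { Insert, Extract };

// The hunk compares Index against -1U to detect a non-constant index.
constexpr unsigned NonConstIndex = -1U;

// Model of the removed branch. Returns std::nullopt when the branch is
// skipped, so the caller would fall through to a later cost path.
std::optional<unsigned> oldIntegerBranchCost(const SubtargetModel &ST,
                                             VecOp Op, unsigned EltSize,
                                             unsigned Index,
                                             unsigned CostFactor,
                                             bool VecMaskCost) {
  // Old condition: `Index != -1U`. A non-constant index never got here.
  if (Index == NonConstIndex)
    return std::nullopt;
  // Assumption: the non-P9 case is not shown in the hunk.
  if (!ST.HasP9Altivec)
    return std::nullopt;

  // Computing on 1-bit values requires extra mask or compare operations.
  unsigned MaskCost = (VecMaskCost && EltSize == 1) ? 1 : 0;

  // Insert: a move-to VSR plus a permute/insert.
  if (Op == VecOp::Insert)
    return 2 * CostFactor;

  // Extract: some lanes can use a cheap move-from VSR.
  if (EltSize == 64) {
    unsigned MfvsrdIndex = ST.IsLittleEndian ? 1 : 0;
    if (Index == MfvsrdIndex)
      return 1;
  } else if (EltSize == 32) {
    unsigned MfvsrwzIndex = ST.IsLittleEndian ? 2 : 1;
    if (Index == MfvsrwzIndex)
      return 1;
  }

  // Otherwise a vector extract (or mfvsrld) is needed.
  return CostFactor + MaskCost;
}
```

In this model, passing `NonConstIndex` always yields `std::nullopt`, which corresponds to the old code skipping this branch and costing the extract through the memory-based path instead.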

https://github.com/llvm/llvm-project/pull/79092

