[PATCH] D89767: [CostModel][X86] teach TTI calculate cost of chain of vector inserts/extracts more precisely and correctly:In each 128-lane, if there is at least one index is demanded and not all indices are demanded...

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 22 02:11:36 PDT 2020


RKSimon added inline comments.


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3077
       } else {
-        unsigned NumSubVecs = LT.second.getSizeInBits() / 128;
-        Cost += (PowerOf2Ceil(NumSubVecs) - 1) * LT.first;
+        // In each 128-lane, if there is at least one index is demanded and not
+        // all indices are demanded and this 128-lane is not the first 128-lane
----------------
"if at least one index is demanded but not all indices are demanded"


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3080
+        // of the legalized-vector, then this 128-lane needs a extracti128; If
+        // in each 128-lane, there is at least one index is demanded, this
+        // 128-lane needs a inserti128.
----------------
", there is at least one demanded index, this"


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3099
+          for (unsigned I = 0; I < WidenedDemandedElts.getBitWidth();
+               I += Scale) {
+            APInt Mask = WidenedDemandedElts.getBitsSet(NumElts, I, I + Scale);
----------------
Isn't WidenedDemandedElts.getBitWidth() == NumElts at this stage?
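
For context, the per-128-bit-lane accounting described in the patch comments above (an extracti128 for a partially demanded lane that is not the first lane of its legalized vector, plus an inserti128 for any lane with a demanded index) could be sketched standalone roughly as follows. This is an illustration under stated assumptions, not the patch's code: laneOpCost, EltsPerLane and EltsPerLegalVec are hypothetical names, and a plain std::vector<bool> stands in for the APInt demanded-elements mask.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the patch): counts the extra 128-bit lane
// operations implied by the rule in the quoted comments.
//   Demanded[i]     - true if element i is demanded
//   EltsPerLane     - number of elements in one 128-bit lane
//   EltsPerLegalVec - number of elements in one legalized vector
//                     (assumed to be a multiple of EltsPerLane)
unsigned laneOpCost(const std::vector<bool> &Demanded, std::size_t EltsPerLane,
                    std::size_t EltsPerLegalVec) {
  unsigned Cost = 0;
  for (std::size_t I = 0; I < Demanded.size(); I += EltsPerLane) {
    // Count the demanded elements that fall into this 128-bit lane.
    std::size_t Population = 0;
    for (std::size_t J = I; J < I + EltsPerLane && J < Demanded.size(); ++J)
      Population += Demanded[J] ? 1 : 0;

    const bool IsFirstLane = (I % EltsPerLegalVec) == 0;

    // Partially demanded lane that is not the first lane of its legalized
    // vector: one extract of the lane.
    if (Population > 0 && Population != EltsPerLane && !IsFirstLane)
      ++Cost;

    // Any lane with at least one demanded index: one insert of the lane.
    if (Population > 0)
      ++Cost;
  }
  return Cost;
}
```

With 8 elements split into two 4-element lanes, demanding only element 5 touches just the upper lane, and the sketch charges one extract plus one insert for it. Demanding all 8 elements charges one insert per lane and no extract.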


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3109
+        };
+        Cost += calculateCostOfExtInsr128Lanes();
         Cost += DemandedElts.countPopulation();
----------------
I'm not sure what we gain by making calculateCostOfExtInsr128Lanes() a lambda function.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89767/new/

https://reviews.llvm.org/D89767


