[PATCH] D89767: [CostModel][X86] teach TTI calculate cost of chain of vector inserts/extracts more precisely and correctly:In each 128-lane, if there is at least one index is demanded and not all indices are demanded...

Pengfei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 20 07:42:19 PDT 2020


pengfei added inline comments.


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3088
+        // inserti128.
+        // Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.
+        unsigned Num128Lanes = LT.second.getSizeInBits() / 128 * LT.first;
----------------
Why does inserting the 5th in case 2 need extracti128 when case 3 doesn't?


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3089
+        // Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.
+        unsigned Num128Lanes = LT.second.getSizeInBits() / 128 * LT.first;
+        unsigned NumElts = LT.second.getVectorNumElements() * LT.first;
----------------
Do we need to skip this when LT.second.getSizeInBits() == 128?
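
To make the 128-bit question concrete, here is a minimal standalone sketch of the two computations quoted above. It assumes plain integers in place of LT.first and LT.second, and the helper names are hypothetical, not taken from the patch. For a single legal 128-bit vector the lane count comes out as 1, which is the situation the question asks about.

```cpp
// Hypothetical stand-ins for the quoted lines; not the patch's actual code.
// NumParts plays the role of LT.first (how many legal registers the type
// splits into); SizeInBits and EltsPerPart play the roles of
// LT.second.getSizeInBits() and LT.second.getVectorNumElements().

// Mirrors: LT.second.getSizeInBits() / 128 * LT.first
unsigned num128Lanes(unsigned SizeInBits, unsigned NumParts) {
  return SizeInBits / 128 * NumParts;
}

// Mirrors: LT.second.getVectorNumElements() * LT.first
unsigned numElts(unsigned EltsPerPart, unsigned NumParts) {
  return EltsPerPart * NumParts;
}
```

With these helpers, a 128-bit v4i32 gives num128Lanes(128, 1) == 1 and numElts(4, 1) == 4, while a 256-bit v8i32 gives num128Lanes(256, 1) == 2 and numElts(8, 1) == 8.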


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89767/new/

https://reviews.llvm.org/D89767


