[PATCH] D89767: [CostModel][X86] teach TTI calculate cost of chain of vector inserts/extracts more precisely and correctly:In each 128-lane, if there is at least one index is demanded and not all indices are demanded...

Bing Yu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 20 19:51:14 PDT 2020


yubing added inline comments.


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3088
+        // inserti128.
+        // Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.
+        unsigned Num128Lanes = LT.second.getSizeInBits() / 128 * LT.first;
----------------
pengfei wrote:
> Why inserting 5th in case 2 needs extracti128 but case 3 doesn't?
In Case#3, we don't need to preserve the other elements, because every index in that 128-bit lane is inserted. Thus, extracti128 is not needed.
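
To make the distinction concrete, here is a minimal standalone sketch of the per-lane counting described in this thread. It is not the code from the patch: `LaneOps` and `countLaneOps` are hypothetical names, the demanded set is modeled as a plain `std::vector<bool>` instead of an APInt, and every 128-bit lane is treated the same way (the quoted snippet does not show whether any lane is special-cased, so that is left out).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type, not an LLVM API: tallies the operations implied
// by a set of demanded element indices.
struct LaneOps {
  unsigned ScalarInserts = 0; // one per demanded element (e.g. vpinsrd)
  unsigned Extracts = 0;      // extracti128, only for partially demanded lanes
  unsigned Inserts = 0;       // inserti128, for every lane that is touched
};

// Assumption: Demanded.size() is a multiple of EltsPerLane, where
// EltsPerLane is the number of elements that fit in one 128-bit lane.
LaneOps countLaneOps(const std::vector<bool> &Demanded, unsigned EltsPerLane) {
  assert(EltsPerLane > 0 && Demanded.size() % EltsPerLane == 0);
  LaneOps Ops;
  for (std::size_t Base = 0; Base < Demanded.size(); Base += EltsPerLane) {
    unsigned NumDemanded = 0;
    for (unsigned I = 0; I < EltsPerLane; ++I)
      NumDemanded += Demanded[Base + I] ? 1u : 0u;

    if (NumDemanded == 0)
      continue; // Lane is untouched: no cost.

    Ops.ScalarInserts += NumDemanded;
    // Partially demanded lane: the remaining elements must be preserved,
    // so the lane is extracted first.
    if (NumDemanded < EltsPerLane)
      ++Ops.Extracts;
    // Any touched lane is written back into the wide vector.
    ++Ops.Inserts;
  }
  return Ops;
}
```

With 8 x i32 (4 elements per lane), demanding indices 4,5,6,7 yields 4 scalar inserts, no extract, and 1 inserti128, while demanding a single index in the upper lane yields 1 scalar insert, 1 extracti128, and 1 inserti128.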


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3089
+        // Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.
+        unsigned Num128Lanes = LT.second.getSizeInBits() / 128 * LT.first;
+        unsigned NumElts = LT.second.getVectorNumElements() * LT.first;
----------------
pengfei wrote:
> Do we need to skip if LT.second.getSizeInBits() == 128?
We've already checked (LT.second.getSizeInBits() <= 128) at line 3073, so no extra skip is needed here.
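
As a rough illustration of the arithmetic in the quoted lines, the sketch below mirrors the two computations with plain integers. `LegalizedType`, `LaneLayout`, and `computeLaneLayout` are hypothetical stand-ins for this example only; they are not LLVM types, and the early return merely models the earlier `<= 128` check mentioned above.

```cpp
#include <cassert>

// Hypothetical stand-in for the legalization result used in the diff.
struct LegalizedType {
  unsigned NumParts;    // plays the role of LT.first
  unsigned SizeInBits;  // plays the role of LT.second.getSizeInBits()
  unsigned NumElements; // plays the role of LT.second.getVectorNumElements()
};

struct LaneLayout {
  unsigned Num128Lanes = 0;
  unsigned NumElts = 0;
  unsigned EltsPerLane = 0;
};

// Returns false for types that fit in 128 bits (handled by the earlier
// check); otherwise fills in the lane layout.
bool computeLaneLayout(const LegalizedType &LT, LaneLayout &Out) {
  if (LT.SizeInBits <= 128)
    return false;
  assert(LT.SizeInBits % 128 == 0 && LT.NumParts > 0);
  // Same expressions as in the quoted diff lines.
  Out.Num128Lanes = LT.SizeInBits / 128 * LT.NumParts;
  Out.NumElts = LT.NumElements * LT.NumParts;
  // Derived value for this sketch: elements per 128-bit lane.
  Out.EltsPerLane = Out.NumElts / Out.Num128Lanes;
  return true;
}
```

For example, a single 256-bit part with 8 elements gives 2 lanes of 4 elements, and two such parts give 4 lanes covering 16 elements.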


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89767/new/

https://reviews.llvm.org/D89767


