[PATCH] D89767: [CostModel][X86] teach TTI calculate cost of chain of vector inserts/extracts more precisely and correctly:In each 128-lane, if there is at least one index is demanded and not all indices are demanded...
Pengfei Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 20 07:42:19 PDT 2020
pengfei added inline comments.
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3088
+ // inserti128.
+ // Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.
+ unsigned Num128Lanes = LT.second.getSizeInBits() / 128 * LT.first;
----------------
Why does inserting the 5th element in case 2 need an extracti128 when case 3 doesn't?
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3089
+ // Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.
+ unsigned Num128Lanes = LT.second.getSizeInBits() / 128 * LT.first;
+ unsigned NumElts = LT.second.getVectorNumElements() * LT.first;
----------------
Do we need to skip this logic when LT.second.getSizeInBits() == 128?
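To make the 128-bit question concrete, here is a minimal standalone sketch of the two computations quoted above. It uses plain integers rather than the real LLVM types: `LegalizedType`, `num128Lanes`, and `numElts` are hypothetical stand-ins for the `LT` pair and the two expressions in the diff, and the example types are assumptions chosen for illustration.

```cpp
// Hypothetical stand-in for the legalization result used in the patch:
// `first` mirrors LT.first (how many legal registers the vector splits into),
// `sizeInBits` and `eltsPerReg` mirror LT.second's size and element count.
struct LegalizedType {
  unsigned first;
  unsigned sizeInBits;
  unsigned eltsPerReg;
};

// Mirrors: LT.second.getSizeInBits() / 128 * LT.first
constexpr unsigned num128Lanes(LegalizedType LT) {
  return LT.sizeInBits / 128 * LT.first;
}

// Mirrors: LT.second.getVectorNumElements() * LT.first
constexpr unsigned numElts(LegalizedType LT) {
  return LT.eltsPerReg * LT.first;
}

// A 128-bit legal type (e.g. v4i32) has exactly one lane.
static_assert(num128Lanes(LegalizedType{1, 128, 4}) == 1, "one lane");
// A 256-bit legal type (e.g. v8i32) has two lanes of four elements each.
static_assert(num128Lanes(LegalizedType{1, 256, 8}) == 2, "two lanes");
static_assert(numElts(LegalizedType{1, 256, 8}) == 8, "eight elements");
```

With a single 128-bit register the lane count is 1, so every index falls in the same lane and no cross-lane extracti128/inserti128 is involved. Whether the patch should return early in that case is the open question here.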
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D89767/new/
https://reviews.llvm.org/D89767