[PATCH] D89767: [CostModel][X86] teach TTI calculate cost of chain of vector inserts/extracts more precisely and correctly:In each 128-lane, if there is at least one index is demanded and not all indices are demanded...
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 22 02:11:36 PDT 2020
RKSimon added inline comments.
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3077
} else {
- unsigned NumSubVecs = LT.second.getSizeInBits() / 128;
- Cost += (PowerOf2Ceil(NumSubVecs) - 1) * LT.first;
+ // In each 128-lane, if there is at least one index is demanded and not
+ // all indices are demanded and this 128-lane is not the first 128-lane
----------------
"if at least one index is demanded but not all indices are demanded"
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3080
+ // of the legalized-vector, then this 128-lane needs a extracti128; If
+ // in each 128-lane, there is at least one index is demanded, this
+ // 128-lane needs a inserti128.
----------------
", there is at least one demanded index, this"
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3099
+ for (unsigned I = 0; I < WidenedDemandedElts.getBitWidth();
+ I += Scale) {
+ APInt Mask = WidenedDemandedElts.getBitsSet(NumElts, I, I + Scale);
----------------
Isn't WidenedDemandedElts.getBitWidth() == NumElts at this stage?
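For readers following the thread, the lane rule described in the quoted comments can be sketched in isolation. This is a minimal standalone illustration, not the code in the patch: `lane128Overhead` and its parameters are hypothetical names, a plain `uint64_t` stands in for `APInt`, and the loop bound is simply `NumElts`. It counts only the extracti128/inserti128 overhead, not the per-element cost that the patch adds separately via `DemandedElts.countPopulation()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from D89767): counts the extracti128/inserti128
// operations implied by the rule in the quoted comment.
//   DemandedElts    - bit i set means element i is inserted/extracted
//   NumElts         - elements in the widened vector (at most 64 here)
//   EltsPerLane     - elements per 128-bit lane ("Scale" in the patch)
//   EltsPerLegalVec - elements in one legalized vector register
unsigned lane128Overhead(uint64_t DemandedElts, unsigned NumElts,
                         unsigned EltsPerLane, unsigned EltsPerLegalVec) {
  assert(NumElts <= 64 && EltsPerLane > 0 && EltsPerLane < 64);
  assert(EltsPerLegalVec > 0 && NumElts % EltsPerLane == 0 &&
         EltsPerLegalVec % EltsPerLane == 0);
  unsigned Cost = 0;
  const uint64_t LaneMask = (uint64_t{1} << EltsPerLane) - 1;
  for (unsigned I = 0; I < NumElts; I += EltsPerLane) {
    const uint64_t Lane = (DemandedElts >> I) & LaneMask;
    if (Lane == 0)
      continue; // nothing demanded in this 128-bit lane
    const bool AllDemanded = Lane == LaneMask;
    const bool FirstLaneOfVec = I % EltsPerLegalVec == 0;
    // Partially demanded lane that is not the first lane of its legalized
    // vector: one extracti128.
    if (!AllDemanded && !FirstLaneOfVec)
      ++Cost;
    // Any lane with at least one demanded index: one inserti128.
    ++Cost;
  }
  return Cost;
}
```

For a v8i32 vector (8 elements, 4 per lane, one legalized register), demanding only index 1 yields 1, demanding only index 5 yields 2, and demanding indices 4 to 7 yields 1 under this reading of the comment.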
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:3109
+ };
+ Cost += calculateCostOfExtInsr128Lanes();
Cost += DemandedElts.countPopulation();
----------------
I'm not sure what we gain by having calculateCostOfExtInsr128Lanes() as a lambda function.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D89767/new/
https://reviews.llvm.org/D89767