[PATCH] D103952: [CostModel][AArch64] Improve the cost estimate of CTPOP intrinsic
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 9 10:27:29 PDT 2021
dmgreen added a subscriber: Rin.
dmgreen added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:280
+ // CTPOP costs should match the codegen from
+ // llvm/test/CodeGen/AArch64/arm64-vpopcnt.ll
+ static const CostTblEntry CtpopCostTbl[] = {
----------------
It looks like this file doesn't contain all of the cases. As far as I understand, this is how it works:
- v8i8 and v16i8 are legal, so 1 instruction. Fantastic!
- v4i16 and v8i16 are converted to "v16i8 ctpop + addp". So cost 2
- v2i32 and v4i32 are converted to "v16i8 ctpop + addp + addp". So cost 3
- v1i64 and v2i64 are converted to "v16i8 ctpop + addp + addp + addp". So cost 4
Those are all good. For scalars there is no good instruction, though, and the generated code looks pretty terrible at the moment.
So an i8 becomes "and 0xff; expensive-mov; v8i8 cnt; addlv; expensive-mov". The others look equally expensive too. I would guess a cost of 5 would make sense?
Everything else would be legalized to one of those types. So you can probably use "auto LT = TLI->getTypeLegalizationCost(DL, RetTy);", use LT.second (the legalized type) for the table lookup, and return Entry->Cost * LT.first to include the cost of legalization (how many different vectors there will be).
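As a rough illustration of the lookup described above, here is a minimal standalone sketch. It does not use the real LLVM classes: SimpleVT, CostEntry, lookupCtpop and ctpopCost are hypothetical stand-ins for MVT, CostTblEntry and the table lookup, and the std::pair only models the (split count, legalized type) result of getTypeLegalizationCost. The vector costs are the ones listed in this comment; the scalar entries use the guessed cost of 5.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for LLVM's MVT; "Other" represents any type
// that has no entry in the table.
enum class SimpleVT {
  i8, i16, i32, i64,
  v8i8, v16i8, v4i16, v8i16, v2i32, v4i32, v1i64, v2i64,
  Other
};

// Hypothetical stand-in for LLVM's CostTblEntry (the ISD opcode is omitted
// because this table only covers CTPOP).
struct CostEntry {
  SimpleVT Type;
  unsigned Cost;
};

// One entry per line, ordered by element width.
static const CostEntry CtpopCostTbl[] = {
    {SimpleVT::v2i64, 4}, // cnt + addp + addp + addp
    {SimpleVT::v1i64, 4},
    {SimpleVT::v4i32, 3}, // cnt + addp + addp
    {SimpleVT::v2i32, 3},
    {SimpleVT::v8i16, 2}, // cnt + addp
    {SimpleVT::v4i16, 2},
    {SimpleVT::v16i8, 1}, // legal, a single cnt
    {SimpleVT::v8i8, 1},
    {SimpleVT::i64, 5},   // scalar costs: the guess from the review
    {SimpleVT::i32, 5},
    {SimpleVT::i16, 5},
    {SimpleVT::i8, 5},
};

// Linear search over the table; returns nullptr when the type is missing.
const CostEntry *lookupCtpop(SimpleVT VT) {
  for (const CostEntry &E : CtpopCostTbl)
    if (E.Type == VT)
      return &E;
  return nullptr;
}

// LT.first models the number of legal-typed pieces, LT.second the legalized
// type. Fallback is returned when the table has no entry for that type.
unsigned ctpopCost(std::pair<unsigned, SimpleVT> LT, unsigned Fallback) {
  assert(LT.first >= 1 && "expected at least one legalized piece");
  if (const CostEntry *Entry = lookupCtpop(LT.second))
    return Entry->Cost * LT.first;
  return Fallback;
}
```

Under these assumptions, a v8i32 ctpop that legalizes to two v4i32 halves would be modeled as ctpopCost(std::make_pair(2u, SimpleVT::v4i32), Fallback), giving 3 * 2 = 6.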
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:282
+ static const CostTblEntry CtpopCostTbl[] = {
+ {ISD::CTPOP, MVT::i64, 4}, {ISD::CTPOP, MVT::v2i64, 4},
+ {ISD::CTPOP, MVT::i32, 3}, {ISD::CTPOP, MVT::v2i32, 3},
----------------
The table might also be easier to read if it isn't reformatted by clang-format.
A different order might help too.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103952/new/
https://reviews.llvm.org/D103952