[PATCH] D132477: Improve cost model for some 128-bit vector operations that use SVE
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 22 06:11:16 PDT 2022
david-arm added a comment.
Hi @hassnaa-arm, thanks for updating the patch with more accurate costs for <128 bit vectors. It's almost there I think - just a few more comments and then I'll be happy!
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2100
// need to scalarize so the cost can be cheaper (smull or umull).
+ // so the cost can be cheaper (smull or umull).
if (LT.second != MVT::v2i64 || isWideningInstruction(Ty, Opcode, Args))
----------------
david-arm wrote:
> nit: This line looks unnecessary.
This line still needs removing I think?
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2088
+ if (TLI->isOperationLegalOrCustom(ISD, LT.second) && ST->hasSVE()) {
+ // SDIV/UDIV operations are lowered, then we can have less cost.
+ // For 8/16-bit elements, the cost is higher because the type
----------------
nit: Perhaps this is better written as something like
// If SDIV/UDIV operations are lowered using SVE sdiv/udiv
// instructions, then the cost is less because don't need to
// scalarize the operation.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2089
+ // SDIV/UDIV operations are lowered, then we can have less cost.
+ // For 8/16-bit elements, the cost is higher because the type
+ // requires promotion and possibly splitting:
----------------
nit: I think this comment now needs moving down below to where we check the element type.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2091
+ // requires promotion and possibly splitting:
+ if (isa<FixedVectorType>(Ty)) {
+ EVT VT = TLI->getValueType(DL, Ty);
----------------
I think you can make some of the following code a bit simpler if you also check the type size here, i.e.
if (isa<FixedVectorType>(Ty) && cast<FixedVectorType>(Ty)->getPrimitiveSizeInBits().getFixedSize() < 128) {
This means you can remove the `if (VT.getSizeInBits() < 128u)` below.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2101
+
+ if (VT.getSizeInBits() < 128u) {
+ const auto *Entry = CostTableLookup(DivTbl, ISD, VT.getSimpleVT());
----------------
Just FYI `getSizeInBits()` returns a `TypeSize` class (llvm/include/llvm/Support/TypeSize.h) that tries to represent both fixed-width and scalable sizes. The code you've written here works, but it's relying upon a cast from a TypeSize class to an unsigned integer that only works for fixed-width vectors. We'd like to avoid relying upon these casts and so for things like this we'd prefer to use `getFixedSizeInBits()` instead, which asserts the TypeSize is fixed.
================
Comment at: llvm/test/Analysis/CostModel/AArch64/sve-fixed-length.ll:63
+; Assuming legalization_cost = (vec_len-1/VBITS)+1
+; Assuming extra cost of 8 for i8.
+; Assuming extra cost of 4 for i16.
----------------
nit: I think it might be clearer to say:
; Assuming extra cost multipler of 8 for i8.
; Assuming extra cost multipler of 4 for i8.
================
Comment at: llvm/test/Analysis/CostModel/AArch64/sve-fixed-length.ll:93
+
+; expected cost: 32
+; CHECK: cost of [[#mul(mul(div(128-1, VBITS)+1, 2), 8)]] for instruction: %sdiv128.i8 = sdiv <16 x i8> undef, undef
----------------
Looks like these `; expected cost` comments aren't quite right here. For example, for <16 x i8> it looks like the cost should 2*8=16. I think it's probably worth either:
1. Removing all the `; expected cost` comments in the file because they may need updating over time, or
2. Update them to reflect the correct costs for each operation.
I'm happy with either!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132477/new/
https://reviews.llvm.org/D132477
More information about the llvm-commits
mailing list