[PATCH] D132477: Improve cost model for some 128-bit vector operations that use SVE
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 12 03:53:17 PDT 2022
sdesmalen added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2039-2042
+ // SDIV/UDIV operations are lowered, then we can have less cost.
+ // for 32/64-bit elements, just the base cost is enough.
+ // but for 8/16-bit elements, there will be higher cost becaues they
+ // require extra work :
----------------
sdesmalen wrote:
> nit: How about:
>
> // For 8/16-bit element types, the cost is higher because the type requires
> // promotion and possibly splitting.
Just FYI that this comment seems unaddressed.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2050-2051
+ } else {
+ // On AArch64, vector divisions are not supported natively and are
+ // expanded into scalar divisions of each pair of elements.
+ Cost += getArithmeticInstrCost(Instruction::ExtractElement, Ty,
----------------
sdesmalen wrote:
> nit: this comment needs updating now.
Just FYI that this comment is unaddressed.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2055
InstructionCost Cost = BaseT::getArithmeticInstrCost(
Opcode, Ty, CostKind, Op1Info, Op2Info);
if (Ty->isVectorTy()) {
----------------
For the case of using SVE, this probably shouldn't be looking at the original type, but rather the type that is legal for the *operation*.
There are two separate things going on here:
1. The *type* legalization cost, which is the cost of getting a legal type.
2. The *operation* legalization cost, which currently assumes a legal type. As your code already points out, operations are only legal on 32- and 64-bit element types.
The way the code-generator handles this is that for 8 and 16-bit inputs, the *type* legalizer first tries to legalize them to something the code-generator can handle. After that, the code-generator will try to promote it to a type that the *operation* can handle.
e.g.
* if the type is <2 x i8> that will be promoted to <2 x i32>. This is all fine, we can use the cost of LT.first (The cost of a DIV for v2i32) and combine it with the legalisation cost for going from v2i8 -> v2i32.
* if the type is <4 x i16> that is a legal type, but for the DIV operation v4i16 is _not_ legal, so we'll need to consider legalisation for v4i32.
So for i8 and i16 element types, it should call `getTypeLegalizationCost();` again for the type that we know it will be promoted to. That type is a vector with the same number of lanes, but i32 element types.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2066
+ Cost *= 4;
+ // consider the cost of legalizaiton
+ return Cost;
----------------
nit: redundant comment, please remove.
================
Comment at: llvm/test/Analysis/CostModel/AArch64/sve-fixed-length.ll:69-75
+; expected cost: 4
+; CHECK: cost of [[#mul(div(16-1, VBITS)+1, 2)]] for instruction: %sdiv16.i8 = sdiv <2 x i8> undef, undef
+ %sdiv16.i8 = sdiv <2 x i8> undef, undef
+
+; expected cost: 16
+; CHECK: cost of [[#mul(mul(div(32-1, VBITS)+1, 2), 4)]] for instruction: %sdiv32.i8 = sdiv <4 x i8> undef, undef
+ %sdiv32.i8 = sdiv <4 x i8> undef, undef
----------------
In practice the cost of `sdiv <4 x i8>` is not 4 times as high as `sdiv <2 x i8>`, see e.g. https://godbolt.org/z/jM3M8rWsf so I think the costs are still not entirely right. Also see my other comment on the code itself.
================
Comment at: llvm/test/Analysis/CostModel/AArch64/sve-fixed-length.ll:70
+; expected cost: 4
+; CHECK: cost of [[#mul(div(16-1, VBITS)+1, 2)]] for instruction: %sdiv16.i8 = sdiv <2 x i8> undef, undef
+ %sdiv16.i8 = sdiv <2 x i8> undef, undef
----------------
nit: 16-1 => 15 ?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132477/new/
https://reviews.llvm.org/D132477
More information about the llvm-commits
mailing list