[llvm] [TTI][AArch64] Detect OperandInfo from scalable splats. (PR #122469)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 13 03:00:04 PST 2025
================
@@ -260,22 +260,22 @@ define void @udiv_uniformconst() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V16i8 = udiv <16 x i8> undef, splat (i8 7)
; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V32i8 = udiv <32 x i8> undef, splat (i8 7)
; CHECK-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %V64i8 = udiv <64 x i8> undef, splat (i8 7)
-; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 7)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 7)
----------------
david-arm wrote:
Hi @davemgreen, thanks for splitting this patch out from #122236, it's really helpful to see which changes are affecting which costs. However, I do agree with @hassnaaHamdi that something looks wrong here, and it isn't fixed by #122236 either. I would assume that in `@sdiv_uniformconst` the instruction `sdiv <vscale x 8 x i32> undef, splat (i32 7)` gets legalised into two `sdiv <vscale x 4 x i32> undef, splat (i32 7)` instructions, with the results being concatenated together. Right now the vectoriser will believe that VF=vscale x 8 is almost half the cost of VF=vscale x 4, which doesn't seem right, since the wider type should cost roughly twice as much as the legal one. Something similar happens in `@sdiv_uniformconstnegpow2`.
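To make the expectation concrete, here is a rough sketch of the arithmetic I have in mind. The helper names `numLegalParts` and `expectedSplitCost` are hypothetical and are not LLVM APIs; the sketch assumes a 128-bit minimum SVE register granule, treats the concatenation of the split results as free, and takes the legal-type cost as an input rather than the value the cost model actually returns.

```cpp
#include <cassert>

// Hypothetical helper (not part of LLVM): number of legal SVE registers that
// <vscale x MinElts x iEltBits> is split into, assuming a 128-bit minimum
// register granule. Types at or below the granule count as a single part.
constexpr unsigned numLegalParts(unsigned MinElts, unsigned EltBits) {
  constexpr unsigned GranuleBits = 128;
  const unsigned MinBits = MinElts * EltBits;
  return MinBits <= GranuleBits ? 1
                                : (MinBits + GranuleBits - 1) / GranuleBits;
}

// Hypothetical helper: expected cost of an operation on a possibly illegal
// type, modelled as one legal-type operation per part. LegalCost is whatever
// the cost model reports for the legal type (an input, not a known value).
constexpr unsigned expectedSplitCost(unsigned MinElts, unsigned EltBits,
                                     unsigned LegalCost) {
  return numLegalParts(MinElts, EltBits) * LegalCost;
}
```

Under these assumptions `numLegalParts(8, 32)` is 2 and `numLegalParts(4, 32)` is 1, so whatever cost is reported for `<vscale x 4 x i32>`, I would expect roughly double that for `<vscale x 8 x i32>`, not about half.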
Having said that, in general I do agree this is a step in the right direction for most cases, because previously the divide costs were too low. The throughputs for udiv/sdiv are at best 1/7 on neoverse-v1 according to the optimisation guide. This patch just increases the costs for divides by splats, and then #122236 follows on for the more general case. I do like this patch and am happy to accept it, but perhaps it's worth first understanding what's going on with the illegal types?
https://github.com/llvm/llvm-project/pull/122469