[PATCH] D124202: [SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 26 10:21:04 PDT 2022
dmgreen accepted this revision.
dmgreen added a comment.
This revision is now accepted and ready to land.
Thanks. LGTM
================
Comment at: llvm/test/Analysis/CostModel/AArch64/shuffle-load.ll:38-44
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sv2f16 = shufflevector <2 x half> %lv2f16, <2 x half> undef, <2 x i32> zeroinitializer
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %lv4f16 = load <4 x half>, ptr undef, align 8
-; CHECK-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %sv4f16 = shufflevector <4 x half> %lv4f16, <4 x half> undef, <4 x i32> zeroinitializer
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sv4f16 = shufflevector <4 x half> %lv4f16, <4 x half> undef, <4 x i32> zeroinitializer
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %lv8f16 = load <8 x half>, ptr undef, align 16
-; CHECK-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %sv8f16 = shufflevector <8 x half> %lv8f16, <8 x half> undef, <8 x i32> zeroinitializer
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sv8f16 = shufflevector <8 x half> %lv8f16, <8 x half> undef, <8 x i32> zeroinitializer
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %lv16f16 = load <16 x half>, ptr undef, align 32
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %sv16f16 = shufflevector <16 x half> %lv16f16, <16 x half> undef, <16 x i32> zeroinitializer
----------------
vporpo wrote:
> These look like big drops in cost. As far as I can tell `ld1r` does support broadcasting 16-bit float so it looks correct. @dmgreen could you confirm this?
Yeah, that sounds OK to me. FP16 was added in Armv8.2-A, and before that the costs are sometimes a little funny because the types are not legal. For a broadcast load, though, an ld1r isn't affected by the element type (just its size), so it should be OK to treat them as cheap.
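To make the reasoning concrete, here is a minimal standalone sketch of the rule, not the code in this patch. `SplatShuffle`, `isBroadcastLoadElementSize` and `splatCost` are hypothetical names and are not part of LLVM's TargetTransformInfo API. The sketch assumes a splat is free only when its operand comes straight from a load and the element width is one ld1r can replicate (8, 16, 32 or 64 bits), and it ignores vector-width legalization.

```cpp
// Hypothetical toy model of the costing rule discussed above; none of these
// names exist in LLVM.
struct SplatShuffle {
  bool OperandIsLoad;   // true if the splatted vector is produced by a load
  unsigned ElementBits; // scalar element width in bits, e.g. 16 for half
};

// ld1r replicates a single 8-, 16-, 32- or 64-bit element into every lane.
// Only the element size matters, not whether the element type is legal.
inline bool isBroadcastLoadElementSize(unsigned Bits) {
  return Bits == 8 || Bits == 16 || Bits == 32 || Bits == 64;
}

// Returns 0 when the load and the splat can fold into one broadcast load;
// otherwise returns the caller-supplied generic shuffle cost unchanged.
inline unsigned splatCost(const SplatShuffle &S, unsigned FallbackCost) {
  if (S.OperandIsLoad && isBroadcastLoadElementSize(S.ElementBits))
    return 0;
  return FallbackCost;
}
```

Under these assumptions, a splat of a loaded `<4 x half>` costs 0 regardless of whether FP16 arithmetic is available, while the same splat of a non-load operand keeps whatever generic cost the caller passes in (for example the 9 that the old CHECK line showed).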
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124202/new/
https://reviews.llvm.org/D124202