[llvm] 750bf35 - [AArch64] Increase cost of v2i64 multiplies

David Green via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 4 09:42:25 PDT 2022


Author: David Green
Date: 2022-04-04T17:42:20+01:00
New Revision: 750bf3582a6d475e0ada22fe86cbd84cd1739eee

URL: https://github.com/llvm/llvm-project/commit/750bf3582a6d475e0ada22fe86cbd84cd1739eee
DIFF: https://github.com/llvm/llvm-project/commit/750bf3582a6d475e0ada22fe86cbd84cd1739eee.diff

LOG: [AArch64] Increase cost of v2i64 multiplies

The cost of a v2i64 multiply was special cased in D92208 as scalarized
into 4*extract + 2*insert + 2*mul. Scalarizing to/from gpr registers are
expensive though, and the cost wasn't high enough to prevent vectorizing
in places where it can be detrimental for performance. This increases it
so that the costs of copying to/from GPRs is increased to 2 each, with
the total cost increasing to 14. So long as umull/smull are handled
correctly (as in D123006) this seems to lead to better vectorization
factors and better performance.

Differential Revision: https://reviews.llvm.org/D123007

Added: 
    

Modified: 
    llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
    llvm/test/Analysis/CostModel/AArch64/arith-overflow.ll
    llvm/test/Analysis/CostModel/AArch64/arith-widening.ll
    llvm/test/Analysis/CostModel/AArch64/arith.ll
    llvm/test/Analysis/CostModel/AArch64/mul.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 2b00652860862..73203bca02794 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -1841,15 +1841,15 @@ InstructionCost AArch64TTIImpl::getArithmeticInstrCost(
     // as elements are extracted from the vectors and the muls scalarized.
     // As getScalarizationOverhead is a bit too pessimistic, we estimate the
     // cost for a i64 vector directly here, which is:
-    // - four i64 extracts,
-    // - two i64 inserts, and
-    // - two muls.
-    // So, for a v2i64 with LT.First = 1 the cost is 8, and for a v4i64 with
-    // LT.first = 2 the cost is 16. If both operands are extensions it will not
+    // - four 2-cost i64 extracts,
+    // - two 2-cost i64 inserts, and
+    // - two 1-cost muls.
+    // So, for a v2i64 with LT.First = 1 the cost is 14, and for a v4i64 with
+    // LT.first = 2 the cost is 28. If both operands are extensions it will not
     // need to scalarize so the cost can be cheaper (smull or umull).
     if (LT.second != MVT::v2i64 || isWideningInstruction(Ty, Opcode, Args))
       return LT.first;
-    return LT.first * 8;
+    return LT.first * 14;
   case ISD::ADD:
   case ISD::XOR:
   case ISD::OR:

diff  --git a/llvm/test/Analysis/CostModel/AArch64/arith-overflow.ll b/llvm/test/Analysis/CostModel/AArch64/arith-overflow.ll
index 68a8ede330024..aa75b66509cd7 100644
--- a/llvm/test/Analysis/CostModel/AArch64/arith-overflow.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/arith-overflow.ll
@@ -357,9 +357,9 @@ define i32 @smul(i32 %arg) {
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 44 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.smul.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 88 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.smul.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %I32 = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 undef, i32 undef)
-; RECIP-NEXT:  Cost Model: Found an estimated cost of 24 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.smul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)
-; RECIP-NEXT:  Cost Model: Found an estimated cost of 48 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.smul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)
-; RECIP-NEXT:  Cost Model: Found an estimated cost of 96 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.smul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)
+; RECIP-NEXT:  Cost Model: Found an estimated cost of 36 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.smul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)
+; RECIP-NEXT:  Cost Model: Found an estimated cost of 72 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.smul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)
+; RECIP-NEXT:  Cost Model: Found an estimated cost of 144 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.smul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %I16 = call { i16, i1 } @llvm.smul.with.overflow.i16(i16 undef, i16 undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.smul.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.smul.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)
@@ -439,9 +439,9 @@ define i32 @umul(i32 %arg) {
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 42 for instruction: %V4I64 = call { <4 x i64>, <4 x i1> } @llvm.umul.with.overflow.v4i64(<4 x i64> undef, <4 x i64> undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 84 for instruction: %V8I64 = call { <8 x i64>, <8 x i1> } @llvm.umul.with.overflow.v8i64(<8 x i64> undef, <8 x i64> undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %I32 = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 undef, i32 undef)
-; RECIP-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.umul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)
-; RECIP-NEXT:  Cost Model: Found an estimated cost of 46 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.umul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)
-; RECIP-NEXT:  Cost Model: Found an estimated cost of 92 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)
+; RECIP-NEXT:  Cost Model: Found an estimated cost of 35 for instruction: %V4I32 = call { <4 x i32>, <4 x i1> } @llvm.umul.with.overflow.v4i32(<4 x i32> undef, <4 x i32> undef)
+; RECIP-NEXT:  Cost Model: Found an estimated cost of 70 for instruction: %V8I32 = call { <8 x i32>, <8 x i1> } @llvm.umul.with.overflow.v8i32(<8 x i32> undef, <8 x i32> undef)
+; RECIP-NEXT:  Cost Model: Found an estimated cost of 140 for instruction: %V16I32 = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> undef, <16 x i32> undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %I16 = call { i16, i1 } @llvm.umul.with.overflow.i16(i16 undef, i16 undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %V8I16 = call { <8 x i16>, <8 x i1> } @llvm.umul.with.overflow.v8i16(<8 x i16> undef, <8 x i16> undef)
 ; RECIP-NEXT:  Cost Model: Found an estimated cost of 30 for instruction: %V16I16 = call { <16 x i16>, <16 x i1> } @llvm.umul.with.overflow.v16i16(<16 x i16> undef, <16 x i16> undef)

diff  --git a/llvm/test/Analysis/CostModel/AArch64/arith-widening.ll b/llvm/test/Analysis/CostModel/AArch64/arith-widening.ll
index 0b96a0090ea12..d0f6e1cc66de0 100644
--- a/llvm/test/Analysis/CostModel/AArch64/arith-widening.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/arith-widening.ll
@@ -1554,15 +1554,15 @@ define void @extmulv2(<2 x i8> %i8, <2 x i16> %i16, <2 x i32> %i32, <2 x i64> %i
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zl2_8_32 = zext <2 x i8> %i8 to <2 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %azl_8_32 = mul <2 x i32> %zl1_8_32, %zl2_8_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sw_8_64 = sext <2 x i8> %i8 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %asw_8_64 = mul <2 x i64> %i64, %sw_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %asw_8_64 = mul <2 x i64> %i64, %sw_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sl1_8_64 = sext <2 x i8> %i8 to <2 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sl2_8_64 = sext <2 x i8> %i8 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %asl_8_64 = mul <2 x i64> %sl1_8_64, %sl2_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %asl_8_64 = mul <2 x i64> %sl1_8_64, %sl2_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zw_8_64 = zext <2 x i8> %i8 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %azw_8_64 = mul <2 x i64> %i64, %zw_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %azw_8_64 = mul <2 x i64> %i64, %zw_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zl1_8_64 = zext <2 x i8> %i8 to <2 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zl2_8_64 = zext <2 x i8> %i8 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %azl_8_64 = mul <2 x i64> %zl1_8_64, %zl2_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %azl_8_64 = mul <2 x i64> %zl1_8_64, %zl2_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sw_16_32 = sext <2 x i16> %i16 to <2 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %asw_16_32 = mul <2 x i32> %i32, %sw_16_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sl1_16_32 = sext <2 x i16> %i16 to <2 x i32>
@@ -1574,22 +1574,22 @@ define void @extmulv2(<2 x i8> %i8, <2 x i16> %i16, <2 x i32> %i32, <2 x i64> %i
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zl2_16_32 = zext <2 x i16> %i16 to <2 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %azl_16_32 = mul <2 x i32> %zl1_16_32, %zl2_16_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sw_16_64 = sext <2 x i16> %i16 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %asw_16_64 = mul <2 x i64> %i64, %sw_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %asw_16_64 = mul <2 x i64> %i64, %sw_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sl1_16_64 = sext <2 x i16> %i16 to <2 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sl2_16_64 = sext <2 x i16> %i16 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %asl_16_64 = mul <2 x i64> %sl1_16_64, %sl2_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %asl_16_64 = mul <2 x i64> %sl1_16_64, %sl2_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zw_16_64 = zext <2 x i16> %i16 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %azw_16_64 = mul <2 x i64> %i64, %zw_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %azw_16_64 = mul <2 x i64> %i64, %zw_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zl1_16_64 = zext <2 x i16> %i16 to <2 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zl2_16_64 = zext <2 x i16> %i16 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %azl_16_64 = mul <2 x i64> %zl1_16_64, %zl2_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %azl_16_64 = mul <2 x i64> %zl1_16_64, %zl2_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sw_32_64 = sext <2 x i32> %i32 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %asw_32_64 = mul <2 x i64> %i64, %sw_32_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %asw_32_64 = mul <2 x i64> %i64, %sw_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl1_32_64 = sext <2 x i32> %i32 to <2 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl2_32_64 = sext <2 x i32> %i32 to <2 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %asl_32_64 = mul <2 x i64> %sl1_32_64, %sl2_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zw_32_64 = zext <2 x i32> %i32 to <2 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %azw_32_64 = mul <2 x i64> %i64, %zw_32_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %azw_32_64 = mul <2 x i64> %i64, %zw_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl1_32_64 = zext <2 x i32> %i32 to <2 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl2_32_64 = zext <2 x i32> %i32 to <2 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %azl_32_64 = mul <2 x i64> %zl1_32_64, %zl2_32_64
@@ -1693,15 +1693,15 @@ define void @extmulv4(<4 x i8> %i8, <4 x i16> %i16, <4 x i32> %i32, <4 x i64> %i
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %zl2_8_32 = zext <4 x i8> %i8 to <4 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %azl_8_32 = mul <4 x i32> %zl1_8_32, %zl2_8_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sw_8_64 = sext <4 x i8> %i8 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %asw_8_64 = mul <4 x i64> %i64, %sw_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %asw_8_64 = mul <4 x i64> %i64, %sw_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sl1_8_64 = sext <4 x i8> %i8 to <4 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sl2_8_64 = sext <4 x i8> %i8 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %asl_8_64 = mul <4 x i64> %sl1_8_64, %sl2_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %asl_8_64 = mul <4 x i64> %sl1_8_64, %sl2_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %zw_8_64 = zext <4 x i8> %i8 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %azw_8_64 = mul <4 x i64> %i64, %zw_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %azw_8_64 = mul <4 x i64> %i64, %zw_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %zl1_8_64 = zext <4 x i8> %i8 to <4 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %zl2_8_64 = zext <4 x i8> %i8 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %azl_8_64 = mul <4 x i64> %zl1_8_64, %zl2_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %azl_8_64 = mul <4 x i64> %zl1_8_64, %zl2_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %sw_16_32 = sext <4 x i16> %i16 to <4 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %asw_16_32 = mul <4 x i32> %i32, %sw_16_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl1_16_32 = sext <4 x i16> %i16 to <4 x i32>
@@ -1713,22 +1713,22 @@ define void @extmulv4(<4 x i8> %i8, <4 x i16> %i16, <4 x i32> %i32, <4 x i64> %i
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl2_16_32 = zext <4 x i16> %i16 to <4 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %azl_16_32 = mul <4 x i32> %zl1_16_32, %zl2_16_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sw_16_64 = sext <4 x i16> %i16 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %asw_16_64 = mul <4 x i64> %i64, %sw_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %asw_16_64 = mul <4 x i64> %i64, %sw_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sl1_16_64 = sext <4 x i16> %i16 to <4 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %sl2_16_64 = sext <4 x i16> %i16 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %asl_16_64 = mul <4 x i64> %sl1_16_64, %sl2_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %asl_16_64 = mul <4 x i64> %sl1_16_64, %sl2_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %zw_16_64 = zext <4 x i16> %i16 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %azw_16_64 = mul <4 x i64> %i64, %zw_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %azw_16_64 = mul <4 x i64> %i64, %zw_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %zl1_16_64 = zext <4 x i16> %i16 to <4 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %zl2_16_64 = zext <4 x i16> %i16 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %azl_16_64 = mul <4 x i64> %zl1_16_64, %zl2_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %azl_16_64 = mul <4 x i64> %zl1_16_64, %zl2_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sw_32_64 = sext <4 x i32> %i32 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %asw_32_64 = mul <4 x i64> %i64, %sw_32_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %asw_32_64 = mul <4 x i64> %i64, %sw_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl1_32_64 = sext <4 x i32> %i32 to <4 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl2_32_64 = sext <4 x i32> %i32 to <4 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %asl_32_64 = mul <4 x i64> %sl1_32_64, %sl2_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %zw_32_64 = zext <4 x i32> %i32 to <4 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %azw_32_64 = mul <4 x i64> %i64, %zw_32_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %azw_32_64 = mul <4 x i64> %i64, %zw_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl1_32_64 = zext <4 x i32> %i32 to <4 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl2_32_64 = zext <4 x i32> %i32 to <4 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %azl_32_64 = mul <4 x i64> %zl1_32_64, %zl2_32_64
@@ -1832,15 +1832,15 @@ define void @extmulv8(<8 x i8> %i8, <8 x i16> %i16, <8 x i32> %i32, <8 x i64> %i
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %zl2_8_32 = zext <8 x i8> %i8 to <8 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %azl_8_32 = mul <8 x i32> %zl1_8_32, %zl2_8_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %sw_8_64 = sext <8 x i8> %i8 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %asw_8_64 = mul <8 x i64> %i64, %sw_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %asw_8_64 = mul <8 x i64> %i64, %sw_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %sl1_8_64 = sext <8 x i8> %i8 to <8 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %sl2_8_64 = sext <8 x i8> %i8 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %asl_8_64 = mul <8 x i64> %sl1_8_64, %sl2_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %asl_8_64 = mul <8 x i64> %sl1_8_64, %sl2_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %zw_8_64 = zext <8 x i8> %i8 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %azw_8_64 = mul <8 x i64> %i64, %zw_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %azw_8_64 = mul <8 x i64> %i64, %zw_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %zl1_8_64 = zext <8 x i8> %i8 to <8 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %zl2_8_64 = zext <8 x i8> %i8 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %azl_8_64 = mul <8 x i64> %zl1_8_64, %zl2_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %azl_8_64 = mul <8 x i64> %zl1_8_64, %zl2_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %sw_16_32 = sext <8 x i16> %i16 to <8 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %asw_16_32 = mul <8 x i32> %i32, %sw_16_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl1_16_32 = sext <8 x i16> %i16 to <8 x i32>
@@ -1852,22 +1852,22 @@ define void @extmulv8(<8 x i8> %i8, <8 x i16> %i16, <8 x i32> %i32, <8 x i64> %i
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl2_16_32 = zext <8 x i16> %i16 to <8 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %azl_16_32 = mul <8 x i32> %zl1_16_32, %zl2_16_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sw_16_64 = sext <8 x i16> %i16 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %asw_16_64 = mul <8 x i64> %i64, %sw_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %asw_16_64 = mul <8 x i64> %i64, %sw_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sl1_16_64 = sext <8 x i16> %i16 to <8 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %sl2_16_64 = sext <8 x i16> %i16 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %asl_16_64 = mul <8 x i64> %sl1_16_64, %sl2_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %asl_16_64 = mul <8 x i64> %sl1_16_64, %sl2_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %zw_16_64 = zext <8 x i16> %i16 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %azw_16_64 = mul <8 x i64> %i64, %zw_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %azw_16_64 = mul <8 x i64> %i64, %zw_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %zl1_16_64 = zext <8 x i16> %i16 to <8 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %zl2_16_64 = zext <8 x i16> %i16 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %azl_16_64 = mul <8 x i64> %zl1_16_64, %zl2_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %azl_16_64 = mul <8 x i64> %zl1_16_64, %zl2_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sw_32_64 = sext <8 x i32> %i32 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %asw_32_64 = mul <8 x i64> %i64, %sw_32_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %asw_32_64 = mul <8 x i64> %i64, %sw_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl1_32_64 = sext <8 x i32> %i32 to <8 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl2_32_64 = sext <8 x i32> %i32 to <8 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %asl_32_64 = mul <8 x i64> %sl1_32_64, %sl2_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %zw_32_64 = zext <8 x i32> %i32 to <8 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %azw_32_64 = mul <8 x i64> %i64, %zw_32_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %azw_32_64 = mul <8 x i64> %i64, %zw_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl1_32_64 = zext <8 x i32> %i32 to <8 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl2_32_64 = zext <8 x i32> %i32 to <8 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %azl_32_64 = mul <8 x i64> %zl1_32_64, %zl2_32_64
@@ -1971,15 +1971,15 @@ define void @extmulv16(<16 x i8> %i8, <16 x i16> %i16, <16 x i32> %i32, <16 x i6
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %zl2_8_32 = zext <16 x i8> %i8 to <16 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %azl_8_32 = mul <16 x i32> %zl1_8_32, %zl2_8_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %sw_8_64 = sext <16 x i8> %i8 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %asw_8_64 = mul <16 x i64> %i64, %sw_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %asw_8_64 = mul <16 x i64> %i64, %sw_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %sl1_8_64 = sext <16 x i8> %i8 to <16 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %sl2_8_64 = sext <16 x i8> %i8 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %asl_8_64 = mul <16 x i64> %sl1_8_64, %sl2_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %asl_8_64 = mul <16 x i64> %sl1_8_64, %sl2_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %zw_8_64 = zext <16 x i8> %i8 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %azw_8_64 = mul <16 x i64> %i64, %zw_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %azw_8_64 = mul <16 x i64> %i64, %zw_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %zl1_8_64 = zext <16 x i8> %i8 to <16 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %zl2_8_64 = zext <16 x i8> %i8 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %azl_8_64 = mul <16 x i64> %zl1_8_64, %zl2_8_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %azl_8_64 = mul <16 x i64> %zl1_8_64, %zl2_8_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %sw_16_32 = sext <16 x i16> %i16 to <16 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %asw_16_32 = mul <16 x i32> %i32, %sw_16_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl1_16_32 = sext <16 x i16> %i16 to <16 x i32>
@@ -1991,22 +1991,22 @@ define void @extmulv16(<16 x i8> %i8, <16 x i16> %i16, <16 x i32> %i32, <16 x i6
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl2_16_32 = zext <16 x i16> %i16 to <16 x i32>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %azl_16_32 = mul <16 x i32> %zl1_16_32, %zl2_16_32
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %sw_16_64 = sext <16 x i16> %i16 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %asw_16_64 = mul <16 x i64> %i64, %sw_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %asw_16_64 = mul <16 x i64> %i64, %sw_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %sl1_16_64 = sext <16 x i16> %i16 to <16 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %sl2_16_64 = sext <16 x i16> %i16 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %asl_16_64 = mul <16 x i64> %sl1_16_64, %sl2_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %asl_16_64 = mul <16 x i64> %sl1_16_64, %sl2_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %zw_16_64 = zext <16 x i16> %i16 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %azw_16_64 = mul <16 x i64> %i64, %zw_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %azw_16_64 = mul <16 x i64> %i64, %zw_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %zl1_16_64 = zext <16 x i16> %i16 to <16 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %zl2_16_64 = zext <16 x i16> %i16 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %azl_16_64 = mul <16 x i64> %zl1_16_64, %zl2_16_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %azl_16_64 = mul <16 x i64> %zl1_16_64, %zl2_16_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %sw_32_64 = sext <16 x i32> %i32 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %asw_32_64 = mul <16 x i64> %i64, %sw_32_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %asw_32_64 = mul <16 x i64> %i64, %sw_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl1_32_64 = sext <16 x i32> %i32 to <16 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %sl2_32_64 = sext <16 x i32> %i32 to <16 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %asl_32_64 = mul <16 x i64> %sl1_32_64, %sl2_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %zw_32_64 = zext <16 x i32> %i32 to <16 x i64>
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %azw_32_64 = mul <16 x i64> %i64, %zw_32_64
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %azw_32_64 = mul <16 x i64> %i64, %zw_32_64
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl1_32_64 = zext <16 x i32> %i32 to <16 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %zl2_32_64 = zext <16 x i32> %i32 to <16 x i64>
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %azl_32_64 = mul <16 x i64> %zl1_32_64, %zl2_32_64

diff  --git a/llvm/test/Analysis/CostModel/AArch64/arith.ll b/llvm/test/Analysis/CostModel/AArch64/arith.ll
index 5ac9efa6e53de..405bbf5247965 100644
--- a/llvm/test/Analysis/CostModel/AArch64/arith.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/arith.ll
@@ -368,7 +368,7 @@ define void @vi64() {
 ; CHECK-LABEL: 'vi64'
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %c2 = add <2 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %d2 = sub <2 x i64> undef, undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %e2 = mul <2 x i64> undef, undef
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %e2 = mul <2 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %f2 = ashr <2 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %g2 = lshr <2 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %h2 = shl <2 x i64> undef, undef
@@ -377,7 +377,7 @@ define void @vi64() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %k2 = xor <2 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %c4 = add <4 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %d4 = sub <4 x i64> undef, undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %e4 = mul <4 x i64> undef, undef
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %e4 = mul <4 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %f4 = ashr <4 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %g4 = lshr <4 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %h4 = shl <4 x i64> undef, undef
@@ -386,7 +386,7 @@ define void @vi64() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %k4 = xor <4 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %c8 = add <8 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %d8 = sub <8 x i64> undef, undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %e8 = mul <8 x i64> undef, undef
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 56 for instruction: %e8 = mul <8 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %f8 = ashr <8 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %g8 = lshr <8 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %h8 = shl <8 x i64> undef, undef
@@ -395,7 +395,7 @@ define void @vi64() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %k8 = xor <8 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %c16 = add <16 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %d16 = sub <16 x i64> undef, undef
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %e16 = mul <16 x i64> undef, undef
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 112 for instruction: %e16 = mul <16 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %f16 = ashr <16 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %g16 = lshr <16 x i64> undef, undef
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %h16 = shl <16 x i64> undef, undef

diff  --git a/llvm/test/Analysis/CostModel/AArch64/mul.ll b/llvm/test/Analysis/CostModel/AArch64/mul.ll
index 7c7110aabb3ce..2a93e7e24a181 100644
--- a/llvm/test/Analysis/CostModel/AArch64/mul.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/mul.ll
@@ -113,7 +113,7 @@ define <8 x i32> @t12(<8 x i32> %a, <8 x i32> %b)  {
 
 define <2 x i64> @t13(<2 x i64> %a, <2 x i64> %b)  {
 ; THROUGHPUT-LABEL: 't13'
-; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %1 = mul nsw <2 x i64> %a, %b
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %1 = mul nsw <2 x i64> %a, %b
 ; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret <2 x i64> %1
 ;
   %1 = mul nsw <2 x i64> %a, %b
@@ -122,7 +122,7 @@ define <2 x i64> @t13(<2 x i64> %a, <2 x i64> %b)  {
 
 define <4 x i64> @t14(<4 x i64> %a, <4 x i64> %b)  {
 ; THROUGHPUT-LABEL: 't14'
-; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %1 = mul nsw <4 x i64> %a, %b
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: %1 = mul nsw <4 x i64> %a, %b
 ; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret <4 x i64> %1
 ;
   %1 = mul nsw <4 x i64> %a, %b


        


More information about the llvm-commits mailing list