[llvm] dfbefe4 - [CostModel][X86] Add missing AVX1 costs for PMULUDQ v4i64 pattern (#157475)

via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 8 08:06:38 PDT 2025


Author: Simon Pilgrim
Date: 2025-09-08T15:06:33Z
New Revision: dfbefe4a65019fd8dc2ff8257b11519808a27a13

URL: https://github.com/llvm/llvm-project/commit/dfbefe4a65019fd8dc2ff8257b11519808a27a13
DIFF: https://github.com/llvm/llvm-project/commit/dfbefe4a65019fd8dc2ff8257b11519808a27a13.diff

LOG: [CostModel][X86] Add missing AVX1 costs for PMULUDQ v4i64 pattern (#157475)

We need to account for AVX1 targets splitting 256-bit PMULUDQ-style
patterns. We included an AVX2 cost but missed AVX1, even though v4i64
is a legal type there.

Noticed while working on #156568
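
As a rough illustration of how the new table row relates to the updated
test expectations: the figures in mul64.ll are consistent with the v4i64
row being multiplied by the number of 256-bit pieces a wider <N x i64>
vector is split into. The sketch below is not the X86TTIImpl code; the
CostKinds struct, numV4I64Parts and scaleCost are hypothetical names used
only for this example, and the field order (throughput, latency, code
size, size-and-latency) is an assumption inferred from the test output.

```cpp
// Hypothetical stand-in for one cost-table row holding four cost kinds.
// Assumed order: reciprocal throughput, latency, code size, size+latency.
struct CostKinds {
  unsigned RThru;
  unsigned Lat;
  unsigned CodeSize;
  unsigned SizeLat;
};

// The row added by this commit for X86ISD::PMULUDQ on MVT::v4i64 (AVX1).
constexpr CostKinds AVX1PmuludqV4I64 = {3, 5, 5, 6};

// Number of v4i64 pieces a <NumElts x i64> vector is split into.
// Assumes NumElts is a multiple of 4, as in the test cases.
constexpr unsigned numV4I64Parts(unsigned NumElts) { return NumElts / 4; }

// Scale every cost kind of a row by the number of split pieces.
constexpr CostKinds scaleCost(CostKinds Entry, unsigned Parts) {
  return {Entry.RThru * Parts, Entry.Lat * Parts, Entry.CodeSize * Parts,
          Entry.SizeLat * Parts};
}

// Compile-time check against the <8 x i64> expectation in mul64.ll.
static_assert(scaleCost(AVX1PmuludqV4I64, numV4I64Parts(8)).RThru == 6,
              "two v4i64 pieces at RThru 3 each");
```

For example, scaleCost(AVX1PmuludqV4I64, numV4I64Parts(64)) gives
RThru 48, Lat 80, CodeSize 80 and SizeLat 96, which matches the new
AVX1 line for %res64 in the test.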

Added: 
    

Modified: 
    llvm/lib/Target/X86/X86TargetTransformInfo.cpp
    llvm/test/Analysis/CostModel/X86/mul64.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index 910c724a4e3aa..3d8d0a236a3c1 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -1206,6 +1206,8 @@ InstructionCost X86TTIImpl::getArithmeticInstrCost(
     { ISD::MUL,     MVT::v4i32,   {  2,  5,  1,  3 } }, // pmulld
     { ISD::MUL,     MVT::v4i64,   { 12, 15, 19, 20 } },
 
+    { X86ISD::PMULUDQ, MVT::v4i64, { 3,  5, 5, 6 } }, // pmuludq + split
+
     { ISD::AND,     MVT::v32i8,   {  1,  1, 1, 2 } }, // vandps
     { ISD::AND,     MVT::v16i16,  {  1,  1, 1, 2 } }, // vandps
     { ISD::AND,     MVT::v8i32,   {  1,  1, 1, 2 } }, // vandps

diff  --git a/llvm/test/Analysis/CostModel/X86/mul64.ll b/llvm/test/Analysis/CostModel/X86/mul64.ll
index 9e4794760404f..9af20febeaffa 100644
--- a/llvm/test/Analysis/CostModel/X86/mul64.ll
+++ b/llvm/test/Analysis/CostModel/X86/mul64.ll
@@ -327,11 +327,11 @@ define void @mul_zext_vXi8(<2 x i8> %a2, <2 x i8> %b2, <4 x i8> %a4, <4 x i8> %b
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:50 CodeSize:1 Lat:1 SizeLat:1 for: %xa64 = zext <64 x i8> %a64 to <64 x i64>
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:50 CodeSize:1 Lat:1 SizeLat:1 for: %xb64 = zext <64 x i8> %b64 to <64 x i64>
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:5 SizeLat:1 for: %res2 = mul <2 x i64> %xa2, %xb2
-; AVX1-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %res4 = mul <4 x i64> %xa4, %xb4
-; AVX1-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:1 SizeLat:1 for: %res8 = mul <8 x i64> %xa8, %xb8
-; AVX1-NEXT:  Cost Model: Found costs of RThru:8 CodeSize:1 Lat:1 SizeLat:1 for: %res16 = mul <16 x i64> %xa16, %xb16
-; AVX1-NEXT:  Cost Model: Found costs of RThru:16 CodeSize:1 Lat:1 SizeLat:1 for: %res32 = mul <32 x i64> %xa32, %xb32
-; AVX1-NEXT:  Cost Model: Found costs of RThru:32 CodeSize:1 Lat:1 SizeLat:1 for: %res64 = mul <64 x i64> %xa64, %xb64
+; AVX1-NEXT:  Cost Model: Found costs of RThru:3 CodeSize:5 Lat:5 SizeLat:6 for: %res4 = mul <4 x i64> %xa4, %xb4
+; AVX1-NEXT:  Cost Model: Found costs of RThru:6 CodeSize:10 Lat:10 SizeLat:12 for: %res8 = mul <8 x i64> %xa8, %xb8
+; AVX1-NEXT:  Cost Model: Found costs of RThru:12 CodeSize:20 Lat:20 SizeLat:24 for: %res16 = mul <16 x i64> %xa16, %xb16
+; AVX1-NEXT:  Cost Model: Found costs of RThru:24 CodeSize:40 Lat:40 SizeLat:48 for: %res32 = mul <32 x i64> %xa32, %xb32
+; AVX1-NEXT:  Cost Model: Found costs of RThru:48 CodeSize:80 Lat:80 SizeLat:96 for: %res64 = mul <64 x i64> %xa64, %xb64
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret void
 ;
 ; AVX2-LABEL: 'mul_zext_vXi8'
@@ -985,11 +985,11 @@ define void @mul_zext_vXi16(<2 x i16> %a2, <2 x i16> %b2, <4 x i16> %a4, <4 x i1
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:52 CodeSize:1 Lat:1 SizeLat:1 for: %xa64 = zext <64 x i16> %a64 to <64 x i64>
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:52 CodeSize:1 Lat:1 SizeLat:1 for: %xb64 = zext <64 x i16> %b64 to <64 x i64>
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:5 SizeLat:1 for: %res2 = mul <2 x i64> %xa2, %xb2
-; AVX1-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %res4 = mul <4 x i64> %xa4, %xb4
-; AVX1-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:1 SizeLat:1 for: %res8 = mul <8 x i64> %xa8, %xb8
-; AVX1-NEXT:  Cost Model: Found costs of RThru:8 CodeSize:1 Lat:1 SizeLat:1 for: %res16 = mul <16 x i64> %xa16, %xb16
-; AVX1-NEXT:  Cost Model: Found costs of RThru:16 CodeSize:1 Lat:1 SizeLat:1 for: %res32 = mul <32 x i64> %xa32, %xb32
-; AVX1-NEXT:  Cost Model: Found costs of RThru:32 CodeSize:1 Lat:1 SizeLat:1 for: %res64 = mul <64 x i64> %xa64, %xb64
+; AVX1-NEXT:  Cost Model: Found costs of RThru:3 CodeSize:5 Lat:5 SizeLat:6 for: %res4 = mul <4 x i64> %xa4, %xb4
+; AVX1-NEXT:  Cost Model: Found costs of RThru:6 CodeSize:10 Lat:10 SizeLat:12 for: %res8 = mul <8 x i64> %xa8, %xb8
+; AVX1-NEXT:  Cost Model: Found costs of RThru:12 CodeSize:20 Lat:20 SizeLat:24 for: %res16 = mul <16 x i64> %xa16, %xb16
+; AVX1-NEXT:  Cost Model: Found costs of RThru:24 CodeSize:40 Lat:40 SizeLat:48 for: %res32 = mul <32 x i64> %xa32, %xb32
+; AVX1-NEXT:  Cost Model: Found costs of RThru:48 CodeSize:80 Lat:80 SizeLat:96 for: %res64 = mul <64 x i64> %xa64, %xb64
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret void
 ;
 ; AVX2-LABEL: 'mul_zext_vXi16'
@@ -1643,11 +1643,11 @@ define void @mul_zext_vXi32(<2 x i32> %a2, <2 x i32> %b2, <4 x i32> %a4, <4 x i3
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:56 CodeSize:1 Lat:1 SizeLat:1 for: %xa64 = zext <64 x i32> %a64 to <64 x i64>
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:56 CodeSize:1 Lat:1 SizeLat:1 for: %xb64 = zext <64 x i32> %b64 to <64 x i64>
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:5 SizeLat:1 for: %res2 = mul <2 x i64> %xa2, %xb2
-; AVX1-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %res4 = mul <4 x i64> %xa4, %xb4
-; AVX1-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:1 SizeLat:1 for: %res8 = mul <8 x i64> %xa8, %xb8
-; AVX1-NEXT:  Cost Model: Found costs of RThru:8 CodeSize:1 Lat:1 SizeLat:1 for: %res16 = mul <16 x i64> %xa16, %xb16
-; AVX1-NEXT:  Cost Model: Found costs of RThru:16 CodeSize:1 Lat:1 SizeLat:1 for: %res32 = mul <32 x i64> %xa32, %xb32
-; AVX1-NEXT:  Cost Model: Found costs of RThru:32 CodeSize:1 Lat:1 SizeLat:1 for: %res64 = mul <64 x i64> %xa64, %xb64
+; AVX1-NEXT:  Cost Model: Found costs of RThru:3 CodeSize:5 Lat:5 SizeLat:6 for: %res4 = mul <4 x i64> %xa4, %xb4
+; AVX1-NEXT:  Cost Model: Found costs of RThru:6 CodeSize:10 Lat:10 SizeLat:12 for: %res8 = mul <8 x i64> %xa8, %xb8
+; AVX1-NEXT:  Cost Model: Found costs of RThru:12 CodeSize:20 Lat:20 SizeLat:24 for: %res16 = mul <16 x i64> %xa16, %xb16
+; AVX1-NEXT:  Cost Model: Found costs of RThru:24 CodeSize:40 Lat:40 SizeLat:48 for: %res32 = mul <32 x i64> %xa32, %xb32
+; AVX1-NEXT:  Cost Model: Found costs of RThru:48 CodeSize:80 Lat:80 SizeLat:96 for: %res64 = mul <64 x i64> %xa64, %xb64
 ; AVX1-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret void
 ;
 ; AVX2-LABEL: 'mul_zext_vXi32'


        

