[llvm] 05f56f1 - [X86] Fix VPPERM load folding latency

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 9 05:58:03 PDT 2022


Author: Simon Pilgrim
Date: 2022-09-09T13:57:39+01:00
New Revision: 05f56f10ed84dc56bb4fb686c4630aeb4dcbca0b

URL: https://github.com/llvm/llvm-project/commit/05f56f10ed84dc56bb4fb686c4630aeb4dcbca0b
DIFF: https://github.com/llvm/llvm-project/commit/05f56f10ed84dc56bb4fb686c4630aeb4dcbca0b.diff

LOG: [X86] Fix VPPERM load folding latency

Noticed while investigating BITREVERSE cost numbers with the D103695 script - VPPERM folded loads were using the WriteVarShuffleX defaults and were missing an override like the one the VPPERM reg-reg variants have

Added: 
    

Modified: 
    llvm/lib/Target/X86/X86ScheduleBdVer2.td
    llvm/test/tools/llvm-mca/X86/BdVer2/resources-xop.s

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/X86/X86ScheduleBdVer2.td b/llvm/lib/Target/X86/X86ScheduleBdVer2.td
index cb75c3660728d..ef2b4263bff4f 100644
--- a/llvm/lib/Target/X86/X86ScheduleBdVer2.td
+++ b/llvm/lib/Target/X86/X86ScheduleBdVer2.td
@@ -1186,6 +1186,12 @@ def PdWriteVPPERM : SchedWriteRes<[PdFPU01, PdFPMAL]> {
 }
 def : InstRW<[PdWriteVPPERM], (instrs VPPERMrrr, VPPERMrrr_REV)>;
 
+def PdWriteVPPERMLd : SchedWriteRes<[PdFPU01, PdFPMAL, PdLoad]> {
+  let Latency = 7;
+  let ResourceCycles = [1, 3, 3];
+}
+def : InstRW<[PdWriteVPPERMLd], (instrs VPPERMrrm, VPPERMrmr)>;
+
 defm : PdWriteResXMMPair<WriteBlend,         [PdFPU01, PdFPMAL], 2>;
 defm : X86WriteResPairUnsupported<WriteBlendY>;
 defm : X86WriteResPairUnsupported<WriteBlendZ>;

diff  --git a/llvm/test/tools/llvm-mca/X86/BdVer2/resources-xop.s b/llvm/test/tools/llvm-mca/X86/BdVer2/resources-xop.s
index 3effa5518ea1d..b4650841704e7 100644
--- a/llvm/test/tools/llvm-mca/X86/BdVer2/resources-xop.s
+++ b/llvm/test/tools/llvm-mca/X86/BdVer2/resources-xop.s
@@ -322,8 +322,8 @@ vpshlw %xmm0, (%rax), %xmm3
 # CHECK-NEXT:  1      4     1.00                        vpmadcswd	%xmm0, %xmm1, %xmm2, %xmm3
 # CHECK-NEXT:  1      9     1.50    *                   vpmadcswd	%xmm0, (%rax), %xmm1, %xmm3
 # CHECK-NEXT:  1      2     1.50                        vpperm	%xmm0, %xmm1, %xmm2, %xmm3
-# CHECK-NEXT:  1      8     1.50    *                   vpperm	(%rax), %xmm0, %xmm1, %xmm3
-# CHECK-NEXT:  1      8     1.50    *                   vpperm	%xmm0, (%rax), %xmm1, %xmm3
+# CHECK-NEXT:  1      7     1.50    *                   vpperm	(%rax), %xmm0, %xmm1, %xmm3
+# CHECK-NEXT:  1      7     1.50    *                   vpperm	%xmm0, (%rax), %xmm1, %xmm3
 # CHECK-NEXT:  1      3     1.00                        vprotb	%xmm0, %xmm1, %xmm3
 # CHECK-NEXT:  1      8     1.50    *                   vprotb	(%rax), %xmm0, %xmm3
 # CHECK-NEXT:  1      8     1.50    *                   vprotb	%xmm0, (%rax), %xmm3
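
As a rough cross-check of the test change above, the sketch below compares the register-form and load-folded latencies reported in the llvm-mca CHECK lines. The figures are copied from the hunk. The dictionary and function names are illustrative only, and the idea that a folded load should add the same number of cycles to each instruction is an inference from the neighbouring rows, not something the commit states.

```python
# Latencies (in cycles) copied from the llvm-mca CHECK lines in the hunk above.
REG_LATENCY = {"vpmadcswd": 4, "vpperm": 2, "vprotb": 3}
FOLDED_BEFORE = {"vpmadcswd": 9, "vpperm": 8, "vprotb": 8}  # before this commit
FOLDED_AFTER = {"vpmadcswd": 9, "vpperm": 7, "vprotb": 8}   # after this commit


def load_delta(folded, reg=REG_LATENCY):
    """Return the extra cycles each load-folded form adds over its register form."""
    return {name: folded[name] - reg[name] for name in reg}


before = load_delta(FOLDED_BEFORE)
after = load_delta(FOLDED_AFTER)

# Before the fix vpperm is the odd one out; afterwards all three deltas agree.
print("before:", before)
print("after: ", after)
```

With the old numbers VPPERM is the only one of the three whose folded form adds a different amount than its neighbours; with the new override the three deltas agree.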


        

