[llvm] r343452 - [X86][BtVer2] Teach how to identify zero-idiom VPERM2F128rr instructions.

Andrea Di Biagio via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 1 03:35:13 PDT 2018


Author: adibiagio
Date: Mon Oct  1 03:35:13 2018
New Revision: 343452

URL: http://llvm.org/viewvc/llvm-project?rev=343452&view=rev
Log:
[X86][BtVer2] Teach how to identify zero-idiom VPERM2F128rr instructions.

This patch adds another variant class to identify zero-idiom VPERM2F128rr
instructions.

On Jaguar, a VPERM2F128 with bits 3 and 7 of the mask set is a zero-idiom.

Differential Revision: https://reviews.llvm.org/D52663
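
As an illustration only (not part of the patch), the sketch below models how the VPERM2F128 immediate is decoded, to show why a mask with bits 3 and 7 set always produces zero. The names `vperm2f128` and `is_zeroing_mask` are hypothetical helpers, and each 128-bit lane is represented as a plain Python integer. Note that the predicate added by this patch is stricter than the helper here: it requires the immediate to equal 0x88 ($136 in the test) and both source operands to be the same register.

```python
def vperm2f128(src1, src2, imm8):
    """Model VPERM2F128 on (low, high) pairs of 128-bit lanes.

    Hypothetical helper for illustration; lanes are plain integers.
    """
    # Lane candidates indexed by the 2-bit selector:
    # 0 = src1 low, 1 = src1 high, 2 = src2 low, 3 = src2 high.
    lanes = (src1[0], src1[1], src2[0], src2[1])

    def select(ctrl):
        # Bit 3 of each 4-bit control field forces the lane to zero.
        if ctrl & 0x8:
            return 0
        return lanes[ctrl & 0x3]

    low = select(imm8 & 0xF)           # imm8[3:0] controls the low lane
    high = select((imm8 >> 4) & 0xF)   # imm8[7:4] controls the high lane
    return (low, high)


def is_zeroing_mask(imm8):
    # True when both zeroing bits (3 and 7) are set, so the result
    # does not depend on either source register.
    return (imm8 & 0x88) == 0x88


if __name__ == "__main__":
    a, b = (0x11, 0x22), (0x33, 0x44)
    print(vperm2f128(a, b, 0x88))   # both lanes zeroed
    print(vperm2f128(a, b, 0x20))   # low from src1 low, high from src2 low
    print(is_zeroing_mask(0x88))
```

Because the result is independent of the inputs when both zeroing bits are set, the instruction carries no true dependency on its source registers, which is what lets the scheduling model treat it as dependency-breaking.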

Modified:
    llvm/trunk/lib/Target/X86/X86SchedPredicates.td
    llvm/trunk/lib/Target/X86/X86ScheduleBtVer2.td
    llvm/trunk/test/tools/llvm-mca/X86/BtVer2/zero-idioms-avx-256.s

Modified: llvm/trunk/lib/Target/X86/X86SchedPredicates.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86SchedPredicates.td?rev=343452&r1=343451&r2=343452&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86SchedPredicates.td (original)
+++ llvm/trunk/lib/Target/X86/X86SchedPredicates.td Mon Oct  1 03:35:13 2018
@@ -19,6 +19,13 @@
 // different zero-idioms.
 def ZeroIdiomPredicate : CheckSameRegOperand<1, 2>;
 
+// A predicate used to identify VPERM that have bits 3 and 7 of their mask set.
+// On some processors, these VPERM instructions are zero-idioms.
+def ZeroIdiomVPERMPredicate : CheckAll<[
+  ZeroIdiomPredicate,
+  CheckImmOperand<3, 0x88>
+]>;
+
 // A predicate used to check if a LEA instruction uses all three source
 // operands: base, index, and offset.
 def IsThreeOperandsLEAPredicate: CheckAll<[

Modified: llvm/trunk/lib/Target/X86/X86ScheduleBtVer2.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ScheduleBtVer2.td?rev=343452&r1=343451&r2=343452&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ScheduleBtVer2.td (original)
+++ llvm/trunk/lib/Target/X86/X86ScheduleBtVer2.td Mon Oct  1 03:35:13 2018
@@ -688,6 +688,12 @@ def : InstRW<[JWriteVZeroIdiomALUX], (in
                                              PCMPGTQrr, VPCMPGTQrr,
                                              PCMPGTWrr, VPCMPGTWrr)>;
 
+def JWriteVPERM2F128 : SchedWriteVariant<[
+  SchedVar<MCSchedPredicate<ZeroIdiomVPERMPredicate>, [JWriteZeroIdiomYmm]>,
+  SchedVar<NoSchedPred,                               [WriteFShuffle256]>
+]>;
+def : InstRW<[JWriteVPERM2F128], (instrs VPERM2F128rr)>;
+
 // This write is used for slow LEA instructions.
 def JWrite3OpsLEA : SchedWriteRes<[JALU1, JSAGU]> {
   let Latency = 2;
@@ -762,7 +768,9 @@ def : IsZeroIdiomFunction<[
 
     // ymm variants.
     VXORPSYrr, VXORPDYrr, VANDNPSYrr, VANDNPDYrr
-  ], ZeroIdiomPredicate>
+  ], ZeroIdiomPredicate>,
+
+  DepBreakingClass<[ VPERM2F128rr ], ZeroIdiomVPERMPredicate>
 ]>;
 
 def : IsDepBreakingFunction<[

Modified: llvm/trunk/test/tools/llvm-mca/X86/BtVer2/zero-idioms-avx-256.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/BtVer2/zero-idioms-avx-256.s?rev=343452&r1=343451&r2=343452&view=diff
==============================================================================
--- llvm/trunk/test/tools/llvm-mca/X86/BtVer2/zero-idioms-avx-256.s (original)
+++ llvm/trunk/test/tools/llvm-mca/X86/BtVer2/zero-idioms-avx-256.s Mon Oct  1 03:35:13 2018
@@ -330,12 +330,12 @@ vaddps  %ymm1, %ymm1, %ymm0
 
 # CHECK:      Iterations:        100
 # CHECK-NEXT: Instructions:      200
-# CHECK-NEXT: Total Cycles:      403
+# CHECK-NEXT: Total Cycles:      205
 # CHECK-NEXT: Total uOps:        400
 
 # CHECK:      Dispatch Width:    2
-# CHECK-NEXT: uOps Per Cycle:    0.99
-# CHECK-NEXT: IPC:               0.50
+# CHECK-NEXT: uOps Per Cycle:    1.95
+# CHECK-NEXT: IPC:               0.98
 # CHECK-NEXT: Block RThroughput: 2.0
 
 # CHECK:      Instruction Info:
@@ -347,7 +347,7 @@ vaddps  %ymm1, %ymm1, %ymm0
 # CHECK-NEXT: [6]: HasSideEffects (U)
 
 # CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
-# CHECK-NEXT:  2      1     1.00                        vperm2f128	$136, %ymm0, %ymm0, %ymm1
+# CHECK-NEXT:  2      1     0.50                        vperm2f128	$136, %ymm0, %ymm0, %ymm1
 # CHECK-NEXT:  2      3     2.00                        vaddps	%ymm1, %ymm1, %ymm0
 
 # CHECK:      Resources:
@@ -368,23 +368,23 @@ vaddps  %ymm1, %ymm1, %ymm0
 
 # CHECK:      Resource pressure per iteration:
 # CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    [10]   [11]   [12]   [13]
-# CHECK-NEXT:  -      -      -     2.00   2.00   2.00   2.00    -      -      -      -      -      -      -
+# CHECK-NEXT:  -      -      -     2.00   1.00   2.00   1.00    -      -      -      -      -      -      -
 
 # CHECK:      Resource pressure by instruction:
 # CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    [10]   [11]   [12]   [13]   Instructions:
-# CHECK-NEXT:  -      -      -      -     2.00    -     2.00    -      -      -      -      -      -      -     vperm2f128	$136, %ymm0, %ymm0, %ymm1
+# CHECK-NEXT:  -      -      -      -     1.00    -     1.00    -      -      -      -      -      -      -     vperm2f128	$136, %ymm0, %ymm0, %ymm1
 # CHECK-NEXT:  -      -      -     2.00    -     2.00    -      -      -      -      -      -      -      -     vaddps	%ymm1, %ymm1, %ymm0
 
 # CHECK:      Timeline view:
-# CHECK-NEXT:                     01234
+# CHECK-NEXT:                     0
 # CHECK-NEXT: Index     0123456789
 
-# CHECK:      [0,0]     DeER .    .   .   vperm2f128	$136, %ymm0, %ymm0, %ymm1
-# CHECK-NEXT: [0,1]     .DeeeER   .   .   vaddps	%ymm1, %ymm1, %ymm0
-# CHECK-NEXT: [1,0]     . D==eER  .   .   vperm2f128	$136, %ymm0, %ymm0, %ymm1
-# CHECK-NEXT: [1,1]     .  D==eeeER   .   vaddps	%ymm1, %ymm1, %ymm0
-# CHECK-NEXT: [2,0]     .   D====eER  .   vperm2f128	$136, %ymm0, %ymm0, %ymm1
-# CHECK-NEXT: [2,1]     .    D====eeeER   vaddps	%ymm1, %ymm1, %ymm0
+# CHECK:      [0,0]     DeER .    .   vperm2f128	$136, %ymm0, %ymm0, %ymm1
+# CHECK-NEXT: [0,1]     .DeeeER   .   vaddps	%ymm1, %ymm1, %ymm0
+# CHECK-NEXT: [1,0]     . DeE-R   .   vperm2f128	$136, %ymm0, %ymm0, %ymm1
+# CHECK-NEXT: [1,1]     .  DeeeER .   vaddps	%ymm1, %ymm1, %ymm0
+# CHECK-NEXT: [2,0]     .   DeE-R .   vperm2f128	$136, %ymm0, %ymm0, %ymm1
+# CHECK-NEXT: [2,1]     .    DeeeER   vaddps	%ymm1, %ymm1, %ymm0
 
 # CHECK:      Average Wait times (based on the timeline view):
 # CHECK-NEXT: [0]: Executions
@@ -393,5 +393,5 @@ vaddps  %ymm1, %ymm1, %ymm0
 # CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
 
 # CHECK:            [0]    [1]    [2]    [3]
-# CHECK-NEXT: 0.     3     3.0    0.3    0.0       vperm2f128	$136, %ymm0, %ymm0, %ymm1
-# CHECK-NEXT: 1.     3     3.0    0.0    0.0       vaddps	%ymm1, %ymm1, %ymm0
+# CHECK-NEXT: 0.     3     1.0    1.0    0.7       vperm2f128	$136, %ymm0, %ymm0, %ymm1
+# CHECK-NEXT: 1.     3     1.0    0.0    0.0       vaddps	%ymm1, %ymm1, %ymm0
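
For reference, the updated summary figures in the test follow directly from the counts in the same report. The short sketch below recomputes them, assuming (as the report labels suggest) that IPC is instructions divided by total cycles and uOps Per Cycle is total uOps divided by total cycles. The function name `summary` is a hypothetical helper, not part of llvm-mca.

```python
def summary(instructions, uops, cycles):
    """Return (uOps per cycle, IPC), each rounded to two decimals."""
    return (round(uops / cycles, 2), round(instructions / cycles, 2))


if __name__ == "__main__":
    # Counts taken from the CHECK lines in the diff above.
    before = summary(instructions=200, uops=400, cycles=403)
    after = summary(instructions=200, uops=400, cycles=205)
    print("before:", before)
    print("after: ", after)
```

With the dependency on %ymm0 broken, each iteration no longer waits for the previous vaddps, so the total cycle count drops from 403 to 205 for the same 200 instructions.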



