[llvm] r334945 - [llvm-mca] Add tests for XOP and AVX512 instructions that implicitly clear the upper portion of a super-register.

Andrea Di Biagio via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 18 07:00:32 PDT 2018


Author: adibiagio
Date: Mon Jun 18 07:00:30 2018
New Revision: 334945

URL: http://llvm.org/viewvc/llvm-project?rev=334945&view=rev
Log:
[llvm-mca] Add tests for XOP and AVX512 instructions that implicitly clear the upper portion of a super-register.

When the destination register of an XOP instruction is an XMM register, bits
[255:128] of the corresponding YMM register are cleared.

When the destination register of an EVEX-encoded instruction is an XMM/YMM
register, the upper bits of the corresponding ZMM register are cleared.
On processors that feature AVX512, a write to an XMM register always clears the
upper portion of the corresponding ZMM register if the instruction is VEX or
EVEX encoded.
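
The write semantics described above can be illustrated with a small model.
This is only a sketch, not code from llvm-mca: the function name `write_reg`
and the encoding labels are made up for the illustration, and each vector
register is represented as a single 512-bit Python integer whose low 128/256
bits are the XMM/YMM views. The legacy (non-VEX) SSE case is included for
contrast, since such writes leave the upper bits untouched.

```python
# Toy model (illustrative only): one 512-bit integer per vector register.
WIDTH = {"xmm": 128, "ymm": 256, "zmm": 512}
ZMM_MASK = (1 << 512) - 1


def write_reg(zmm_value, view, new_value, encoding):
    """Return the 512-bit register value after writing `new_value` to `view`.

    `encoding` is one of "legacy", "vex", "xop", "evex" (labels chosen for
    this sketch).
    """
    low_mask = (1 << WIDTH[view]) - 1
    new_value &= low_mask
    if encoding == "legacy":
        # Legacy SSE writes preserve the bits above the destination view.
        return (zmm_value & ~low_mask & ZMM_MASK) | new_value
    # VEX/XOP/EVEX writes zero every bit above the destination view.
    return new_value


all_ones = ZMM_MASK
after_vex = write_reg(all_ones, "xmm", 0x1234, "vex")
after_sse = write_reg(all_ones, "xmm", 0x1234, "legacy")
```

In this model `after_vex` has no bits set above bit 127, while `after_sse`
still carries the old upper 384 bits.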

These new tests show some interesting cases which aren't correctly analyzed by
llvm-mca, because it does not yet know about the implicit update of the
super-registers. That is addressed by D48225.
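
To make the kind of case concrete, here is a rough sketch of the dependency
question raised by the first three instructions of avx512-super-registers-1.s.
It is a toy model under stated assumptions, not a description of how llvm-mca
or D48225 is implemented: `earliest_issue` and `super_of` are hypothetical
helpers, both writers are assumed to start executing at cycle 0, and the
latencies (5 for vmulps, 3 for vaddps) are taken from the Instruction Info
table in the test.

```python
# Toy dependency model (illustrative only, not llvm-mca code).
LATENCY = {"vmulps": 5, "vaddps": 3}


def super_of(reg):
    """Map a vector register name to its ZMM super-register, e.g. xmm2 -> zmm2."""
    return "zmm" + reg[3:]


def earliest_issue(prior_writes, read_reg, implicit_clear):
    """Cycle at which a read of `read_reg` has all of its producers complete.

    prior_writes: list of (opcode, destination) in program order.
    implicit_clear: if True, a write is treated as redefining the whole
    super-register, so older writers to it no longer matter.
    """
    pending = {}  # (super-register, written view) -> cycle the result is ready
    for opcode, dest in prior_writes:
        sup = super_of(dest)
        if implicit_clear:
            pending = {k: v for k, v in pending.items() if k[0] != sup}
        pending[(sup, dest)] = LATENCY[opcode]
    sup = super_of(read_reg)
    return max((c for (s, _), c in pending.items() if s == sup), default=0)


writes = [("vmulps", "zmm2"), ("vaddps", "xmm2")]
without_clear = earliest_issue(writes, "ymm2", implicit_clear=False)
with_clear = earliest_issue(writes, "ymm2", implicit_clear=True)
```

Without the implicit clear, the read of %ymm2 waits for the 5-cycle vmulps
write to %zmm2, which is consistent with the five `=` characters in the [0,2]
timeline row of that test. With the implicit clear, this model only waits for
the 3-cycle vaddps.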

Added:
    llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-1.s
    llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-2.s
    llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-3.s
    llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-1.s
    llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-2.s

Added: llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-1.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-1.s?rev=334945&view=auto
==============================================================================
--- llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-1.s (added)
+++ llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-1.s Mon Jun 18 07:00:30 2018
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+  vmulps  %zmm0, %zmm1, %zmm2
+  vaddps  %xmm1, %xmm1, %xmm2
+  vmulps  %ymm2, %ymm3, %ymm4
+  vaddps  %xmm4, %xmm5, %xmm6
+  vmulps  %xmm6, %xmm3, %xmm4
+  vaddps  %xmm4, %xmm5, %xmm0
+
+# CHECK:      Iterations:        100
+# CHECK-NEXT: Instructions:      600
+# CHECK-NEXT: Total Cycles:      2103
+# CHECK-NEXT: Dispatch Width:    4
+# CHECK-NEXT: IPC:               0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK:      Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      5     1.00                        vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT:  1      3     1.00                        vaddps	%xmm1, %xmm1, %xmm2
+# CHECK-NEXT:  1      5     1.00                        vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT:  1      5     1.00                        vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%xmm4, %xmm5, %xmm0
+
+# CHECK:      Resources:
+# CHECK-NEXT: [0]   - SBDivider
+# CHECK-NEXT: [1]   - SBFPDivider
+# CHECK-NEXT: [2]   - SBPort0
+# CHECK-NEXT: [3]   - SBPort1
+# CHECK-NEXT: [4]   - SBPort4
+# CHECK-NEXT: [5]   - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK:      Resource pressure per iteration:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]
+# CHECK-NEXT:  -      -     3.00   3.00    -      -      -      -
+
+# CHECK:      Resource pressure by instruction:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]  Instructions:
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%xmm1, %xmm1, %xmm2
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%xmm4, %xmm5, %xmm0
+
+# CHECK:      Timeline view:
+# CHECK-NEXT:                     0123456789          0123456789
+# CHECK-NEXT: Index     0123456789          0123456789          01234
+
+# CHECK:      [0,0]     DeeeeeER  .    .    .    .    .    .    .   .   vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [0,1]     DeeeE--R  .    .    .    .    .    .    .   .   vaddps	%xmm1, %xmm1, %xmm2
+# CHECK-NEXT: [0,2]     D=====eeeeeER  .    .    .    .    .    .   .   vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [0,3]     D==========eeeER    .    .    .    .    .   .   vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT: [0,4]     .D============eeeeeER    .    .    .    .   .   vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT: [0,5]     .D=================eeeER .    .    .    .   .   vaddps	%xmm4, %xmm5, %xmm0
+# CHECK-NEXT: [1,0]     .D====================eeeeeER .    .    .   .   vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [1,1]     .DeeeE----------------------R .    .    .   .   vaddps	%xmm1, %xmm1, %xmm2
+# CHECK-NEXT: [1,2]     . D========================eeeeeER .    .   .   vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [1,3]     . D=============================eeeER   .   .   vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT: [1,4]     . D================================eeeeeER  .   vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT: [1,5]     . D=====================================eeeER   vaddps	%xmm4, %xmm5, %xmm0
+
+# CHECK:      Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK:            [0]    [1]    [2]    [3]
+# CHECK-NEXT: 0.     2     11.0   0.5    0.0       vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1.     2     1.0    1.0    12.0      vaddps	%xmm1, %xmm1, %xmm2
+# CHECK-NEXT: 2.     2     15.5   0.0    0.0       vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 3.     2     20.5   0.0    0.0       vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT: 4.     2     23.0   0.0    0.0       vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT: 5.     2     28.0   0.0    0.0       vaddps	%xmm4, %xmm5, %xmm0

Added: llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-2.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-2.s?rev=334945&view=auto
==============================================================================
--- llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-2.s (added)
+++ llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-2.s Mon Jun 18 07:00:30 2018
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+  vmulps  %zmm0, %zmm1, %zmm2
+  vaddps  %ymm1, %ymm1, %ymm2
+  vmulps  %zmm2, %zmm3, %zmm4
+  vaddps  %xmm4, %xmm5, %xmm6
+  vmulps  %xmm6, %xmm3, %xmm4
+  vaddps  %xmm4, %xmm5, %xmm0
+
+# CHECK:      Iterations:        100
+# CHECK-NEXT: Instructions:      600
+# CHECK-NEXT: Total Cycles:      2103
+# CHECK-NEXT: Dispatch Width:    4
+# CHECK-NEXT: IPC:               0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK:      Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      5     1.00                        vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT:  1      3     1.00                        vaddps	%ymm1, %ymm1, %ymm2
+# CHECK-NEXT:  1      5     1.00                        vmulps	%zmm2, %zmm3, %zmm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT:  1      5     1.00                        vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%xmm4, %xmm5, %xmm0
+
+# CHECK:      Resources:
+# CHECK-NEXT: [0]   - SBDivider
+# CHECK-NEXT: [1]   - SBFPDivider
+# CHECK-NEXT: [2]   - SBPort0
+# CHECK-NEXT: [3]   - SBPort1
+# CHECK-NEXT: [4]   - SBPort4
+# CHECK-NEXT: [5]   - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK:      Resource pressure per iteration:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]
+# CHECK-NEXT:  -      -     3.00   3.00    -      -      -      -
+
+# CHECK:      Resource pressure by instruction:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]  Instructions:
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%ymm1, %ymm1, %ymm2
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%zmm2, %zmm3, %zmm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%xmm4, %xmm5, %xmm0
+
+# CHECK:      Timeline view:
+# CHECK-NEXT:                     0123456789          0123456789
+# CHECK-NEXT: Index     0123456789          0123456789          01234
+
+# CHECK:      [0,0]     DeeeeeER  .    .    .    .    .    .    .   .   vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [0,1]     DeeeE--R  .    .    .    .    .    .    .   .   vaddps	%ymm1, %ymm1, %ymm2
+# CHECK-NEXT: [0,2]     D=====eeeeeER  .    .    .    .    .    .   .   vmulps	%zmm2, %zmm3, %zmm4
+# CHECK-NEXT: [0,3]     D==========eeeER    .    .    .    .    .   .   vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT: [0,4]     .D============eeeeeER    .    .    .    .   .   vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT: [0,5]     .D=================eeeER .    .    .    .   .   vaddps	%xmm4, %xmm5, %xmm0
+# CHECK-NEXT: [1,0]     .D====================eeeeeER .    .    .   .   vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [1,1]     .DeeeE----------------------R .    .    .   .   vaddps	%ymm1, %ymm1, %ymm2
+# CHECK-NEXT: [1,2]     . D========================eeeeeER .    .   .   vmulps	%zmm2, %zmm3, %zmm4
+# CHECK-NEXT: [1,3]     . D=============================eeeER   .   .   vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT: [1,4]     . D================================eeeeeER  .   vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT: [1,5]     . D=====================================eeeER   vaddps	%xmm4, %xmm5, %xmm0
+
+# CHECK:      Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK:            [0]    [1]    [2]    [3]
+# CHECK-NEXT: 0.     2     11.0   0.5    0.0       vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1.     2     1.0    1.0    12.0      vaddps	%ymm1, %ymm1, %ymm2
+# CHECK-NEXT: 2.     2     15.5   0.0    0.0       vmulps	%zmm2, %zmm3, %zmm4
+# CHECK-NEXT: 3.     2     20.5   0.0    0.0       vaddps	%xmm4, %xmm5, %xmm6
+# CHECK-NEXT: 4.     2     23.0   0.0    0.0       vmulps	%xmm6, %xmm3, %xmm4
+# CHECK-NEXT: 5.     2     28.0   0.0    0.0       vaddps	%xmm4, %xmm5, %xmm0

Added: llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-3.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-3.s?rev=334945&view=auto
==============================================================================
--- llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-3.s (added)
+++ llvm/trunk/test/tools/llvm-mca/X86/Generic/avx512-super-registers-3.s Mon Jun 18 07:00:30 2018
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+  vmulps  %zmm0, %zmm1, %zmm2
+  vaddps  %xmm16, %xmm17, %xmm2
+  vmulps  %ymm2, %ymm3, %ymm4
+  vaddps  %xmm4, %xmm18, %xmm6
+  vmulps  %xmm6, %xmm19, %xmm4
+  vaddps  %xmm4, %xmm20, %xmm0
+
+# CHECK:      Iterations:        100
+# CHECK-NEXT: Instructions:      600
+# CHECK-NEXT: Total Cycles:      2103
+# CHECK-NEXT: Dispatch Width:    4
+# CHECK-NEXT: IPC:               0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK:      Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      5     1.00                        vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT:  1      3     1.00                        vaddps	%xmm16, %xmm17, %xmm2
+# CHECK-NEXT:  1      5     1.00                        vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%xmm4, %xmm18, %xmm6
+# CHECK-NEXT:  1      5     1.00                        vmulps	%xmm6, %xmm19, %xmm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%xmm4, %xmm20, %xmm0
+
+# CHECK:      Resources:
+# CHECK-NEXT: [0]   - SBDivider
+# CHECK-NEXT: [1]   - SBFPDivider
+# CHECK-NEXT: [2]   - SBPort0
+# CHECK-NEXT: [3]   - SBPort1
+# CHECK-NEXT: [4]   - SBPort4
+# CHECK-NEXT: [5]   - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK:      Resource pressure per iteration:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]
+# CHECK-NEXT:  -      -     3.00   3.00    -      -      -      -
+
+# CHECK:      Resource pressure by instruction:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]  Instructions:
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%xmm16, %xmm17, %xmm2
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%xmm4, %xmm18, %xmm6
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%xmm6, %xmm19, %xmm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%xmm4, %xmm20, %xmm0
+
+# CHECK:      Timeline view:
+# CHECK-NEXT:                     0123456789          0123456789
+# CHECK-NEXT: Index     0123456789          0123456789          01234
+
+# CHECK:      [0,0]     DeeeeeER  .    .    .    .    .    .    .   .   vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [0,1]     DeeeE--R  .    .    .    .    .    .    .   .   vaddps	%xmm16, %xmm17, %xmm2
+# CHECK-NEXT: [0,2]     D=====eeeeeER  .    .    .    .    .    .   .   vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [0,3]     D==========eeeER    .    .    .    .    .   .   vaddps	%xmm4, %xmm18, %xmm6
+# CHECK-NEXT: [0,4]     .D============eeeeeER    .    .    .    .   .   vmulps	%xmm6, %xmm19, %xmm4
+# CHECK-NEXT: [0,5]     .D=================eeeER .    .    .    .   .   vaddps	%xmm4, %xmm20, %xmm0
+# CHECK-NEXT: [1,0]     .D====================eeeeeER .    .    .   .   vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [1,1]     .DeeeE----------------------R .    .    .   .   vaddps	%xmm16, %xmm17, %xmm2
+# CHECK-NEXT: [1,2]     . D========================eeeeeER .    .   .   vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [1,3]     . D=============================eeeER   .   .   vaddps	%xmm4, %xmm18, %xmm6
+# CHECK-NEXT: [1,4]     . D================================eeeeeER  .   vmulps	%xmm6, %xmm19, %xmm4
+# CHECK-NEXT: [1,5]     . D=====================================eeeER   vaddps	%xmm4, %xmm20, %xmm0
+
+# CHECK:      Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK:            [0]    [1]    [2]    [3]
+# CHECK-NEXT: 0.     2     11.0   0.5    0.0       vmulps	%zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1.     2     1.0    1.0    12.0      vaddps	%xmm16, %xmm17, %xmm2
+# CHECK-NEXT: 2.     2     15.5   0.0    0.0       vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 3.     2     20.5   0.0    0.0       vaddps	%xmm4, %xmm18, %xmm6
+# CHECK-NEXT: 4.     2     23.0   0.0    0.0       vmulps	%xmm6, %xmm19, %xmm4
+# CHECK-NEXT: 5.     2     28.0   0.0    0.0       vaddps	%xmm4, %xmm20, %xmm0

Added: llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-1.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-1.s?rev=334945&view=auto
==============================================================================
--- llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-1.s (added)
+++ llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-1.s Mon Jun 18 07:00:30 2018
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+  vmulps  %ymm0, %ymm1, %ymm2
+  vfrczpd %xmm1, %xmm2
+  vmulps  %ymm2, %ymm3, %ymm4
+  vaddps  %ymm4, %ymm5, %ymm6
+  vmulps  %ymm6, %ymm3, %ymm4
+  vaddps  %ymm4, %ymm5, %ymm0
+
+# CHECK:      Iterations:        100
+# CHECK-NEXT: Instructions:      600
+# CHECK-NEXT: Total Cycles:      2103
+# CHECK-NEXT: Dispatch Width:    4
+# CHECK-NEXT: IPC:               0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK:      Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      5     1.00                        vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT:  1      3     1.00                        vfrczpd	%xmm1, %xmm2
+# CHECK-NEXT:  1      5     1.00                        vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT:  1      5     1.00                        vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%ymm4, %ymm5, %ymm0
+
+# CHECK:      Resources:
+# CHECK-NEXT: [0]   - SBDivider
+# CHECK-NEXT: [1]   - SBFPDivider
+# CHECK-NEXT: [2]   - SBPort0
+# CHECK-NEXT: [3]   - SBPort1
+# CHECK-NEXT: [4]   - SBPort4
+# CHECK-NEXT: [5]   - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK:      Resource pressure per iteration:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]
+# CHECK-NEXT:  -      -     3.00   3.00    -      -      -      -
+
+# CHECK:      Resource pressure by instruction:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]  Instructions:
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vfrczpd	%xmm1, %xmm2
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%ymm4, %ymm5, %ymm0
+
+# CHECK:      Timeline view:
+# CHECK-NEXT:                     0123456789          0123456789
+# CHECK-NEXT: Index     0123456789          0123456789          01234
+
+# CHECK:      [0,0]     DeeeeeER  .    .    .    .    .    .    .   .   vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT: [0,1]     DeeeE--R  .    .    .    .    .    .    .   .   vfrczpd	%xmm1, %xmm2
+# CHECK-NEXT: [0,2]     D=====eeeeeER  .    .    .    .    .    .   .   vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [0,3]     D==========eeeER    .    .    .    .    .   .   vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT: [0,4]     .D============eeeeeER    .    .    .    .   .   vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT: [0,5]     .D=================eeeER .    .    .    .   .   vaddps	%ymm4, %ymm5, %ymm0
+# CHECK-NEXT: [1,0]     .D====================eeeeeER .    .    .   .   vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT: [1,1]     .DeeeE----------------------R .    .    .   .   vfrczpd	%xmm1, %xmm2
+# CHECK-NEXT: [1,2]     . D========================eeeeeER .    .   .   vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [1,3]     . D=============================eeeER   .   .   vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT: [1,4]     . D================================eeeeeER  .   vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT: [1,5]     . D=====================================eeeER   vaddps	%ymm4, %ymm5, %ymm0
+
+# CHECK:      Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK:            [0]    [1]    [2]    [3]
+# CHECK-NEXT: 0.     2     11.0   0.5    0.0       vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT: 1.     2     1.0    1.0    12.0      vfrczpd	%xmm1, %xmm2
+# CHECK-NEXT: 2.     2     15.5   0.0    0.0       vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 3.     2     20.5   0.0    0.0       vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT: 4.     2     23.0   0.0    0.0       vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT: 5.     2     28.0   0.0    0.0       vaddps	%ymm4, %ymm5, %ymm0

Added: llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-2.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-2.s?rev=334945&view=auto
==============================================================================
--- llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-2.s (added)
+++ llvm/trunk/test/tools/llvm-mca/X86/Generic/xop-super-registers-2.s Mon Jun 18 07:00:30 2018
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+  vmulps     %ymm0, %ymm1, %ymm2
+  vpermil2pd $16, %xmm3, %xmm5, %xmm1, %xmm2
+  vmulps     %ymm2, %ymm3, %ymm4
+  vaddps     %ymm4, %ymm5, %ymm6
+  vmulps     %ymm6, %ymm3, %ymm4
+  vaddps     %ymm4, %ymm5, %ymm0
+
+# CHECK:      Iterations:        100
+# CHECK-NEXT: Instructions:      600
+# CHECK-NEXT: Total Cycles:      2103
+# CHECK-NEXT: Dispatch Width:    4
+# CHECK-NEXT: IPC:               0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK:      Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      5     1.00                        vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT:  1      1     1.00                        vpermil2pd	$16, %xmm3, %xmm5, %xmm1, %xmm2
+# CHECK-NEXT:  1      5     1.00                        vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT:  1      5     1.00                        vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT:  1      3     1.00                        vaddps	%ymm4, %ymm5, %ymm0
+
+# CHECK:      Resources:
+# CHECK-NEXT: [0]   - SBDivider
+# CHECK-NEXT: [1]   - SBFPDivider
+# CHECK-NEXT: [2]   - SBPort0
+# CHECK-NEXT: [3]   - SBPort1
+# CHECK-NEXT: [4]   - SBPort4
+# CHECK-NEXT: [5]   - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK:      Resource pressure per iteration:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]
+# CHECK-NEXT:  -      -     3.00   2.00    -     1.00    -      -
+
+# CHECK:      Resource pressure by instruction:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6.0]  [6.1]  Instructions:
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT:  -      -      -      -      -     1.00    -      -     vpermil2pd	$16, %xmm3, %xmm5, %xmm1, %xmm2
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT:  -      -     1.00    -      -      -      -      -     vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT:  -      -      -     1.00    -      -      -      -     vaddps	%ymm4, %ymm5, %ymm0
+
+# CHECK:      Timeline view:
+# CHECK-NEXT:                     0123456789          0123456789
+# CHECK-NEXT: Index     0123456789          0123456789          01234
+
+# CHECK:      [0,0]     DeeeeeER  .    .    .    .    .    .    .   .   vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT: [0,1]     DeE----R  .    .    .    .    .    .    .   .   vpermil2pd	$16, %xmm3, %xmm5, %xmm1, %xmm2
+# CHECK-NEXT: [0,2]     D=====eeeeeER  .    .    .    .    .    .   .   vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [0,3]     D==========eeeER    .    .    .    .    .   .   vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT: [0,4]     .D============eeeeeER    .    .    .    .   .   vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT: [0,5]     .D=================eeeER .    .    .    .   .   vaddps	%ymm4, %ymm5, %ymm0
+# CHECK-NEXT: [1,0]     .D====================eeeeeER .    .    .   .   vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT: [1,1]     .DeE------------------------R .    .    .   .   vpermil2pd	$16, %xmm3, %xmm5, %xmm1, %xmm2
+# CHECK-NEXT: [1,2]     . D========================eeeeeER .    .   .   vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [1,3]     . D=============================eeeER   .   .   vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT: [1,4]     . D================================eeeeeER  .   vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT: [1,5]     . D=====================================eeeER   vaddps	%ymm4, %ymm5, %ymm0
+
+# CHECK:      Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK:            [0]    [1]    [2]    [3]
+# CHECK-NEXT: 0.     2     11.0   0.5    0.0       vmulps	%ymm0, %ymm1, %ymm2
+# CHECK-NEXT: 1.     2     1.0    1.0    14.0      vpermil2pd	$16, %xmm3, %xmm5, %xmm1, %xmm2
+# CHECK-NEXT: 2.     2     15.5   0.0    0.0       vmulps	%ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 3.     2     20.5   0.0    0.0       vaddps	%ymm4, %ymm5, %ymm6
+# CHECK-NEXT: 4.     2     23.0   0.0    0.0       vmulps	%ymm6, %ymm3, %ymm4
+# CHECK-NEXT: 5.     2     28.0   0.0    0.0       vaddps	%ymm4, %ymm5, %ymm0
