[llvm] r218932 - [x86] Remove some of the --show-mc-encoding flags from avx512 tests that

Sun Oct 5 00:42:44 PDT 2014

Register allocation is done AFTER instruction selection. That's why I want to be sure that the EVEX form is chosen.
Just think about big code that uses all 32 registers. That's why Intel duplicates all the VEX instructions: to reduce the amount of spill/fill, to allow loop unrolling, and so on.
I know that a VEX instruction is enough if you use only registers 0-15, and it is shorter. In the future we can think about a pass that runs after the register allocator and remaps EVEX instructions to VEX where possible in order to reduce code size.
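
A rough sketch of the legality check such a post-register-allocation pass would need, written in Python for brevity and not as LLVM code. The VecInst record, its fields and the function names are hypothetical. The only rule it encodes is that a VEX encoding cannot address registers 16-31, 512-bit vectors, opmask registers, embedded broadcast or embedded rounding, and that some AVX-512 instructions have no VEX counterpart at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VecInst:
    """Hypothetical post-RA view of one vector instruction (not an LLVM type)."""
    mnemonic: str
    reg_indices: tuple          # vector register numbers, e.g. (0, 0, 0) for %xmm0 x3
    vector_bits: int = 128      # 128, 256 or 512
    uses_mask: bool = False     # {%k1}-style opmask
    uses_broadcast: bool = False  # {1to16}-style embedded broadcast
    uses_rounding: bool = False   # embedded rounding / SAE
    has_vex_form: bool = True   # assumption: caller knows if a VEX twin exists


def can_use_vex(inst: VecInst) -> bool:
    """True if nothing in the instruction requires an EVEX-only feature."""
    if not inst.has_vex_form:
        return False
    if inst.vector_bits > 256:
        return False
    if inst.uses_mask or inst.uses_broadcast or inst.uses_rounding:
        return False
    # VEX can only name vector registers 0-15.
    return all(0 <= r <= 15 for r in inst.reg_indices)


def choose_encoding(inst: VecInst) -> str:
    """Prefer the shorter VEX form whenever it is legal, otherwise keep EVEX."""
    return "VEX" if can_use_vex(inst) else "EVEX"


if __name__ == "__main__":
    print(choose_encoding(VecInst("vsqrtsd", (0, 0, 0))))
    print(choose_encoding(VecInst("vsqrtsd", (16, 0, 0))))
```

Because the check runs on physical registers, it can only be made after allocation, which is the ordering constraint described above.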


-           Elena

From: Adam Nemet [mailto:anemet at apple.com]
Sent: Sunday, October 05, 2014 09:16
To: Demikhovsky, Elena
Cc: Chandler Carruth; Robert Khasanov; LLVM Commits
Subject: Re: [llvm] r218932 - [x86] Remove some of the --show-mc-encoding flags from avx512 tests that


On Oct 4, 2014, at 3:16 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:


I put the encoding checks in to be sure that the AVX-512 instruction is selected instead of the AVX2 one.
This is very important because AVX-512 allows more registers with the same mnemonic.

Sure, but what about my example where we only use the low 16 vector registers?  Is there any reason why we'd want EVEX encoding?  It looks like the assembler and CodeGen disagree: the assembler will pick VEX and CodeGen EVEX.  Intel's XED assembler picks VEX as well.

Adam

-  Elena


-----Original Message-----
From: Adam Nemet [mailto:anemet at apple.com]
Sent: Friday, October 03, 2014 21:57
To: Chandler Carruth; Demikhovsky, Elena; Robert Khasanov
Cc: LLVM Commits
Subject: Re: [llvm] r218932 - [x86] Remove some of the --show-mc-encoding flags from avx512 tests that


On Oct 2, 2014, at 9:30 PM, Adam Nemet <anemet at apple.com> wrote:


On Oct 2, 2014, at 5:36 PM, Chandler Carruth <chandlerc at gmail.com> wrote:


Author: chandlerc
Date: Thu Oct  2 19:36:29 2014
New Revision: 218932

URL: http://llvm.org/viewvc/llvm-project?rev=218932&view=rev
Log:
[x86] Remove some of the --show-mc-encoding flags from avx512 tests
that need to be updated for the new vector shuffle lowering.

After talking to Adam Nemet, Tim Northover, etc., it seems that
testing MC encodings in the same suite as the basic codegen isn't the
right approach. Instead, we're going to want dedicated MC tests for
the encodings. These encodings are starting to get in my way so I
wanted to cut them out early. The total set of instructions that
should have encoding tests added is:

vpaddd
vsqrtss
vsqrtsd
vmovlhps
vmovhlps
valignq
vbroadcastss

So the plan is that I will put together a script that will move all the encoding tests from CodeGen to MC.

Not that simple :(((.  I am pretty confused at this point.  Elena, can you please help with this?

There are some encoding checks here that only test whether we generate the EVEX prefix (0x62).  I guess for AVX512 scalar ops we want to generate the AVX512 encoded version even for default rounding rather than the AVX version.  Correct?
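
For reference, the prefix test those CHECK lines perform can be sketched as below, in Python. The function names are hypothetical. It assumes 64-bit code, where a leading 0x62 byte denotes EVEX and a leading 0xC4 or 0xC5 byte denotes 3-byte or 2-byte VEX, and it ignores the case where a segment or address-size prefix precedes them (such lines are reported as "other").

```python
import re

# Matches the first byte in an llc --show-mc-encoding comment such as
#   "... ## encoding: [0x62,0xf3,0xf5,0xc9,0x03,0xc0,0x02]"
_FIRST_BYTE_RE = re.compile(r"encoding:\s*\[\s*(0x[0-9a-fA-F]{2})")


def first_encoding_byte(asm_line):
    """Return the first encoded byte as an int, or None if no encoding comment."""
    match = _FIRST_BYTE_RE.search(asm_line)
    if match is None:
        return None
    return int(match.group(1), 16)


def prefix_kind(first_byte):
    """Classify an instruction by its first byte (64-bit mode assumed)."""
    if first_byte == 0x62:
        return "EVEX"
    if first_byte in (0xC4, 0xC5):
        return "VEX"
    return "other"


if __name__ == "__main__":
    line = ("valignq $2, %zmm0, %zmm1, %zmm0 {%k1} {z} "
            "## encoding: [0x62,0xf3,0xf5,0xc9,0x03,0xc0,0x02]")
    print(prefix_kind(first_encoding_byte(line)))
```

The sample line is the valignq encoding quoted in the avx512-shuffle.ll diff of this commit.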

Since the mnemonic and operands are the same we need some way to steer this to AVX512.  This seems to be working in codegen (perhaps by chance) but not in the assembler.  E.g.

vsqrtsd %xmm0, %xmm0, %xmm0

is assembled without EVEX with -mcpu=knl.  Is this supposed to work?  The change that added the encoding checks seems to suggest that it should: http://reviews.llvm.org/rL197041

So for now, I will probably only move encoding checks that don't fall under this category until we work out the right approach.

Adam



The concern was that we're not testing the assembler when we check the encoding only through CodeGen.

Adam




Not too many parts of these tests were even using this. =]

Modified:
 llvm/trunk/test/CodeGen/X86/avx512-arith.ll
 llvm/trunk/test/CodeGen/X86/avx512-shuffle.ll
 llvm/trunk/test/CodeGen/X86/avx512-vbroadcast.ll

Modified: llvm/trunk/test/CodeGen/X86/avx512-arith.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-arith.ll?rev=218932&r1=218931&r2=218932&view=diff
==============================================================================

--- llvm/trunk/test/CodeGen/X86/avx512-arith.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-arith.ll Thu Oct  2 19:36:29 2014
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl --show-mc-encoding| FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl | FileCheck %s

; CHECK-LABEL: addpd512
; CHECK: vaddpd
@@ -223,7 +223,7 @@ define <16 x i32> @vpaddd_broadcast_test
}

; CHECK-LABEL: vpaddd_mask_test
-; CHECK: vpaddd {{%zmm[0-9]{1,2}, %zmm[0-9]{1,2}, %zmm[0-9]{1,2} {%k[1-7]} }}
+; CHECK: vpaddd {{%zmm[0-9], %zmm[0-9], %zmm[0-9] {%k[1-7]}}}
; CHECK: ret
define <16 x i32> @vpaddd_mask_test(<16 x i32> %i, <16 x i32> %j, <16 x i32> %mask1) nounwind readnone {
  %mask = icmp ne <16 x i32> %mask1, zeroinitializer
@@ -233,7 +233,7 @@ define <16 x i32> @vpaddd_mask_test(<16
}

; CHECK-LABEL: vpaddd_maskz_test
-; CHECK: vpaddd {{%zmm[0-9]{1,2}, %zmm[0-9]{1,2}, %zmm[0-9]{1,2} {%k[1-7]} {z} }}
+; CHECK: vpaddd {{%zmm[0-9], %zmm[0-9], %zmm[0-9] {%k[1-7]} {z}}}
; CHECK: ret
define <16 x i32> @vpaddd_maskz_test(<16 x i32> %i, <16 x i32> %j, <16 x i32> %mask1) nounwind readnone {
  %mask = icmp ne <16 x i32> %mask1, zeroinitializer
@@ -243,7 +243,7 @@ define <16 x i32> @vpaddd_maskz_test(<16
}

; CHECK-LABEL: vpaddd_mask_fold_test
-; CHECK: vpaddd (%rdi), {{%zmm[0-9]{1,2}, %zmm[0-9]{1,2} {%k[1-7]} }}
+; CHECK: vpaddd (%rdi), {{%zmm[0-9], %zmm[0-9] {%k[1-7]}}}
; CHECK: ret
define <16 x i32> @vpaddd_mask_fold_test(<16 x i32> %i, <16 x i32>* %j.ptr, <16 x i32> %mask1) nounwind readnone {
  %mask = icmp ne <16 x i32> %mask1, zeroinitializer
@@ -254,7 +254,7 @@ define <16 x i32> @vpaddd_mask_fold_test
}

; CHECK-LABEL: vpaddd_mask_broadcast_test
-; CHECK: vpaddd LCP{{.*}}(%rip){1to16}, {{%zmm[0-9]{1,2}, %zmm[0-9]{1,2} {%k[1-7]} }}
+; CHECK: vpaddd LCP{{.*}}(%rip){1to16}, {{%zmm[0-9], %zmm[0-9] {%k[1-7]}}}
; CHECK: ret
define <16 x i32> @vpaddd_mask_broadcast_test(<16 x i32> %i, <16 x i32> %mask1) nounwind readnone {
  %mask = icmp ne <16 x i32> %mask1, zeroinitializer
@@ -264,7 +264,7 @@ define <16 x i32> @vpaddd_mask_broadcast
}

; CHECK-LABEL: vpaddd_maskz_fold_test
-; CHECK: vpaddd (%rdi), {{%zmm[0-9]{1,2}, %zmm[0-9]{1,2} {%k[1-7]}}} {z}
+; CHECK: vpaddd (%rdi), {{%zmm[0-9], %zmm[0-9] {%k[1-7]}}} {z}
; CHECK: ret
define <16 x i32> @vpaddd_maskz_fold_test(<16 x i32> %i, <16 x i32>* %j.ptr, <16 x i32> %mask1) nounwind readnone {
  %mask = icmp ne <16 x i32> %mask1, zeroinitializer
@@ -275,7 +275,7 @@ define <16 x i32> @vpaddd_maskz_fold_tes
}

; CHECK-LABEL: vpaddd_maskz_broadcast_test
-; CHECK: vpaddd LCP{{.*}}(%rip){1to16}, {{%zmm[0-9]{1,2}, %zmm[0-9]{1,2} {%k[1-7]}}} {z}
+; CHECK: vpaddd LCP{{.*}}(%rip){1to16}, {{%zmm[0-9], %zmm[0-9] {%k[1-7]}}} {z}
; CHECK: ret
define <16 x i32> @vpaddd_maskz_broadcast_test(<16 x i32> %i, <16 x i32> %mask1) nounwind readnone {
  %mask = icmp ne <16 x i32> %mask1, zeroinitializer
@@ -309,7 +309,7 @@ define <16 x i32> @vpmulld_test(<16 x i3
}

; CHECK-LABEL: sqrtA
-; CHECK: vsqrtss {{.*}} encoding: [0x62
+; CHECK: vsqrtss {{.*}}
; CHECK: ret
declare float @sqrtf(float) readnone
define float @sqrtA(float %a) nounwind uwtable readnone ssp {
@@ -319,7 +319,7 @@ entry:
}

; CHECK-LABEL: sqrtB
-; CHECK: vsqrtsd {{.*}}## encoding: [0x62
+; CHECK: vsqrtsd {{.*}}
; CHECK: ret
declare double @sqrt(double) readnone
define double @sqrtB(double %a) nounwind uwtable readnone ssp {
@@ -329,7 +329,7 @@ entry:
}

; CHECK-LABEL: sqrtC
-; CHECK: vsqrtss {{.*}}## encoding: [0x62
+; CHECK: vsqrtss {{.*}}
; CHECK: ret
declare float @llvm.sqrt.f32(float)
define float @sqrtC(float %a) nounwind {

Modified: llvm/trunk/test/CodeGen/X86/avx512-shuffle.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-shuffle.ll?rev=218932&r1=218931&r2=218932&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-shuffle.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-shuffle.ll Thu Oct  2 19:36:29 2014
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl --show-mc-encoding| FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl | FileCheck %s
; CHECK: LCP
; CHECK: .long 2
; CHECK: .long 5
@@ -169,7 +169,7 @@ define <16 x i32> @test11(<16 x i32> %a,
}

; CHECK-LABEL: test12
-; CHECK: vmovlhps {{.*}}## encoding: [0x62
+; CHECK: vmovlhps {{.*}}
; CHECK: ret
define <4 x i32> @test12(<4 x i32> %a, <4 x i32> %b) nounwind {
  %c = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
@@ -226,7 +226,7 @@ define <8 x double> @test16(<8 x double>
}

; CHECK-LABEL: test16k
-; CHECK: valignq $2, %zmm0, %zmm1, %zmm2 {%k1} #
+; CHECK: valignq $2, %zmm0, %zmm1, %zmm2 {%k1}
define <8 x i64> @test16k(<8 x i64> %a, <8 x i64> %b, <8 x i64> %src, i8 %mask) nounwind {
  %c = shufflevector <8 x i64> %a, <8 x i64> %b, <8 x i32> <i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9>
  %m = bitcast i8 %mask to <8 x i1>
@@ -235,7 +235,7 @@ define <8 x i64> @test16k(<8 x i64> %a,
}

; CHECK-LABEL: test16kz
-; CHECK: valignq $2, %zmm0, %zmm1, %zmm0 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xc9,0x03,0xc0,0x02]
+; CHECK: valignq $2, %zmm0, %zmm1, %zmm0 {%k1} {z}
define <8 x i64> @test16kz(<8 x i64> %a, <8 x i64> %b, i8 %mask) nounwind {
  %c = shufflevector <8 x i64> %a, <8 x i64> %b, <8 x i32> <i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9>
  %m = bitcast i8 %mask to <8 x i1>
@@ -296,7 +296,7 @@ define <16 x float> @test21(<16 x float>
}

; CHECK-LABEL: test22
-; CHECK: vmovhlps {{.*}}## encoding: [0x62
+; CHECK: vmovhlps {{.*}}
; CHECK: ret
define <4 x i32> @test22(<4 x i32> %a, <4 x i32> %b) nounwind {
  %c = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 2, i32 3, i32 6, i32 7>

Modified: llvm/trunk/test/CodeGen/X86/avx512-vbroadcast.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-vbroadcast.ll?rev=218932&r1=218931&r2=218932&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-vbroadcast.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-vbroadcast.ll Thu Oct  2 19:36:29 2014
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl --show-mc-encoding| FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl | FileCheck %s

;CHECK-LABEL: _inreg16xi32:
;CHECK: vpbroadcastd {{.*}}, %zmm
@@ -45,7 +45,7 @@ define   <16 x i32> @_xmm16xi32(<16 x i3
}

;CHECK-LABEL: _xmm16xfloat
-;CHECK: vbroadcastss {{.*}}## encoding: [0x62
+;CHECK: vbroadcastss {{.*}}
;CHECK: ret
define   <16 x float> @_xmm16xfloat(<16 x float> %a) {
  %b = shufflevector <16 x float> %a, <16 x float> undef, <16 x i32> zeroinitializer


_______________________________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
