[llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory

Thu Oct 1 05:31:17 PDT 2015

Hi Hal,

I held an internal discussion at Intel. We concluded that the compressed memory form for all i1 vectors and arrays should be implemented for the entire X86 target, not only for AVX-512.
Unlike AVX-512, other X86 subtargets will require a sequence of instructions to do this. Does clang generate i1 global variables?
We have not scheduled this work yet.

-  Elena

-----Original Message-----
From: Hal Finkel [mailto:hfinkel at anl.gov] 
Sent: Wednesday, September 30, 2015 03:17
To: Demikhovsky, Elena
Cc: llvm-commits at lists.llvm.org; Zaks, Ayal; Daniel Sanders
Subject: Re: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory

----- Original Message -----
> From: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-commits at lists.llvm.org, "Ayal Zaks" <ayal.zaks at intel.com>
> Sent: Wednesday, September 9, 2015 2:30:28 AM
> Subject: RE: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> 
> values in memory
> 
> Hi Hal,
> 
> Yes, this is an incompatibility. I agree with you. I had not thought
> about this.
> 
> The bit types are legal on the AVX-512 architecture and are used for masks.
> There are special instructions for storing and loading masks.
> When we need to spill/fill a k-register, we use KMOV instructions that
> store / load the bits without expanding them to bytes.
> A constant is loaded through a GPR:
> 
>        %c = or <8 x i1> %b, <i1 0, i1 0, i1 1, i1 1, i1 0, i1 0, i1 0, i1 1>
> 
>         movb    $-116, %al
>         kmovb   %eax, %k1
>         korb    %k1, %k0, %k0
> 
> I suppose that arrays/vectors of masks should also be represented in
> memory in the compressed form for AVX-512.
> What do you think?

As I recall, at least for vectors, this was the consensus (they should be stored in compressed form). We'd need to fix the code that generates global initializers, and perhaps things like r247128 as well (along with code in some of the backends that matches the current behavior).
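
To make the difference between the two layouts concrete, here is a minimal C++ sketch. It is not LLVM code: the helper names `byteExpanded` and `compressed` are hypothetical, and the LSB-first bit order (element 0 in bit 0) is an assumption, chosen because it matches the `movb $-116` constant in the example quoted above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte-expanded layout: one byte per i1 element. This mirrors the
// .byte 0 / .byte 1 global initializers shown later in this thread.
std::vector<std::uint8_t> byteExpanded(const std::vector<bool> &elts) {
  std::vector<std::uint8_t> out;
  out.reserve(elts.size());
  for (bool b : elts)
    out.push_back(b ? 1 : 0);
  return out;
}

// Compressed layout: element i is stored in bit (i % 8) of byte (i / 8).
// ASSUMPTION: LSB-first ordering; unused high bits of the last byte are
// left as zero.
std::vector<std::uint8_t> compressed(const std::vector<bool> &elts) {
  std::vector<std::uint8_t> out((elts.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < elts.size(); ++i)
    if (elts[i])
      out[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
  return out;
}
```

Under these assumptions, the <4 x i1> <i1 0, i1 1, i1 0, i1 1> initializer occupies four bytes (0, 1, 0, 1) in the byte-expanded form and a single byte 0x0A in the compressed form. The <8 x i1> constant from the quoted example packs to 0x8C, which is -116 as a signed byte.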

 -Hal

> 
> -  Elena
> 
> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: Wednesday, September 09, 2015 02:46
> To: Demikhovsky, Elena
> Cc: llvm-commits at lists.llvm.org
> Subject: Re: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> 
> values in memory
> 
> Hi Elena,
> 
> What is the in-memory format of these stores?
> 
> I ask because we still CodeGen i1 arrays and vectors byte expanded.
> By which I mean that given this:
> 
> @arr = global [4 x i1] [i1 0, i1 1, i1 0, i1 1], align 1
> @vect = global <4 x i1> <i1 0, i1 1, i1 0, i1 1>, align 1
> 
> we'll generate this:
> 
> 	.text
> 	.file	"<stdin>"
> 	.type	arr,@object             # @arr
> 	.data
> 	.globl	arr
> arr:
> 	.byte	0                       # 0x0
> 	.byte	1                       # 0x1
> 	.byte	0                       # 0x0
> 	.byte	1                       # 0x1
> 	.size	arr, 4
> 
> 	.type	vect,@object            # @vect
> 	.globl	vect
> 	.align	4
> vect:
> 	.byte	0                       # 0x0
> 	.byte	1                       # 0x1
> 	.byte	0                       # 0x0
> 	.byte	1                       # 0x1
> 	.size	vect, 4
> 
> We've discussed fixing/changing this at various points, but as of yet, 
> this is still what happens. If we'll be giving the x86 backend the 
> ability to load/store i1 vectors, we should make the format compatible 
> with what we use for global initializers of the same type.
> 
> Thanks again,
> Hal
> 
> ----- Original Message -----
> > From: "Elena Demikhovsky via llvm-commits"
> > <llvm-commits at lists.llvm.org>
> > To: llvm-commits at lists.llvm.org
> > Sent: Wednesday, September 2, 2015 4:20:58 AM
> > Subject: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> 
> > values in memory
> > 
> > Author: delena
> > Date: Wed Sep  2 04:20:58 2015
> > New Revision: 246625
> > 
> > URL: http://llvm.org/viewvc/llvm-project?rev=246625&view=rev
> > Log:
> > AVX-512: store <4 x i1> and <2 x i1> values in memory
> > Enabled DAG pattern lowering for SKX with DQI predicate.
> > 
> > Differential Revision: http://reviews.llvm.org/D12550
> > 
> > 
> > Modified:
> >     llvm/trunk/lib/Target/X86/X86InstrAVX512.td
> >     llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
> > 
> > Modified: llvm/trunk/lib/Target/X86/X86InstrAVX512.td
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=246625&r1=246624&r2=246625&view=diff
> > ==============================================================================
> > --- llvm/trunk/lib/Target/X86/X86InstrAVX512.td (original)
> > +++ llvm/trunk/lib/Target/X86/X86InstrAVX512.td Wed Sep  2 04:20:58 2015
> > @@ -1786,6 +1786,11 @@ let Predicates = [HasDQI] in {
> >              (KMOVBmk addr:$dst, VK8:$src)>;
> >    def : Pat<(v8i1 (bitconvert (i8 (load addr:$src)))),
> >              (KMOVBkm addr:$src)>;
> > +
> > +  def : Pat<(store VK4:$src, addr:$dst),
> > +            (KMOVBmk addr:$dst, (COPY_TO_REGCLASS VK4:$src, VK8))>;
> > +  def : Pat<(store VK2:$src, addr:$dst),
> > +            (KMOVBmk addr:$dst, (COPY_TO_REGCLASS VK2:$src, VK8))>;
> >  }
> >  let Predicates = [HasAVX512, NoDQI] in {
> >    def : Pat<(store (i8 (bitconvert (v8i1 VK8:$src))), addr:$dst),
> > 
> > Modified: llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll?rev=246625&r1=246624&r2=246625&view=diff
> > ==============================================================================
> > --- llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll (original)
> > +++ llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll Wed Sep  2 04:20:58 2015
> > @@ -407,3 +407,17 @@ define <32 x i16> @test21(<32 x i16> %x
> >    %ret = select <32 x i1> %mask, <32 x i16> %x, <32 x i16> zeroinitializer
> >    ret <32 x i16> %ret
> >  }
> > +
> > +; SKX-LABEL: test22
> > +; SKX: kmovb
> > +define void @test22(<4 x i1> %a, <4 x i1>* %addr) {
> > +  store <4 x i1> %a, <4 x i1>* %addr
> > +  ret void
> > +}
> > +
> > +; SKX-LABEL: test23
> > +; SKX: kmovb
> > +define void @test23(<2 x i1> %a, <2 x i1>* %addr) {
> > +  store <2 x i1> %a, <2 x i1>* %addr
> > +  ret void
> > +}
> > 
> > 
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
> 
> This e-mail and any attachments may contain confidential material for 
> the sole use of the intended recipient(s). Any review or distribution 
> by others is strictly prohibited. If you are not the intended 
> recipient, please contact the sender and delete all copies.
> 

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory