[llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory

Tue Sep 29 17:16:56 PDT 2015

----- Original Message -----
> From: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-commits at lists.llvm.org, "Ayal Zaks" <ayal.zaks at intel.com>
> Sent: Wednesday, September 9, 2015 2:30:28 AM
> Subject: RE: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory
> 
> Hi Hal,
> 
> Yes, this is an incompatibility. I agree with you. I did not think about
> this.
> 
> The bit types are legal for the AVX-512 architecture and are used for masks.
> There are special instructions for storing and loading masks.
> When we need to spill/fill a k-register, we use KMOV instructions that
> store/load the bits without expanding them to bytes.
> A constant is loaded through a GPR:
> 
>        %c = or <8 x i1> %b, <i1 0, i1 0, i1 1, i1 1, i1 0, i1 0, i1 0, i1 1>
> 
>         movb    $-116, %al
>         kmovb   %eax, %k1
>         korb    %k1, %k0, %k0
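
For reference, the immediate in the movb is the eight mask bits packed into one byte and printed as a signed value. A minimal sketch in Python, assuming element 0 of the vector lands in bit 0 of the k-register; pack_mask and as_signed_i8 are illustrative helpers, not LLVM functions:

```python
def pack_mask(bits):
    """Pack i1 elements into an integer, element 0 in the least significant bit."""
    value = 0
    for index, bit in enumerate(bits):
        value |= (bit & 1) << index
    return value


def as_signed_i8(value):
    """Reinterpret an 8-bit pattern as the signed byte the assembler prints."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


# The <8 x i1> constant from the example above.
constant = [0, 0, 1, 1, 0, 0, 0, 1]
packed = pack_mask(constant)
print(hex(packed), as_signed_i8(packed))  # 0x8c -116
```

With that bit order the constant packs to 0x8c, which is -116 as a signed byte, matching the movb operand.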
> 
> I suppose that arrays/vectors of masks should also be represented in
> memory in compressed form for AVX-512.
> What do you think?

As I recall, the consensus, at least for vectors, was that they should be stored in compressed form. We'd need to fix the code that generates global initializers, and perhaps things like r247128 as well, along with code in some of the backends that matches the current behavior.
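
To make the mismatch concrete, here is a rough Python sketch of the two in-memory layouts for the <4 x i1> global from my earlier message. It assumes element 0 goes in bit 0 and that unused upper bits are written as zero (the store patterns in the commit do not state the latter); the function names are made up for illustration:

```python
def byte_expanded(bits):
    """One byte per i1 element: what the global initializer emits today."""
    return bytes(bit & 1 for bit in bits)


def compressed(bits):
    """One bit per i1 element, element 0 in bit 0, zero-padded to whole bytes."""
    out = bytearray((len(bits) + 7) // 8)
    for index, bit in enumerate(bits):
        out[index // 8] |= (bit & 1) << (index % 8)
    return bytes(out)


vect = [0, 1, 0, 1]  # <4 x i1> <i1 0, i1 1, i1 0, i1 1>
print(byte_expanded(vect).hex())  # 00010001
print(compressed(vect).hex())     # 0a

# A one-byte mask load from the byte-expanded global sees only element 0.
print(byte_expanded(vect)[0] == compressed(vect)[0])  # False
```

Under those assumptions, a single-byte mask load from the byte-expanded global would read 0x00 rather than 0x0a, which is the incompatibility in question.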

 -Hal

> 
> -  Elena
> 
> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: Wednesday, September 09, 2015 02:46
> To: Demikhovsky, Elena
> Cc: llvm-commits at lists.llvm.org
> Subject: Re: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1>
> values in memory
> 
> Hi Elena,
> 
> What is the in-memory format of these stores?
> 
> I ask because we still CodeGen i1 arrays and vectors byte-expanded,
> by which I mean that given this:
> 
> @arr = global [4 x i1] [i1 0, i1 1, i1 0, i1 1], align 1
> @vect = global <4 x i1> <i1 0, i1 1, i1 0, i1 1>, align 1
> 
> we'll generate this:
> 
> 	.text
> 	.file	"<stdin>"
> 	.type	arr,@object             # @arr
> 	.data
> 	.globl	arr
> arr:
> 	.byte	0                       # 0x0
> 	.byte	1                       # 0x1
> 	.byte	0                       # 0x0
> 	.byte	1                       # 0x1
> 	.size	arr, 4
> 
> 	.type	vect,@object            # @vect
> 	.globl	vect
> 	.align	4
> vect:
> 	.byte	0                       # 0x0
> 	.byte	1                       # 0x1
> 	.byte	0                       # 0x0
> 	.byte	1                       # 0x1
> 	.size	vect, 4
> 
> We've discussed fixing/changing this at various points, but as of
> yet, this is still what happens. If we're giving the x86 backend
> the ability to load/store i1 vectors, we should make the format
> compatible with what we use for global initializers of the same
> type.
> 
> Thanks again,
> Hal
> 
> ----- Original Message -----
> > From: "Elena Demikhovsky via llvm-commits"
> > <llvm-commits at lists.llvm.org>
> > To: llvm-commits at lists.llvm.org
> > Sent: Wednesday, September 2, 2015 4:20:58 AM
> > > Subject: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory
> > 
> > Author: delena
> > Date: Wed Sep  2 04:20:58 2015
> > New Revision: 246625
> > 
> > URL: http://llvm.org/viewvc/llvm-project?rev=246625&view=rev
> > Log:
> > > AVX-512: store <4 x i1> and <2 x i1> values in memory
> > > Enabled DAG pattern lowering for SKX with DQI predicate.
> > 
> > Differential Revision: http://reviews.llvm.org/D12550
> > 
> > 
> > Modified:
> >     llvm/trunk/lib/Target/X86/X86InstrAVX512.td
> >     llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
> > 
> > Modified: llvm/trunk/lib/Target/X86/X86InstrAVX512.td
> > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=246625&r1=246624&r2=246625&view=diff
> > > ==============================================================================
> > > --- llvm/trunk/lib/Target/X86/X86InstrAVX512.td (original)
> > > +++ llvm/trunk/lib/Target/X86/X86InstrAVX512.td Wed Sep  2 04:20:58 2015
> > @@ -1786,6 +1786,11 @@ let Predicates = [HasDQI] in {
> >              (KMOVBmk addr:$dst, VK8:$src)>;
> >    def : Pat<(v8i1 (bitconvert (i8 (load addr:$src)))),
> >              (KMOVBkm addr:$src)>;
> > +
> > > +  def : Pat<(store VK4:$src, addr:$dst),
> > > +            (KMOVBmk addr:$dst, (COPY_TO_REGCLASS VK4:$src, VK8))>;
> > > +  def : Pat<(store VK2:$src, addr:$dst),
> > > +            (KMOVBmk addr:$dst, (COPY_TO_REGCLASS VK2:$src, VK8))>;
> >  }
> >  let Predicates = [HasAVX512, NoDQI] in {
> >    def : Pat<(store (i8 (bitconvert (v8i1 VK8:$src))), addr:$dst),
> > 
> > Modified: llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
> > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll?rev=246625&r1=246624&r2=246625&view=diff
> > > ==============================================================================
> > > --- llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll (original)
> > > +++ llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll Wed Sep  2 04:20:58 2015
> > @@ -407,3 +407,17 @@ define <32 x i16> @test21(<32 x i16> %x
> > >    %ret = select <32 x i1> %mask, <32 x i16> %x, <32 x i16> zeroinitializer
> >    ret <32 x i16> %ret
> >  }
> > +
> > +; SKX-LABEL: test22
> > +; SKX: kmovb
> > +define void @test22(<4 x i1> %a, <4 x i1>* %addr) {
> > +  store <4 x i1> %a, <4 x i1>* %addr
> > +  ret void
> > +}
> > +
> > +; SKX-LABEL: test23
> > +; SKX: kmovb
> > +define void @test23(<2 x i1> %a, <2 x i1>* %addr) {
> > +  store <2 x i1> %a, <2 x i1>* %addr
> > +  ret void
> > +}
> > 
> > 
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory