[llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory

Fri Oct 2 17:00:53 PDT 2015

----- Original Message -----
> From: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-commits at lists.llvm.org, "Ayal Zaks" <ayal.zaks at intel.com>, "Daniel Sanders" <daniel.sanders at imgtec.com>
> Sent: Thursday, October 1, 2015 7:31:17 AM
> Subject: RE: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory
> 
> Hi Hal,
> 
> I held an internal discussion at Intel. We concluded that the
> compressed memory form for all i1 vectors and arrays should be
> implemented for the entire X86 target, not only AVX-512.
> Unlike AVX-512, other X86 subtargets will require a sequence of
> instructions.
> Does clang generate i1 global variables?

I believe that it will not do this for arrays or scalars, but will for Boolean vectors:

$ cat /tmp/b.cxx 
typedef bool bool4 __attribute__((ext_vector_type(4)));
bool4 y = { true, true, false, false };

$ clang++ -O3 -o - -S -emit-llvm /tmp/b.cxx 
target datalayout = "E-m:e-i64:64-n32:64"
target triple = "powerpc64-unknown-linux-gnu"

@y = global <4 x i1> <i1 true, i1 true, i1 false, i1 false>, align 4
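
To make the two candidate in-memory forms concrete, here is a minimal Python sketch (illustrative only, not LLVM code) that lays out the four lanes of y both ways. The helper names are hypothetical, and the compressed form assumes that lane 0 maps to the least-significant bit and that bytes are written little-endian; neither choice has been settled in this thread.

```python
def byte_expanded(lanes):
    """One byte per i1 lane, as global initializers are emitted today."""
    return bytes(1 if lane else 0 for lane in lanes)


def compressed(lanes):
    """One bit per i1 lane, lane 0 in the least-significant bit (assumed)."""
    value = 0
    for index, lane in enumerate(lanes):
        if lane:
            value |= 1 << index
    # Round the bit count up to whole bytes; little-endian byte order is
    # an assumption made only for this illustration.
    return value.to_bytes((len(lanes) + 7) // 8, "little")


y = [True, True, False, False]
print(byte_expanded(y).hex())  # 01010000 (four bytes)
print(compressed(y).hex())     # 03 (one byte)
```

Under these assumptions the same <4 x i1> value occupies four bytes in one form and a single byte in the other, so a producer and a consumer must agree on which form is in use.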

Thanks again,
Hal

> We have not scheduled this work yet.
> 
> -  Elena
> 
> 
> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: Wednesday, September 30, 2015 03:17
> To: Demikhovsky, Elena
> Cc: llvm-commits at lists.llvm.org; Zaks, Ayal; Daniel Sanders
> Subject: Re: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory
> 
> ----- Original Message -----
> > From: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: llvm-commits at lists.llvm.org, "Ayal Zaks" <ayal.zaks at intel.com>
> > Sent: Wednesday, September 9, 2015 2:30:28 AM
> > Subject: RE: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory
> > 
> > Hi Hal,
> > 
> > Yes, this is an incompatibility. I agree with you. I had not
> > thought about this.
> > 
> > The bit types are legal on the AVX-512 architecture and are used
> > for masks. There are special instructions for storing and loading
> > masks. When we need to spill/fill a k-register, we use KMOV
> > instructions, which store/load the bits without expanding them to
> > bytes.
> > A constant is loaded through a GPR:
> > 
> >        %c = or <8 x i1> %b, <i1 0, i1 0, i1 1, i1 1, i1 0, i1 0, i1 0, i1 1>
> > 
> >         movb    $-116, %al
> >         kmovb   %eax, %k1
> >         korb    %k1, %k0, %k0
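
The immediate in the movb above can be checked by hand. The following Python sketch (an illustration, not compiler code; the function name is hypothetical) packs the eight lanes of the constant into one byte, assuming lane i of the vector lands in bit i of the mask, and then reinterprets the byte as a signed 8-bit value the way the assembly listing prints it.

```python
def mask_immediate(lanes):
    """Pack eight i1 lanes into a signed 8-bit immediate, lane i -> bit i."""
    if len(lanes) != 8:
        raise ValueError("expected exactly 8 lanes")
    unsigned = sum(1 << i for i, lane in enumerate(lanes) if lane)
    # Reinterpret the byte as a two's-complement i8.
    return unsigned - 256 if unsigned >= 128 else unsigned


constant = [0, 0, 1, 1, 0, 0, 0, 1]
print(mask_immediate(constant))  # -116
```

Lanes 2, 3 and 7 are set, giving 4 + 8 + 128 = 140 = 0x8C, which is -116 as a signed byte. That agrees with the $-116 in the listing and supports the lane-to-bit assumption.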
> > 
> > I suppose that arrays/vectors of masks should also be represented
> > in memory in the compressed form for AVX-512.
> > What do you think?
> 
> As I recall, at least for vectors, this was the consensus (they
> should be stored in compressed form). We'd need to fix the code that
> generates global initializers, and also perhaps things like r247128
> as well (and code in some of the backends that matches the current
> behavior).
> 
>  -Hal
> 
> > 
> > -  Elena
> > 
> > -----Original Message-----
> > From: Hal Finkel [mailto:hfinkel at anl.gov]
> > Sent: Wednesday, September 09, 2015 02:46
> > To: Demikhovsky, Elena
> > Cc: llvm-commits at lists.llvm.org
> > Subject: Re: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory
> > 
> > Hi Elena,
> > 
> > What is the in-memory format of these stores?
> > 
> > I ask because we still CodeGen i1 arrays and vectors byte expanded.
> > By which I mean that given this:
> > 
> > @arr = global [4 x i1] [i1 0, i1 1, i1 0, i1 1], align 1
> > @vect = global <4 x i1> <i1 0, i1 1, i1 0, i1 1>, align 1
> > 
> > we'll generate this:
> > 
> > 	.text
> > 	.file	"<stdin>"
> > 	.type	arr,@object             # @arr
> > 	.data
> > 	.globl	arr
> > arr:
> > 	.byte	0                       # 0x0
> > 	.byte	1                       # 0x1
> > 	.byte	0                       # 0x0
> > 	.byte	1                       # 0x1
> > 	.size	arr, 4
> > 
> > 	.type	vect,@object            # @vect
> > 	.globl	vect
> > 	.align	4
> > vect:
> > 	.byte	0                       # 0x0
> > 	.byte	1                       # 0x1
> > 	.byte	0                       # 0x0
> > 	.byte	1                       # 0x1
> > 	.size	vect, 4
> > 
> > We've discussed fixing/changing this at various points, but as of
> > yet, this is still what happens. If we'll be giving the x86 backend
> > the ability to load/store i1 vectors, we should make the format
> > compatible with what we use for global initializers of the same
> > type.
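
A small Python sketch may help show why the two formats have to agree. It emits a <4 x i1> value in the byte-expanded form shown in the listing and then decodes the same memory as if a compressed mask load had read it. Both helpers are hypothetical, and the decoder assumes lane i is bit i of the first byte, which is an illustration of the concern rather than a description of any existing lowering.

```python
def emit_byte_expanded(lanes):
    """One .byte per lane, mirroring the global initializer output."""
    return bytes(1 if lane else 0 for lane in lanes)


def load_compressed(memory, lane_count):
    """Decode lanes from the low bits of memory, lane i -> bit i (assumed)."""
    width = (lane_count + 7) // 8
    value = int.from_bytes(memory[:width], "little")
    return [bool((value >> i) & 1) for i in range(lane_count)]


vect = [False, True, False, True]
image = emit_byte_expanded(vect)       # bytes 00 01 00 01
reloaded = load_compressed(image, len(vect))
print(reloaded == vect)                # False
```

The compressed reader looks only at the first byte, which is 0, so every lane comes back false and the initializer's value is lost.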
> > 
> > Thanks again,
> > Hal
> > 
> > ----- Original Message -----
> > > From: "Elena Demikhovsky via llvm-commits"
> > > <llvm-commits at lists.llvm.org>
> > > To: llvm-commits at lists.llvm.org
> > > Sent: Wednesday, September 2, 2015 4:20:58 AM
> > > > Subject: [llvm] r246625 - AVX-512: store <4 x i1> and <2 x i1> values in memory
> > > 
> > > Author: delena
> > > Date: Wed Sep  2 04:20:58 2015
> > > New Revision: 246625
> > > 
> > > URL: http://llvm.org/viewvc/llvm-project?rev=246625&view=rev
> > > Log:
> > > > AVX-512: store <4 x i1> and <2 x i1> values in memory
> > > > Enabled DAG pattern lowering for SKX with DQI predicate.
> > > 
> > > Differential Revision: http://reviews.llvm.org/D12550
> > > 
> > > 
> > > Modified:
> > >     llvm/trunk/lib/Target/X86/X86InstrAVX512.td
> > >     llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
> > > 
> > > Modified: llvm/trunk/lib/Target/X86/X86InstrAVX512.td
> > > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=246625&r1=246624&r2=246625&view=diff
> > > > ==============================================================================
> > > > --- llvm/trunk/lib/Target/X86/X86InstrAVX512.td (original)
> > > > +++ llvm/trunk/lib/Target/X86/X86InstrAVX512.td Wed Sep  2 04:20:58 2015
> > > @@ -1786,6 +1786,11 @@ let Predicates = [HasDQI] in {
> > >              (KMOVBmk addr:$dst, VK8:$src)>;
> > >    def : Pat<(v8i1 (bitconvert (i8 (load addr:$src)))),
> > >              (KMOVBkm addr:$src)>;
> > > +
> > > > +  def : Pat<(store VK4:$src, addr:$dst),
> > > > +            (KMOVBmk addr:$dst, (COPY_TO_REGCLASS VK4:$src, VK8))>;
> > > > +  def : Pat<(store VK2:$src, addr:$dst),
> > > > +            (KMOVBmk addr:$dst, (COPY_TO_REGCLASS VK2:$src, VK8))>;
> > >  }
> > >  let Predicates = [HasAVX512, NoDQI] in {
> > > >    def : Pat<(store (i8 (bitconvert (v8i1 VK8:$src))), addr:$dst),
> > > 
> > > Modified: llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll
> > > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll?rev=246625&r1=246624&r2=246625&view=diff
> > > > ==============================================================================
> > > > --- llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll (original)
> > > > +++ llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll Wed Sep  2 04:20:58 2015
> > > @@ -407,3 +407,17 @@ define <32 x i16> @test21(<32 x i16> %x
> > > >    %ret = select <32 x i1> %mask, <32 x i16> %x, <32 x i16> zeroinitializer
> > >    ret <32 x i16> %ret
> > >  }
> > > +
> > > +; SKX-LABEL: test22
> > > +; SKX: kmovb
> > > +define void @test22(<4 x i1> %a, <4 x i1>* %addr) {
> > > +  store <4 x i1> %a, <4 x i1>* %addr
> > > +  ret void
> > > +}
> > > +
> > > +; SKX-LABEL: test23
> > > +; SKX: kmovb
> > > +define void @test23(<2 x i1> %a, <2 x i1>* %addr) {
> > > +  store <2 x i1> %a, <2 x i1>* %addr
> > > +  ret void
> > > +}
> > > 
> > > 
> > > _______________________________________________
> > > llvm-commits mailing list
> > > llvm-commits at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> > > 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory