[LLVMdev] Adding masked vector load and store intrinsics

Fri Oct 24 07:28:18 PDT 2014

----- Original Message -----
> From: "Hal Finkel" <hfinkel at anl.gov>
> To: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> Cc: dag at cray.com, llvmdev at cs.uiuc.edu
> Sent: Friday, October 24, 2014 8:39:56 AM
> Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics
> 
> ----- Original Message -----
> > From: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: dag at cray.com, llvmdev at cs.uiuc.edu, "Ayal Zaks"
> > <ayal.zaks at intel.com>
> > Sent: Friday, October 24, 2014 8:07:18 AM
> > Subject: RE: [LLVMdev] Adding masked vector load and store
> > intrinsics
> > 
> > > For the loads, I'm must less sure. Why can't we represent the
> > > loads
> > > as select(mask, load(addr), passthru)? It is true, that the load
> > > might get separated from the select so that isel might not see it
> > > (because isel if basic-block local), but we can add some code in
> > > CodeGenPrep to fix that for targets on which it is useful to do
> > > so
> > > (which is a more-general solution than the intrinsic anyhow).
> > > What
> > > do you think?
> > 
> > We generate the vector-masked-intrinsic on IR-to-IR pass. It is too
> > far from instruction selection. We'll need to guarantee that all
> > subsequent IR-to-IR passes will not break the sequence.
> 
> I'm fully aware of this issue. This needs to be weighed against the
> cost of updating all other optimizations that operate on loads to
> also understand this intrinsic.
> 
> > And only for
> > one or two specific targets.
> 
> Regardless, they're certainly targets many users care about ;)
> 
> > Then we'll keep the logic in type
> > legalizer, which may split or extend operations. Then we are taking
> > care in DAG-combine.
> > In my opinion, this is just unsafe.
> 
> If this were really a question of safety, I'd agree. And if we were
> talking about gather loads, I'd agree. For a regular vector loads, I
> don't see this as a safety issue. We should outline what the
> downside of emitting a regular load would actually be should some
> optimization be done to the select. Can you please elaborate on
> this?

Nevermind ;) -- I changed my mind, the safety issue is with non-aligned loads that might cross page boundaries. Is that right? If so, I think this proposal is good (although obviously the docs need to make clear what the faulting behavior of these intrinsics is).

Thanks again,
Hal

> 
> Thanks again,
> Hal
> 
> > 
> > -  Elena
> > 
> > 
> > -----Original Message-----
> > From: Hal Finkel [mailto:hfinkel at anl.gov]
> > Sent: Friday, October 24, 2014 15:50
> > To: Demikhovsky, Elena
> > Cc: dag at cray.com; llvmdev at cs.uiuc.edu
> > Subject: Re: [LLVMdev] Adding masked vector load and store
> > intrinsics
> > 
> > ----- Original Message -----
> > > From: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> > > To: llvmdev at cs.uiuc.edu
> > > Cc: dag at cray.com
> > > Sent: Friday, October 24, 2014 6:24:15 AM
> > > Subject: [LLVMdev] Adding masked vector load and store intrinsics
> > > 
> > > 
> > > 
> > > Hi,
> > > 
> > > We would like to add support for masked vector loads and stores
> > > by
> > > introducing new target-independent intrinsics. The loop
> > > vectorizer
> > > will then be enhanced to optimize loops containing conditional
> > > memory
> > > accesses by generating these intrinsics for existing targets such
> > > as
> > > AVX2 and AVX-512. The vectorizer will first ask the target about
> > > availability of masked vector loads and stores. The SLP
> > > vectorizer
> > > can
> > > potentially be enhanced to use these intrinsics as well.
> > > 
> > > The intrinsics would be legal for all targets; targets that do
> > > not
> > > support masked vector loads or stores will scalarize them.
> > > The addressed memory will not be touched for masked-off lanes. In
> > > particular, if all lanes are masked off no address will be
> > > accessed.
> > > 
> > > call void @llvm.masked.store (i32* %addr, <16 x i32> %data, i32
> > > 4,
> > > <16 x i1> %mask)
> > > 
> > > %data = call <8 x i32> @llvm.masked.load (i32* %addr, <8 x i32>
> > > %passthru, i32 4, <8 x i1> %mask)
> > > 
> > > where %passthru is used to fill the elements of %data that are
> > > masked-off (if any; can be zeroinitializer or undef).
> > > 
> > > Comments so far, before we dive into more details?
> > 
> > For the stores, I think this is a reasonable idea. The alternative
> > is
> > to represent them in scalar form with a lot of control flow, and I
> > think that expecting the backend to properly pattern match that
> > after isel is not realistic.
> > 
> > For the loads, I'm must less sure. Why can't we represent the loads
> > as select(mask, load(addr), passthru)? It is true, that the load
> > might get separated from the select so that isel might not see it
> > (because isel if basic-block local), but we can add some code in
> > CodeGenPrep to fix that for targets on which it is useful to do so
> > (which is a more-general solution than the intrinsic anyhow). What
> > do you think?
> > 
> > Thanks again,
> > Hal
> > 
> > > 
> > > Thank you.
> > > 
> > > - Elena and Ayal
> > > 
> > > 
> > > 
> > > ---------------------------------------------------------------------
> > > Intel Israel (74) Limited
> > > 
> > > This e-mail and any attachments may contain confidential material
> > > for
> > > the sole use of the intended recipient(s). Any review or
> > > distribution
> > > by others is strictly prohibited. If you are not the intended
> > > recipient, please contact the sender and delete all copies.
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > > 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > ---------------------------------------------------------------------
> > Intel Israel (74) Limited
> > 
> > This e-mail and any attachments may contain confidential material
> > for
> > the sole use of the intended recipient(s). Any review or
> > distribution
> > by others is strictly prohibited. If you are not the intended
> > recipient, please contact the sender and delete all copies.
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory