[LLVMdev] Adding masked vector load and store intrinsics

Owen Anderson resistor at mac.com
Mon Oct 27 09:58:34 PDT 2014


Since this is something that you expect to be supported on all targets, and which requires extensive type overloading, it seems like a perfect candidate for being an Instruction rather than an intrinsic.

—Owen

> On Oct 27, 2014, at 12:02 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> 
> We are just following the common recommendation to start with intrinsics:
> http://llvm.org/docs/ExtendingLLVM.html
>  
>  
> - Elena
>  
> From: Owen Anderson [mailto:resistor at mac.com]
> Sent: Sunday, October 26, 2014 23:57
> To: Demikhovsky, Elena
> Cc: llvmdev at cs.uiuc.edu; dag at cray.com
> Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics
>  
> What is the motivation for using intrinsics versus adding new instructions?
>  
> —Owen
>  
> On Oct 24, 2014, at 4:24 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
>  
> Hi,
>  
> We would like to add support for masked vector loads and stores by introducing new target-independent intrinsics. The loop vectorizer will then be enhanced to optimize loops containing conditional memory accesses by generating these intrinsics for existing targets such as AVX2 and AVX-512. The vectorizer will first ask the target whether masked vector loads and stores are available. The SLP vectorizer can potentially be enhanced to use these intrinsics as well.
>  
> The intrinsics would be legal for all targets; targets that do not support masked vector loads or stores will scalarize them.
> The addressed memory will not be touched for masked-off lanes. In particular, if all lanes are masked off, no address will be accessed.
>  
>   call void @llvm.masked.store (i32* %addr, <16 x i32> %data, i32 4, <16 x i1> %mask)
>  
>   %data = call <8 x i32> @llvm.masked.load (i32* %addr, <8 x i32> %passthru, i32 4, <8 x i1> %mask)
>  
> where %passthru is used to fill the elements of %data that are masked-off (if any; can be zeroinitializer or undef).
>  
> Comments so far, before we dive into more details?
>  
> Thank you.
>  
> - Elena and Ayal
>  
>  
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
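
The semantics described in the quoted proposal (masked-off lanes never touch memory, and %passthru fills the masked-off lanes of a load) can be illustrated with a scalar model, roughly what a target without native support would do when it scalarizes the operation. This is only a sketch under stated assumptions: masked_load and masked_store are hypothetical helper names rather than LLVM APIs, the element type is fixed to i32, and the alignment operand (the "i32 4" in the examples) is left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the proposed masked load (not an LLVM API).
// Lane i reads addr[i] only when mask[i] is set; otherwise it keeps passthru[i].
template <std::size_t N>
std::array<std::int32_t, N> masked_load(const std::int32_t* addr,
                                        const std::array<std::int32_t, N>& passthru,
                                        const std::array<bool, N>& mask) {
    std::array<std::int32_t, N> result = passthru;
    for (std::size_t i = 0; i < N; ++i) {
        if (mask[i]) {
            result[i] = addr[i];  // memory is read only for enabled lanes
        }
    }
    return result;
}

// Hypothetical scalar model of the proposed masked store (not an LLVM API).
// Lane i writes data[i] to addr[i] only when mask[i] is set.
template <std::size_t N>
void masked_store(std::int32_t* addr,
                  const std::array<std::int32_t, N>& data,
                  const std::array<bool, N>& mask) {
    for (std::size_t i = 0; i < N; ++i) {
        if (mask[i]) {
            addr[i] = data[i];  // memory is written only for enabled lanes
        }
    }
}
```

In this model an all-false mask never dereferences addr, which matches the statement that no address is accessed when all lanes are masked off. A vectorized loop of the form "if (cond[i]) a[i] = b[i];" would correspond to one masked_store per vector-width chunk, with cond supplying the mask.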
