[llvm-dev] RFC: New intrinsics masked.expandload and masked.compressstore

Demikhovsky, Elena via llvm-dev llvm-dev at lists.llvm.org
Sun Sep 18 23:37:02 PDT 2016


       Hi all,

       The AVX-512 ISA introduces two new vector instructions, VCOMPRESS and VEXPAND, which allow vectorization of loops with the following two specific types of cross-iteration dependencies:

       Compress:
       for (int i=0; i<N; ++i)
         if (t[i])
           *A++ = expr;

       Expand:
       for (int i=0; i<N; ++i)
         if (t[i])
            X[i] = *A++;
         else
            X[i] = PassThruV[i];
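
       For readers who want to experiment with the two patterns, here is a minimal runnable sketch of the loops above in plain C. The function names compress_loop and expand_loop, the float element type, and the use of src[i] in place of "expr" are illustrative assumptions, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Compress: elements of src whose predicate t[i] is nonzero are written to
   consecutive slots of dst. Returns the number of elements written.
   (Hypothetical helper; src[i] stands in for "expr" in the loop above.) */
size_t compress_loop(const float *src, const int *t, size_t n, float *dst)
{
    assert(src && t && dst);
    float *out = dst;
    for (size_t i = 0; i < n; ++i)
        if (t[i])
            *out++ = src[i];
    return (size_t)(out - dst);
}

/* Expand: for each i with t[i] nonzero, x[i] receives the next consecutive
   element of a; all other x[i] are copied from passthru[i].
   Returns the number of elements consumed from a. */
size_t expand_loop(const float *a, const int *t, const float *passthru,
                   size_t n, float *x)
{
    assert(a && t && passthru && x);
    const float *in = a;
    for (size_t i = 0; i < n; ++i)
        x[i] = t[i] ? *in++ : passthru[i];
    return (size_t)(in - a);
}
```

       In both loops the pointer A advances only on iterations where the predicate holds, so the memory offset used by iteration i depends on how many earlier predicates were true. That is the cross-iteration dependence in question.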

       The "compress" and "expand" patterns are depicted on this poster: http://llvm.org/devmtg/2013-11/slides/Demikhovsky-Poster.pdf

       This RFC proposes to support this functionality by introducing two new intrinsics in LLVM IR:
       llvm.masked.expandload.*
       llvm.masked.compressstore.*

       The syntax of these two intrinsics is similar to that of llvm.masked.load.* and llvm.masked.store.*, respectively, but the semantics differ and match the patterns above.

       %res = call <16 x float> @llvm.masked.expandload.v16f32.p0f32 (float* %ptr, <16 x i1> %mask, <16 x float> %passthru)
       call void @llvm.masked.compressstore.v16f32.p0f32 (<16 x float> %value, float* %ptr, <16 x i1> %mask)

       The arguments %mask, %value and %passthru all have the same vector length.
       The type that %ptr points to matches the scalar element type of the vector value.
       (This is only a brief outline; the full syntax description will be provided in the documentation that follows.)

       The intrinsics are planned to be target-independent, like masked.load/store/gather/scatter. They will be lowered efficiently on AVX-512 and scalarized on other targets, as is already done for the other masked.* intrinsics.
       The loop vectorizer will query TTI about whether the target supports these intrinsics efficiently and, if it does, will be able to handle loops with such cross-iteration dependences.
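
       To make the intended lane-wise behaviour concrete, below is a rough C model of what a scalarized lowering of the two <16 x float> intrinsics might compute. It is a sketch derived from the loop patterns above, not the actual lowering code. The function names and the encoding of the <16 x i1> mask as a uint16_t (bit i = lane i) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 16 };

/* Model of expandload on 16 floats: active lanes (mask bit set) receive
   consecutive elements starting at ptr; inactive lanes take passthru.
   Assumption: bit i of mask corresponds to vector lane i. */
void expandload_v16f32(const float *ptr, uint16_t mask,
                       const float passthru[LANES], float res[LANES])
{
    assert(ptr && passthru && res);
    for (int lane = 0; lane < LANES; ++lane) {
        if (mask & (1u << lane))
            res[lane] = *ptr++;          /* next contiguous memory element */
        else
            res[lane] = passthru[lane];  /* lane untouched by the load */
    }
}

/* Model of compressstore on 16 floats: the values of the active lanes are
   written to consecutive memory locations starting at ptr, in lane order. */
void compressstore_v16f32(const float value[LANES], float *ptr, uint16_t mask)
{
    assert(value && ptr);
    for (int lane = 0; lane < LANES; ++lane)
        if (mask & (1u << lane))
            *ptr++ = value[lane];
}
```

       In this model the number of memory elements read or written equals the number of set mask bits, and memory beyond that range is not accessed.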

       The first step will include the full documentation and the implementation of the CodeGen part.

       Additional information about expand load ( https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=expandload&techs=AVX_512 ) and compress store ( https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compressstore&techs=AVX_512 ) can be found in the Intel Intrinsics Guide.

-       Elena


