[LLVMdev] Adding masked vector load and store intrinsics

Fri Oct 24 10:20:36 PDT 2014

"Demikhovsky, Elena" <elena.demikhovsky at intel.com> writes:

> %data = call <8 x i32> @llvm.masked.load (i32* %addr, <8 x i32>
> %passthru, i32 4, <8 x i1> %mask)
> where %passthru is used to fill the elements of %data that are
> masked-off (if any; can be zeroinitializer or undef).

So %passthru can *only* be undef or zeroinitializer?  If that's the
case it might make more sense to have two intrinsics, one that fills
with undef and one that fills with zero.  Using a general vector operand
with a restriction on valid values seems odd and potentially misleading.
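
To make the two-intrinsic idea concrete, the declarations might look
something like the following.  The .undef and .zero suffixes are
hypothetical names made up here for illustration; the remaining operands
simply follow the quoted proposal with %passthru dropped.

```llvm
; Hypothetical names, not part of the proposal.
; Masked-off elements of the result are undef.
declare <8 x i32> @llvm.masked.load.undef(i32*, i32, <8 x i1>)
; Masked-off elements of the result are zero.
declare <8 x i32> @llvm.masked.load.zero(i32*, i32, <8 x i1>)
```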

Another option is to always fill with undef and require a select on top
of the load to fill with zero.  The load + select would be easily
matchable to a target instruction.
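
As a rough sketch of that load + select pattern, written in the notation
of the proposal quoted above (the intrinsic name and operand order are
taken from that proposal and are assumptions, not a settled interface;
the function name is made up for the example):

```llvm
; Assumed signature, copied from the quoted proposal.
declare <8 x i32> @llvm.masked.load(i32*, <8 x i32>, i32, <8 x i1>)

define <8 x i32> @load_zero_fill(i32* %addr, <8 x i1> %mask) {
  ; The load itself leaves masked-off elements undef.
  %loaded = call <8 x i32> @llvm.masked.load(i32* %addr, <8 x i32> undef,
                                             i32 4, <8 x i1> %mask)
  ; The select supplies the fill value (zero here) for masked-off elements.
  %data = select <8 x i1> %mask, <8 x i32> %loaded,
                 <8 x i32> zeroinitializer
  ret <8 x i32> %data
}
```

A target with a zeroing masked load could match the call and the select
on the same %mask as a single instruction, and a different fill value
would only change the last operand of the select.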

I'm trying to think beyond just AVX-512 to what other future
architectures might want.  It's not a given that future architectures
will fill with zero *or* undef, though those are the two most likely
fill values.

                             -David