[LLVMdev] Adding masked vector load and store intrinsics
dag at cray.com
Fri Oct 24 10:20:36 PDT 2014
"Demikhovsky, Elena" <elena.demikhovsky at intel.com> writes:
> %data = call <8 x i32> @llvm.masked.load (i32* %addr, <8 x i32>
> %passthru, i32 4, <8 x i1> %mask)
> where %passthru is used to fill the elements of %data that are
> masked-off (if any; can be zeroinitializer or undef).
So %passthru can *only* be undef or zeroinitializer? If that's the
case, it might make more sense to have two intrinsics, one that fills
with undef and one that fills with zero. Using a general vector operand
with a restriction on valid values seems odd and potentially misleading.

Another option is to always fill with undef and require a select on top
of the load to fill with zero. The load + select would be easily
matchable to a target instruction.
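
As a rough illustration of the two formulations, here is a scalar C model of an 8-lane masked load. It is only a sketch of the semantics under discussion, not LLVM code: all function names and UNDEF_STAND_IN are hypothetical, and real undef has no fixed value, so the constant is merely an arbitrary placeholder that callers must not rely on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LANES 8
/* Arbitrary placeholder for "undef"; an assumption of this model only. */
#define UNDEF_STAND_IN INT32_C(0x5A5A5A5A)

/* Form quoted above: masked-off lanes take the matching %passthru lane.
 * Memory is not read for masked-off lanes. */
void masked_load_passthru(const int32_t *addr, const int32_t passthru[LANES],
                          const bool mask[LANES], int32_t out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = mask[i] ? addr[i] : passthru[i];
}

/* Alternative: the load always leaves masked-off lanes undefined. */
void masked_load_undef(const int32_t *addr, const bool mask[LANES],
                       int32_t out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = mask[i] ? addr[i] : UNDEF_STAND_IN;
}

/* Lane-wise select: out[i] = mask[i] ? a[i] : b[i]. */
void select_lanes(const bool mask[LANES], const int32_t a[LANES],
                  const int32_t b[LANES], int32_t out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = mask[i] ? a[i] : b[i];
}

/* Undef-filling load followed by a select that supplies the fill value. */
void load_then_select(const int32_t *addr, const int32_t fill[LANES],
                      const bool mask[LANES], int32_t out[LANES]) {
    int32_t loaded[LANES];
    masked_load_undef(addr, mask, loaded);
    select_lanes(mask, loaded, fill, out);
}

/* Sanity check: both formulations agree on every lane for a given fill. */
void check_equivalent(const int32_t *addr, const int32_t fill[LANES],
                      const bool mask[LANES]) {
    int32_t a[LANES], b[LANES];
    masked_load_passthru(addr, fill, mask, a);
    load_then_select(addr, fill, mask, b);
    for (int i = 0; i < LANES; ++i)
        assert(a[i] == b[i]);
}
```

In this model the select never observes the placeholder, since it takes the loaded lane only where the mask is set. That is why the load + select pair can stand in for a load with any fill value, zero or otherwise.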

I'm trying to think beyond just AVX-512 to what other future
architectures might want. It's not a given that future architectures
will fill with zero *or* undef, though those are the two most likely
fill values.

-David