[LLVMdev] Adding masked vector load and store intrinsics
elena.demikhovsky at intel.com
Sat Oct 25 04:22:15 PDT 2014
> So %passthrough can *only* be undef or zeroinitializer?
No, it can be any value including undef and zeroinitializer.
While designing this, we considered both zero and merge semantics and decided that merge semantics is better, because it also covers zero semantics: just pass zeroinitializer as %passthru.
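To make the merge semantics concrete, here is a minimal scalar sketch in C++ of what the intrinsic is meant to compute. It is only an illustrative model, not the actual implementation: the function name masked_load and the std::array representation are assumptions made for this example, and the alignment operand is left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference model of the proposed masked load
// (names and types are assumptions for illustration only).
// Lanes whose mask bit is false are never read from memory.
template <std::size_t N>
std::array<std::int32_t, N> masked_load(
    const std::int32_t *addr,
    const std::array<std::int32_t, N> &passthru,
    const std::array<bool, N> &mask) {
  // Merge semantics: start from %passthru ...
  std::array<std::int32_t, N> data = passthru;
  for (std::size_t i = 0; i < N; ++i) {
    if (mask[i]) {
      // ... and overwrite only the enabled lanes with loaded values.
      data[i] = addr[i];
    }
  }
  return data;
}
```

In this model, passing an all-zero array as passthru gives zero semantics, while any other array gives a general merge. An undef %passthru has no direct C++ equivalent; it would correspond to the caller not caring what the masked-off lanes contain.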
From: dag at cray.com [mailto:dag at cray.com]
Sent: Friday, October 24, 2014 20:21
To: Demikhovsky, Elena
Cc: llvmdev at cs.uiuc.edu; Zaks, Ayal; Nadav Rotem <nrotem at apple.com> (nrotem at apple.com); Chandler Carruth (chandlerc at google.com); Adam Nemet (anemet at apple.com)
Subject: Re: Adding masked vector load and store intrinsics
"Demikhovsky, Elena" <elena.demikhovsky at intel.com> writes:
> %data = call <8 x i32> @llvm.masked.load (i32* %addr, <8 x i32>
> %passthru, i32 4, <8 x i1> %mask) where %passthru is used to fill the
> elements of %data that are masked-off (if any; can be zeroinitializer
> or undef).
So %passthrough can *only* be undef or zeroinitializer? If that's the case it might make more sense to have two intrinsics, one that fills with undef and one that fills with zero. Using a general vector operand with a restriction on valid values seems odd and potentially misleading.
Another option is to always fill with undef and require a select on top of the load to fill with zero. The load + select would be easily matchable to a target instruction.
I'm trying to think beyond just AVX-512 to what other future architectures might want. It's not a given that future architectures will fill with zero *or* undef though those are the two most likely fill values.