[LLVMdev] Indexed Load and Store Intrinsics - proposal
Zaks, Ayal
ayal.zaks at intel.com
Mon Dec 22 06:05:43 PST 2014
> Why shouldn't the IR representation simply be a load from a vector of arbitrary pointers?
Such a load (or the corresponding store) could indeed serve as a general form of a gather (or scatter). As Elena responded, we can propose two distinct intrinsics: one with a vector of pointers, and another with a (non-zero) base, a vector of indices, and a scale implicitly inferred from the element type.
The motivation for the latter stems from vectorizing a load from or store to "b[i]", where b is invariant. Broadcasting b and using a vector gep to feed a vector of pointers, to be pattern matched and folded later, may work. The alternative intrinsic proposed keeps b scalar and uses a vector of indices for i. In any case, it's important to recognize such common patterns, at least for x86, so this could deserve an x86 intrinsic. But it's a general pattern that could potentially serve other implementations; any other gathers to consider atm?
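To make the two forms concrete, here is a scalar C++ reference model of a masked gather for "b[i]". It is only a sketch: the names gather_ptrs, gather_indexed and broadcast_gep, the fixed width N = 4, and the passthru operand for masked-off lanes are all illustrative assumptions, not the proposed intrinsic names or signatures.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 4;  // illustrative vector width (assumption)

// Form 1: gather through a vector of arbitrary pointers.
// Masked-off lanes keep the passthru value (an assumption of this sketch).
template <typename T>
std::array<T, N> gather_ptrs(const std::array<const T*, N>& ptrs,
                             const std::array<bool, N>& mask,
                             const std::array<T, N>& passthru) {
    std::array<T, N> out = passthru;
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i]) out[i] = *ptrs[i];
    return out;
}

// Form 2: scalar base plus a vector of indices.
// The scale is implied by the element type: base[idx] advances by sizeof(T).
template <typename T>
std::array<T, N> gather_indexed(const T* base,
                                const std::array<std::int64_t, N>& idx,
                                const std::array<bool, N>& mask,
                                const std::array<T, N>& passthru) {
    std::array<T, N> out = passthru;
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i]) out[i] = base[idx[i]];
    return out;
}

// Models "broadcast b, then vector gep": one pointer per lane.
// All lanes are computed, so every index must stay within the array.
template <typename T>
std::array<const T*, N> broadcast_gep(const T* base,
                                      const std::array<std::int64_t, N>& idx) {
    std::array<const T*, N> ptrs{};
    for (std::size_t i = 0; i < N; ++i)
        ptrs[i] = base + idx[i];
    return ptrs;
}
```

For in-bounds indices, gather_indexed(b, idx, mask, pt) and gather_ptrs(broadcast_gep(b, idx), mask, pt) return the same vector. The difference between the two forms is only whether b stays scalar or has to be recovered later by pattern matching.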
Documentation indeed needs to be provided.
Ayal.
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Philip Reames
Sent: Sunday, December 21, 2014 20:25
To: dag at cray.com; Demikhovsky, Elena
Cc: Khasanov, Robert; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Indexed Load and Store Intrinsics - proposal
On 12/18/2014 11:56 AM, dag at cray.com wrote:
> "Demikhovsky, Elena" <elena.demikhovsky at intel.com> writes:
>
>> Semantics:
>> For i=0,1,…,N-1: if (Mask[i]) {*(BaseAddr + VectorOfIndices[i]*Scale)
>> = VectorValue[i];}
>> VectorValue: any float or integer vector type.
>> BaseAddr: a pointer; may be zero if full address is placed in the
>> index.
>> VectorOfIndices: a vector of i32 or i64 signed or unsigned integer
>> values.
> What about the case of a gather/scatter where the BaseAddr is zero and
> the indices are pointers? Must we do a ptrtoint? llvm.org is down at
> the moment but I don't think we currently have a vector ptrtoint.
I would be opposed to any representation which required the introduction of ptrtoint casts by the vectorizer. If it were the only option available, I could be argued around, but I think we should try to avoid this.
More generally, I'm somewhat hesitant about representing a scatter with explicit base and offsets at all. Why shouldn't the IR representation simply be a load from a vector of arbitrary pointers? The backend can pattern match the actual gather instructions it supports and scalarize the rest. The proposal being made seems very specific to the current generation of x86 hardware.
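As a rough illustration of the "scalarize the rest" fallback, a masked scatter through a vector of arbitrary pointers could be expanded lane by lane as in the C++ sketch below. The name scatter_scalarized, the width kLanes = 4, and the lowest-to-highest lane order (so that a later lane wins when two lanes alias) are assumptions of this sketch, not part of the proposal. No ptrtoint is needed, because each lane already holds a pointer.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // illustrative vector width (assumption)

// Scalar expansion of a masked scatter through a vector of pointers:
// for each enabled lane i, store value[i] to *ptrs[i].
// Lanes are visited from lowest to highest (assumed ordering).
template <typename T>
void scatter_scalarized(const std::array<T*, kLanes>& ptrs,
                        const std::array<T, kLanes>& value,
                        const std::array<bool, kLanes>& mask) {
    for (std::size_t i = 0; i < kLanes; ++i)
        if (mask[i]) *ptrs[i] = value[i];
}
```

Masked-off lanes are never dereferenced, so their pointers may be arbitrary, which matches the "if (Mask[i])" guard in the semantics quoted above.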
p.s. Where is the documentation for the existing mask load intrinsics?
I can't find it with a quick search through the LangRef.
Philip