[LLVMdev] Indexed Load and Store Intrinsics - proposal
    dag at cray.com 
       
    Thu Dec 18 11:56:05 PST 2014
    
    
  
"Demikhovsky, Elena" <elena.demikhovsky at intel.com> writes:
> Semantics:
> For i=0,1,…,N-1: if (Mask[i]) {*(BaseAddr + VectorOfIndices[i]*Scale)
> = VectorValue[i];}
> VectorValue: any float or integer vector type.
> BaseAddr: a pointer; may be zero if full address is placed in the
> index.
> VectorOfIndices: a vector of i32 or i64 signed or unsigned integer
> values.
What about the case of a gather/scatter where the BaseAddr is zero and
the indices are pointers?  Must we do a ptrtoint?  llvm.org is down at
the moment but I don't think we currently have a vector ptrtoint.
> Scale: a compile time constant 1, 2, 4 or 8.
This seems a bit too Intel-focused.  Why not allow arbitrary scales?  Or
alternatively, eliminate the Scale and do a vector multiply on
VectorOfIndices.  It should be simple enough to write matching TableGen
patterns.  We do it now for the x86 memop stuff.
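
A minimal sketch of the second alternative, assuming byte addressing and a scalar lane-by-lane model: a load that takes a Scale operand and a load whose indices were multiplied beforehand read the same addresses, including for scales outside {1, 2, 4, 8}. The function names (indexed_load_scaled, indexed_load_unscaled, scale_forms_agree) are hypothetical and are not part of LLVM or of the proposal.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar reference: lane i loads a float from
// base + index[i] * scale, with the offset measured in bytes.
template <std::size_t N>
std::array<float, N> indexed_load_scaled(const unsigned char *base,
                                         const std::array<std::int64_t, N> &index,
                                         std::int64_t scale) {
  assert(base != nullptr);  // the sketch needs a real base; C++ forbids null arithmetic
  std::array<float, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    std::memcpy(&result[i], base + index[i] * scale, sizeof(float));
  return result;
}

// Same load with no Scale operand: the caller has already multiplied
// the indices, so each entry is a byte offset.
template <std::size_t N>
std::array<float, N> indexed_load_unscaled(const unsigned char *base,
                                           const std::array<std::int64_t, N> &byte_offset) {
  assert(base != nullptr);
  std::array<float, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    std::memcpy(&result[i], base + byte_offset[i], sizeof(float));
  return result;
}

// Compares both forms for a 12-byte stride, a scale that is not 1, 2, 4 or 8.
inline bool scale_forms_agree() {
  static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");
  float data[12];
  for (int i = 0; i < 12; ++i) data[i] = static_cast<float>(i);
  const auto *base = reinterpret_cast<const unsigned char *>(data);

  const std::int64_t scale = 12;
  const std::array<std::int64_t, 4> index = {3, 0, 2, 1};

  // The explicit "vector multiply" on the indices.
  std::array<std::int64_t, 4> byte_offset{};
  for (std::size_t i = 0; i < 4; ++i) byte_offset[i] = index[i] * scale;

  return indexed_load_scaled<4>(base, index, scale) ==
         indexed_load_unscaled<4>(base, byte_offset);
}
```

Under this model the Scale operand is only a folded multiply, which is why a target with a restricted set of hardware scales could pattern-match the multiply when it fits and otherwise leave it as a separate vector operation.
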
> VectorValue, VectorOfIndices and Mask must have the same vector width.
From your example, you mean they must have the same number of vector
elements, not the same bit width, right?  I'm used to "width" meaning a
specific bit length and "vector length" meaning "number of elements."
With that terminology, I think you mean they must have the same vector
length.
> An indexed store instruction with complete or partial overlap in
> memory (i.e., two indices with same or close values) will provide the
> result equivalent to serial scalar stores from least to most
> significant vector elements.
Yep, they must be ordered.  Do we want to provide unordered scatters as
well?  Some (non-LLVM) targets have them.  We don't need to add them
right now but it's worth thinking about.
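
The ordering rule quoted above can be sketched as a scalar loop, assuming byte addressing and float elements. Lanes are visited from least to most significant, so when two enabled lanes hit the same address the higher lane's value is the one left in memory. The names indexed_store_ordered and overlap_example are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar reference for an ordered, masked indexed store:
// for i = 0..N-1, if mask[i] is set, write value[i] to base + index[i] * scale.
template <std::size_t N>
void indexed_store_ordered(unsigned char *base,
                           const std::array<float, N> &value,
                           const std::array<std::int64_t, N> &index,
                           std::int64_t scale,
                           const std::array<bool, N> &mask) {
  assert(base != nullptr);  // the sketch needs a real base; C++ forbids null arithmetic
  for (std::size_t i = 0; i < N; ++i) {
    if (mask[i])
      std::memcpy(base + index[i] * scale, &value[i], sizeof(float));
  }
}

// Lanes 0, 1 and 3 all target element 2; lane 3 is masked off,
// so lane 1 is the last enabled writer to that element.
inline std::array<float, 4> overlap_example() {
  std::array<float, 4> memory = {0.0f, 0.0f, 0.0f, 0.0f};
  const std::array<float, 4> value = {1.0f, 2.0f, 3.0f, 4.0f};
  const std::array<std::int64_t, 4> index = {2, 2, 0, 2};
  const std::array<bool, 4> mask = {true, true, true, false};
  indexed_store_ordered<4>(reinterpret_cast<unsigned char *>(memory.data()),
                           value, index,
                           static_cast<std::int64_t>(sizeof(float)), mask);
  return memory;
}
```

An unordered scatter would drop the guarantee that the loop order is observable, so the final contents of an overlapped element could come from any enabled lane that targets it.
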
> The new intrinsics are common for all targets, like recently
> introduced masked load and store.
> Examples:
> <16 x float> @llvm.sindex.load.v16f32.v16i32 (i8 *%ptr, <16 x i32>
> %index, i32 %scale)
> <16 x float> @llvm.masked.sindex.load.v16f32.v16i32 (i8 *%ptr, <16 x
> i32> %index, <16 x float> %passthru, <16 x i1> %mask)
> void @llvm.sindex.store.v16f32.v16i64(i8* %ptr, <16 x float> %value,
> <16 x i64> %index, i32 %scale, <16 x i1> %mask)
> Comments?
I think it's definitely a good idea to introduce them, but let's make
them a little more target-neutral if we can.
                           -David
    
    