[LLVMdev] Indexed Load and Store Intrinsics - proposal

Thu Mar 12 20:55:30 PDT 2015

Hi Elena,

I think such intrinsics are very useful.
Do you have any plans to upstream them?

Thanks,
-Hao

2014-12-18 22:40 GMT+08:00 Demikhovsky, Elena <elena.demikhovsky at intel.com>:
> Hi,
>
> Recent Intel architectures provide vector gather (AVX2, AVX-512) and
> scatter (AVX-512) instructions.
> Gather/scatter instructions allow read/write access to multiple memory
> addresses. The addresses are specified using a base address and a vector of
> indices.
> We’d like Vectorizers to tap this functionality, and propose to do so by
> introducing new intrinsics:
>
> VectorValue = @llvm.sindex.load (BaseAddr, VectorOfIndices, Scale)
> VectorValue = @llvm.uindex.load (BaseAddr, VectorOfIndices, Scale)
> VectorValue = @llvm.sindex.masked.load (BaseAddr, VectorOfIndices, Scale,
> PassThruVal, Mask)
> VectorValue = @llvm.uindex.masked.load (BaseAddr, VectorOfIndices, Scale,
> PassThruVal, Mask)
>
> Semantics:
> For i=0,1,…,N-1: if (Mask[i]) {VectorValue[i] = *(BaseAddr +
> VectorOfIndices[i]*Scale);} else {VectorValue[i] = PassThruVal[i];}
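
As a reading aid, the masked load semantics quoted above can be sketched as a scalar C loop. This is only a sketch under assumptions, not the proposed lowering: the function name `sindex_masked_load_f32` is hypothetical, the element type is fixed to float, the mask is modeled as one byte per lane, and the address is taken to be a byte offset (BaseAddr + index * Scale), which is how I read the i8* examples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a masked signed-index load, n float lanes.
 * Assumption: the effective address is a byte address. */
static void sindex_masked_load_f32(float *result, const void *base,
                                   const int32_t *indices, int scale,
                                   const float *passthru,
                                   const uint8_t *mask, size_t n)
{
    /* Scale is restricted to 1, 2, 4 or 8 in the proposal. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    for (size_t i = 0; i < n; ++i) {
        if (mask[i]) {
            /* BaseAddr + VectorOfIndices[i] * Scale */
            const char *addr =
                (const char *)base + (ptrdiff_t)indices[i] * scale;
            /* memcpy avoids assuming the address is float-aligned. */
            memcpy(&result[i], addr, sizeof(float));
        } else {
            /* Masked-off lanes take the pass-through value;
             * no memory is touched for them. */
            result[i] = passthru[i];
        }
    }
}
```

The unmasked variants would then behave like this loop with every mask lane set.
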
>
> void @llvm.sindex.store (BaseAddr, VectorValue, VectorOfIndices, Scale)
> void @llvm.uindex.store (BaseAddr, VectorValue, VectorOfIndices, Scale)
> void @llvm.sindex.masked.store (BaseAddr, VectorValue, VectorOfIndices,
> Scale, Mask)
> void @llvm.uindex.masked.store (BaseAddr, VectorValue, VectorOfIndices,
> Scale, Mask)
>
> Semantics:
> For i=0,1,…,N-1: if (Mask[i]) {*(BaseAddr + VectorOfIndices[i]*Scale) =
> VectorValue[i];}
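
The store semantics can be sketched the same way. Again this is a sketch under assumptions: `sindex_masked_store_f32` is a hypothetical name, the mask is one byte per lane, and the address is treated as a byte offset from the base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a masked signed-index store, n float lanes.
 * Assumption: the effective address is a byte address. */
static void sindex_masked_store_f32(void *base, const float *value,
                                    const int32_t *indices, int scale,
                                    const uint8_t *mask, size_t n)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    /* Lanes are visited from least to most significant element. */
    for (size_t i = 0; i < n; ++i) {
        if (mask[i]) {
            /* *(BaseAddr + VectorOfIndices[i] * Scale) = VectorValue[i] */
            char *addr = (char *)base + (ptrdiff_t)indices[i] * scale;
            memcpy(addr, &value[i], sizeof(float));
        }
        /* Masked-off lanes leave memory unchanged. */
    }
}
```
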
>
> VectorValue: any float or integer vector type.
> BaseAddr: a pointer; may be zero if full address is placed in the index.
> VectorOfIndices: a vector of i32 or i64 signed or unsigned integer values.
> Scale: a compile time constant 1, 2, 4 or 8.
> VectorValue, VectorOfIndices and Mask must have the same vector width.
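
The proposal does not spell out how sindex and uindex differ. My assumption is that they differ only in how a 32-bit index is widened before the address computation: sign extension for sindex, zero extension for uindex. A minimal sketch of that reading, with hypothetical helper names and a 64-bit address modeled as `uint64_t`:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed reading: sindex sign-extends the i32 index to 64 bits. */
static uint64_t sindex_addr(uint64_t base, int32_t index, int scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* Unsigned arithmetic so a negative offset wraps as address math does. */
    return base + (uint64_t)(int64_t)index * (uint64_t)scale;
}

/* Assumed reading: uindex zero-extends the i32 index to 64 bits. */
static uint64_t uindex_addr(uint64_t base, uint32_t index, int scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + (uint64_t)index * (uint64_t)scale;
}
```

Under this reading the same 32-bit pattern 0xFFFFFFFF means "4 bytes before the base" for sindex with Scale 4, but a large positive offset for uindex.
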
>
> An indexed store with complete or partial overlap in memory (i.e., two
> indices with the same or close values) produces a result equivalent to
> serial scalar stores performed in order from the least to the most
> significant vector element.
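
To make the overlap rule concrete, here is a small sketch of an unmasked store executed as serial scalar stores in lane order. The name `sindex_store_u32` is hypothetical, and byte addressing is my assumption. With Scale 1 and 4-byte elements, indices 0 and 2 overlap partially; with equal indices the overlap is complete and the higher lane wins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of an unmasked indexed store of 32-bit lanes.
 * Lane 0 is written first, lane n-1 last, so later lanes overwrite
 * any bytes they share with earlier lanes. */
static void sindex_store_u32(void *base, const uint32_t *value,
                             const int32_t *indices, int scale, size_t n)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    for (size_t i = 0; i < n; ++i) {
        char *addr = (char *)base + (ptrdiff_t)indices[i] * scale;
        memcpy(addr, &value[i], sizeof(uint32_t));
    }
}
```

For example, storing 0xAAAAAAAA at byte offset 0 and 0xBBBBBBBB at byte offset 2 into a zeroed 8-byte buffer leaves the bytes AA AA BB BB BB BB 00 00. Each value is made of identical bytes, so this holds regardless of endianness.
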
>
> The new intrinsics are common to all targets, like the recently introduced
> masked load and store intrinsics.
>
> Examples:
>
> <16 x float> @llvm.sindex.load.v16f32.v16i32 (i8* %ptr, <16 x i32> %index,
> i32 %scale)
> <16 x float> @llvm.sindex.masked.load.v16f32.v16i32 (i8* %ptr, <16 x i32>
> %index, i32 %scale, <16 x float> %passthru, <16 x i1> %mask)
> void @llvm.sindex.masked.store.v16f32.v16i64 (i8* %ptr, <16 x float> %value,
> <16 x i64> %index, i32 %scale, <16 x i1> %mask)
>
> Comments?
>
> Thank you.
>
>
> Elena
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>