Nicholas Chapman
Wed Jan 22 05:01:52 PST 2014

Hi Mark,

On 22/01/2014 1:06 a.m., Mark Lacey wrote:
> Hi Nick,
> On Jan 21, 2014, at 1:54 PM, Nicholas Chapman <admin at indigorenderer.com> wrote:
>> Hi Evan, all,
>> The most obvious thing to me would be to extend the load instruction to have an additional form that takes a vector of pointers instead of a single pointer.
>> This form would return a vector of values instead of a single value.
>> If a gather instruction is not available on the target, then the load could be lowered to a series of scalar loads and insert elements.
> Are you only interested in gather, or scatter as well?
Currently I'm only interested in gather.
> What are the limitations on the vector of pointers?
> - Do they have to have power-of-two number of elements?
I would prefer there was no power-of-two restriction, since that 
restriction has been eased for the LLVM vector type in general.
> - Do they have to point to scalars? If not, does it support any struct/vector type, or only vectors and homogeneous structs with all the elements of the same type?
Currently I would be happy to just have scalars handled. Presumably if 
scalars are handled initially then requirements can be relaxed in the 
future.
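
To illustrate the scalar-only case and the fallback lowering mentioned above (a series of scalar loads and element inserts), here is a minimal sketch in C. The name gather_f32 and the use of plain arrays to stand in for vectors are assumptions made for illustration; this is not LLVM API and not a concrete proposal for the IR.

```c
#include <stddef.h>

/* Hypothetical helper (not an LLVM API): emulate a gathering load on a
 * target without a native gather instruction. Each lane does one scalar
 * load through its own pointer and the result is written into the
 * corresponding lane of the output "vector". */
static void gather_f32(float *out, const float *const *ptrs, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = *ptrs[i]; /* scalar load, then insert into lane i */
    }
}
```

Nothing here depends on n being a power of two, which is the behaviour I would like the vector-of-pointers form of the load to have as well.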


> I ask the latter because if I recall correctly some GPUs have support for doing 16x4 gathers that result in 4 16-wide vectors, where the memory is in array-of-structures form, but the resulting registers are in structure-of-arrays form, i.e. the result is four 16-wide vectors with the elements of the source memory placed such that they were loaded with a stride. In other words if the input pointers pointed to a struct like:
>    struct S { float a, b, c, d; }
> the result of gathering from a vector of pointers-to-S would be four vectors, one with all the a’s, one with all the b’s, etc.
> Mark
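
To check my understanding of the 16x4 case described above, here is a rough scalar sketch in C of what such a gather would compute. The names S_soa, LANES and gather_S are hypothetical and only for illustration; real hardware would do this in one operation, and nothing here reflects an actual GPU or LLVM interface.

```c
#include <stddef.h>

struct S { float a, b, c, d; };

/* Assumed lane count, matching the 16-wide example above. */
enum { LANES = 16 };

/* Hypothetical structure-of-arrays result: one 16-wide "vector" per field. */
struct S_soa {
    float a[LANES];
    float b[LANES];
    float c[LANES];
    float d[LANES];
};

/* Gather through LANES pointers-to-S. Memory is in array-of-structures
 * form; the result holds all the a's together, all the b's together,
 * and so on. */
static void gather_S(struct S_soa *out, const struct S *const *ptrs)
{
    for (size_t i = 0; i < LANES; ++i) {
        const struct S *p = ptrs[i];
        out->a[i] = p->a;
        out->b[i] = p->b;
        out->c[i] = p->c;
        out->d[i] = p->d;
    }
}
```

The scalar-only gather I am asking for corresponds to doing just one of these four field loads per lane.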
>> Thanks,
>>     Nick
>> On 20/01/2014 5:59 p.m., Evan Cheng wrote:
>>> On Jan 14, 2014, at 11:11 AM, Nicholas Chapman <admin at indigorenderer.com> wrote:
>>>> Hi All,
>>>> I was in the process of implementing a gathering load for my language.  I got the getelementptr vector form working.  However there doesn't seem to be a way to load a vector of values using the vector of pointers from getelementptr.  Am I correct that this is not possible with LLVM IR currently (apart from with the avx2 gather intrinsic)?
>>> I believe you are correct. This is not currently possible.
>>>> And if so, are there plans to allow loading of a vector of values?
>>>> I think extending the load instruction to take a vector of pointers (as would be produced by getelementptr) would work well.
>>> This is of interest to a lot of people and I think it's a reasonable time to start the discussion. Do you have any concrete proposal in mind?
>>> Evan
