[cfe-dev] [Mesa-dev] 3 element vectors in opencl 1.1+
Francisco Jerez
currojerez at riseup.net
Thu Apr 24 03:05:00 PDT 2014
Francisco Jerez <currojerez at riseup.net> writes:
> Jan Vesely <jan.vesely at rutgers.edu> writes:
>
>> On Wed, 2014-04-23 at 19:49 +0200, Francisco Jerez wrote:
>>> Jan Vesely <jan.vesely at rutgers.edu> writes:
>>>
>>> > On Tue, 2014-04-22 at 17:50 -0700, Matt Arsenault wrote:
>>> > <snip>
>>> >
>>> >> >> I think this is what v96:128 is for
>>> >> > according to [0], it specifies only alignment, not size. I could not
>>> >> > find an __attribute__ that would change size either.
>>> >> >
>>> >> > It should be possible to have ADMGPUDataLayout: public DataLayout class
>>> >> > that would intercept the call and fix the reported value, but I think it
>>> >> > would only move the hack to different place.
>>> >> >
>>> >> > I have added pocl-devel list as suggested.
>>> >> >
>>> >> > regards,
>>> >> > Jan
>>> >> >
>>> >> > [0]http://llvm.org/docs/LangRef.html#data-layout
>>> >> >
>>> >>
>>> >> Only the size in memory matters, which is what the required alignment
>>> >> specifies. DataLayout::getTypeAllocSize accounts for the alignment, but
>>> >> getTypeStoreSize does not. I actually thought this was half of what
>>> >> getTypeStoreSize was for, but it turns out it isn't.
>>> >
>>> > hm, I always thought that alignment only puts restrictions on starting
>>> > address and using padding was just a tool to do the job.
>>> >
>>> > anyway, thanks for the hint, using getTypeAllocSize works nicely.
>>> > since we are allocating space in the argument vector I think
>>> > getAllocSize is the right function to use.
>>> >
>>> > I'll post a patch.
>>> >
>>>
>>> I don't think that using getTypeAllocSize() instead of
>>> getTypeStoreSize() to calculate clover::argument::size would be a
>>> satisfactory solution. By doing that you'd redefine the API argument
>>> size exposed to the host for *all* argument types to be the
>>> device-dependent aligned size, which is definitely not what you want.
>>
>>> AFAIK 3-element vectors are an exception because they are the only types
>>> that are defined to have a different API size from their actual usable
>>> size, so they probably deserve special handling in invocation.cpp (as
>>> you did in your first patch). As the API size is target-independent I
>>> don't think that the fix belongs in Clang or LLVM, Clover is likely at
>>> fault.
>>
>> The way I understood the ch 6.1.5 is that both API and OpenCL C 3
>> element vectors are required to be 4*sizeof(component). So a
>> sizeof(float3) == sizeof(cl_float3) == 16, and should be both host and
>> target independent.
>
> Well, sizeof() is supposed to take into account the alignment, so this
> should be the case already.
>
>> That's why clang (or more precisely libclc) looked like a correct
>> place.
>>
>
> Right. I guess the other possibility would be to redefine cl_float3 as
> cl_float4 in libclc.
Sorry, I meant to redefine "gentype3" as "gentype4" for every vector
element type "gentype".
> You mentioned that it had undesired consequences. Which exactly?
>
>> I understand that target device can have stricter alignment rules, but I
>> don't see how it can have different type sizes (my reading of the specs
>> is that these are binding for the target as well).
>>
>
> In theory the sizes are binding for most built-in types, but R600
> doesn't support certain integer types so clover has code to take into
> account the differing size, alignment and endianness between host and
> device. I guess that in most cases it would be possible to use the
> "official" ABI for passing kernel arguments to the device (and that's a
> requirement for your solution using DataLayout::getTypeAllocSize() to
> work reliably, otherwise you'll be taking into account device-specific
> padding), but you'll have to fix the R600 back-end so it's able to deal
> with (i.e. lower) all built-in types specified by OpenCL. I think that
> this would be useful on its own, Tom should know better than I how
> difficult it would be.
>
>> I can resend the original patch with debug output replaced by a comment.
>>
> A slightly more general solution than what you did could be to align the
> store size of scalar and vector arguments to the next power of two to
> calculate the API size (in particular, that would make 3- and 4-element
> vectors have the same size). From 6.1.5: "A built-in data type that is
> not a power of two bytes in size must be aligned to the next larger
> power of two."
>
>> regards,
>> Jan
>>
>
> Thanks.
>
>>>
>>> Thanks.
>>>
>>> > regards,
>>> > Jan
>>> >
>>> >
>>> > --
>>> > Jan Vesely <jan.vesely at rutgers.edu>
>>> > _______________________________________________
>>> > mesa-dev mailing list
>>> > mesa-dev at lists.freedesktop.org
>>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>> --
>> Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 229 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140424/41ad18f2/attachment.sig>
More information about the cfe-dev
mailing list