[cfe-dev] [RFC] Delayed target-specific diagnostic when compiling for the devices.

Thu Jan 17 09:19:34 PST 2019

Just one question: how are you going to emit the type if it is not
supported by the device?

If you are going to emit it as just an array of bytes, I don't think this
is the right solution. User classes/datatypes containing unsupported data
types are simply not mappable types and thus cannot be used on the
device at all, in any form, even with unified memory.
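
For concreteness, here is a minimal sketch of what "emit it as just the array of bytes" could look like. Everything here is an assumption for illustration: `HostX`, `DeviceX`, `Unsupported`, `view_from_host` and `read_a` are hypothetical names, and the `long double` fallback exists only so the sketch builds where `__float128` is unavailable. It is not what Clang does.

```cpp
#include <cstddef>
#include <cstring>

#if defined(__SIZEOF_FLOAT128__)
using Unsupported = __float128;  // the type discussed in this thread
#else
using Unsupported = long double; // stand-in (assumption) so the sketch builds
#endif

// Layout as the host compiler sees it.
struct HostX {
  int a;
  Unsupported b;
};

// Hypothetical device-side view: the unsupported member becomes opaque,
// equally sized and equally aligned storage.
struct DeviceX {
  int a;
  alignas(alignof(Unsupported)) unsigned char b[sizeof(Unsupported)];
};

// The two views must agree on size, alignment, and member offsets.
static_assert(sizeof(DeviceX) == sizeof(HostX), "size mismatch");
static_assert(alignof(DeviceX) == alignof(HostX), "alignment mismatch");
static_assert(offsetof(DeviceX, a) == offsetof(HostX, a), "offset of a");
static_assert(offsetof(DeviceX, b) == offsetof(HostX, b), "offset of b");

// Copy the raw bytes of a host object into the device-side view.
DeviceX view_from_host(const HostX &h) {
  DeviceX d;
  std::memcpy(&d, &h, sizeof d);
  return d;
}

// Code that only touches supported members never needs to interpret b.
int read_a(const DeviceX *x) { return x->a; }
```

The static assertions only show that the layouts can be made to agree on one host; whether such a type should count as mappable is exactly the question above.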

-------------
Best regards,
Alexey Bataev

17.01.2019 12:15, Doerfert, Johannes Rudolf wrote:
>
> > The compiler does not know anything about the layout on the host
> when it compiles for the device.
>
>
> Just a side note: I'll try to write up an RFC next week to propose a
> conceptual change in our compilation process that makes this argument
> go away.
>
>
>
> ------------------------------------------------------------------------
> *From:* Alexey Bataev <a.bataev at outlook.com>
> *Sent:* Thursday, January 17, 2019 11:11:55 AM
> *To:* Finkel, Hal J.
> *Cc:* John McCall; Reid Kleckner; Artem Belevich; Justin Lebar;
> Richard Smith; cfe-dev; John McCall; Doerfert, Johannes Rudolf
> *Subject:* Re: [cfe-dev] [RFC] Delayed target-specific diagnostic when
> compiling for the devices.
>  
>
> The compiler does not know anything about the layout on the host when
> it compiles for the device. We cannot do anything with the types that
> are not supported by the target device and we cannot use the layout
> from the host. And it is the user's responsibility to write and use code
> that is compatible with the target devices.
>
> He/she does not need to use macros/void* types; there are templates.
>
> You cannot use classes that use types incompatible with the device.
> There is a problem with the data layout on the device and we just
> don't know how to represent such classes on the device.
>
> -------------
> Best regards,
> Alexey Bataev
> 17.01.2019 11:47, Finkel, Hal J. wrote:
>> On 1/17/19 9:52 AM, Alexey Bataev wrote:
>>> Because the type is not compatible with the target device.
>> But it's not that simple. The situation is that the programming
>> environment supports the type, but *operations* on that type are not
>> supported in certain contexts (e.g., when compiled for a certain
>> device). As you point out, we already need to move in this explicit
>> direction by, for example, allowing typedefs for types that are not
>> supported in all contexts, function declarations, and so on. In the end,
>> we should allow our users to design their classes and abstractions using
>> good software-engineering practice without worrying about access-context
>> partitioning.
>>
>> Also, the other problem here is that the function I used as an example
>> is a very common C++ idiom. There are a lot of classes with functions
>> that return a reference to themselves. Classes can have lots of data
>> members, and those members might not be accessed on the device (even if
>> the class itself might be accessed on the device). We're moving to a
>> world in which unified memory is common - the promise of this technology
>> is that configuration data and complex data structures, which might be
>> occasionally accessed (but for which explicitly managing data movement
>> is not performance relevant) are handled transparently. If use of these
>> data structures is transitively poisoned by use of any type not
>> supported on the device (including by pointers to types that use those
>> types), then we'll force unhelpful and technically-unnecessary
>> refactoring, thus reducing the value of the feature.
>>
>> In the current implementation we pre-process the source twice, and so we
>> can:
>>
>>  1. Use ifdefs to change the data members when compiling for different
>> targets. This is hard to get right because, in order to keep the data
>> layout otherwise the same, the user needs to understand the layout rules
>> in order to put something in the structure that is supported on the
>> target and keeps the layout the same (this is very error prone). Also,
>> if we move to a single-preprocessing-stage model, this no longer works.
>>
>>  2. Replace all pointers to relevant types with void*, or similar, and
>> use a lot of casts. This is also bad.
>>
>> We shouldn't be forcing users to play these games. The compiler knows
>> the layout on the host and it can use it on the target. The fact that
>> some operations on some types might not be supported on the target is
>> not relevant to handling pointers/references to containing types.
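
To illustrate the second workaround described above, here is a minimal sketch of the `void*` approach. The names `foo_erased` and `read_a_erased` are hypothetical, and `long double` is only a stand-in (an assumption) for a type the device does not support.

```cpp
// Stand-in for a class with a member the device cannot handle.
struct X {
  int a;
  long double b; // stand-in (assumption) for a type such as __float128
};

// The shared interface sees only void*, never X, so nothing in its
// signature mentions the unsupported member.
void *foo_erased(void *x) { return x; }

// Every use that needs the real type must cast back by hand, and the
// compiler cannot check that the pointer really refers to an X.
int read_a_erased(void *x) { return static_cast<X *>(x)->a; }
```

The type information that `X *foo(X *x)` would carry is gone, which is why the approach scales badly across a large code base.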
>>
>> Thanks again,
>>
>> Hal
>>
>>
>>> -------------
>>> Best regards,
>>> Alexey Bataev
>>>
>>> 17.01.2019 10:50, Finkel, Hal J. wrote:
>>>> On 1/17/19 9:27 AM, Alexey Bataev wrote:
>>>>> It should be compilable for the device only if function foo is not used
>>>>> on the device.
>>>> Says whom? I disagree. This function should work on the device. Why
>>>> should it not?
>>>>
>>>>  -Hal
>>>>
>>>>
>>>>> -------------
>>>>> Best regards,
>>>>> Alexey Bataev
>>>>>
>>>>> 17.01.2019 10:24, Finkel, Hal J. wrote:
>>>>>> On 1/17/19 4:05 AM, Alexey Bataev wrote:
>>>>>>> Best regards,
>>>>>>> Alexey Bataev
>>>>>>>
>>>>>>>> On 17 Jan 2019, at 0:46, Finkel, Hal J. <hfinkel at anl.gov> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 1/16/19 8:45 AM, Alexey Bataev wrote:
>>>>>>>>>
>>>>>>>>> Yes, I thought about this. But we need to delay the diagnostic until
>>>>>>>>> the Codegen phase. What I need is the way to associate the diagnostic
>>>>>>>>> with the function so that this diagnostic is available in CodeGen.
>>>>>>>>>
>>>>>>>>> Also, we need to postpone the diagnostics not only for functions
>>>>>>>>> but also, for example, for some types. For example, the __float128
>>>>>>>>> type is not supported by CUDA. We get error messages when we run into
>>>>>>>>> something like `typedef __float128 SomeOtherType` (say, in some system
>>>>>>>>> header files) while compiling for the
>>>>>>>>> device. Even though this type is not actually used in the device code,
>>>>>>>>> the diagnostic is still emitted; we need to delay it too and emit it
>>>>>>>>> only if the type is used in the device code.
>>>>>>>>>
>>>>>>>> This should be fixed for CUDA too, right?
>>>>>>>>
>>>>>>>> Also, we still get to have pointers to aggregates containing those types
>>>>>>>> on the device, right?
>>>>>>>>
>>>>>>> No, why? This is not allowed and should be diagnosed too. If somebody somehow tries to use a disallowed type for device variables/functions, it should be diagnosed.
>>>>>> Because this should be allowed. If I have:
>>>>>>
>>>>>> struct X {
>>>>>>   int a;
>>>>>>   __float128 b;
>>>>>> };
>>>>>>
>>>>>> and we have some function which does this:
>>>>>>
>>>>>> X *foo(X *x) {
>>>>>>   return x;
>>>>>> }
>>>>>>
>>>>>> We'll certainly want this function to compile for all targets, even if
>>>>>> there's no __float128 support on some accelerator. The whole model only
>>>>>> really makes sense if the accelerator shares the aggregate-layout rules
>>>>>> of the host, and this is a needless hassle for users if this causes an
>>>>>> error (especially in a unified-memory environment where configuration
>>>>>> data structures, etc. are shared between devices).
>>>>>>
>>>>>> Thanks again,
>>>>>>
>>>>>> Hal
>>>>>>
>>>>>>
>>>>>>>> Thanks again,
>>>>>>>>
>>>>>>>> Hal
>>>>>>>>
>>>>>>>>
>>>>>>>>> -------------
>>>>>>>>> Best regards,
>>>>>>>>> Alexey Bataev
>>>>>>>>> 15.01.2019 17:33, John McCall wrote:
>>>>>>>>>>> On 15 Jan 2019, at 17:20, Alexey Bataev wrote:
>>>>>>>>>>> This is not only for asm, we need to delay all target-specific
>>>>>>>>>>> diagnostics.
>>>>>>>>>>> I'm not saying that we need to move the host diagnostic, only the
>>>>>>>>>>> diagnostic for the device compilation.
>>>>>>>>>>> As for CUDA, it is a little bit different. In CUDA the programmer
>>>>>>>>>>> must explicitly mark the device functions, while in OpenMP it must
>>>>>>>>>>> be done implicitly. Thus, we cannot reuse the solution used for CUDA.
>>>>>>>>>> All it means is that you can't just use the solution used for CUDA
>>>>>>>>>> "off the shelf".  The basic idea of associating diagnostics with the
>>>>>>>>>> current function and then emitting those diagnostics later when you
>>>>>>>>>> realize that you have to emit that function is still completely
>>>>>>>>>> applicable.
>>>>>>>>>>
>>>>>>>>>> John.
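
The basic idea described above (associate diagnostics with the current function, emit them once that function is known to be emitted) can be sketched roughly as follows. `DeferredDiagnostics` and its members are hypothetical names for illustration, not Clang's actual interface, and functions are identified by plain strings only to keep the sketch short.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper, not Clang's API: holds target-specific diagnostics
// until we learn whether the enclosing function is emitted for the device.
class DeferredDiagnostics {
public:
  // Record a diagnostic against the function in which it was found.
  void defer(const std::string &function, std::string message) {
    pending_[function].push_back(std::move(message));
  }

  // Called when the function is found to need emission for the device.
  // Returns its diagnostics and forgets them, so each is reported once.
  std::vector<std::string> onEmit(const std::string &function) {
    std::vector<std::string> out;
    auto it = pending_.find(function);
    if (it != pending_.end()) {
      out = std::move(it->second);
      pending_.erase(it);
    }
    return out;
  }

  // Number of functions that still have unreported diagnostics.
  std::size_t pendingCount() const { return pending_.size(); }

private:
  std::unordered_map<std::string, std::vector<std::string>> pending_;
};
```

Diagnostics recorded for functions that are never emitted for the device simply stay in the map and are never shown, which is the behaviour requested for host-only code.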
>>>>>>>> -- 
>>>>>>>> Hal Finkel
>>>>>>>> Lead, Compiler Technology and Programming Languages
>>>>>>>> Leadership Computing Facility
>>>>>>>> Argonne National Laboratory
>>>>>>>>