[cfe-dev] [RFC] Delayed target-specific diagnostic when compiling for the devices.

Thu Jan 17 07:50:40 PST 2019

On 1/17/19 9:27 AM, Alexey Bataev wrote:
> It should be compilable for the device iff function foo is not used
> on the device.

Says whom? I disagree. This function should work on the device. Why
should it not?

 -Hal

>
> -------------
> Best regards,
> Alexey Bataev
>
> 17.01.2019 10:24, Finkel, Hal J. wrote:
>> On 1/17/19 4:05 AM, Alexey Bataev wrote:
>>> Best regards,
>>> Alexey Bataev
>>>
>>>> On 17 Jan 2019, at 0:46, Finkel, Hal J. <hfinkel at anl.gov> wrote:
>>>>
>>>>
>>>>> On 1/16/19 8:45 AM, Alexey Bataev wrote:
>>>>>
>>>>> Yes, I thought about this. But we need to delay the diagnostic until
>>>>> the Codegen phase. What I need is the way to associate the diagnostic
>>>>> with the function so that this diagnostic is available in CodeGen.
>>>>>
>>>>> Also, we need to postpone the diagnostics not only for functions
>>>>> but also for some types. For example, the __float128 type is not
>>>>> supported by CUDA. When we run into something like
>>>>> `typedef __float128 SomeOtherType` (say, in some system header
>>>>> files), we get an error diagnostic when we compile for the device.
>>>>> Even though this type is not actually used in the device code, the
>>>>> diagnostic is still emitted. We need to delay it too and emit it
>>>>> only if the type is used in the device code.
>>>>>
>>>> This should be fixed for CUDA too, right?
>>>>
>>>> Also, we still get to have pointers to aggregates containing those types
>>>> on the device, right?
>>>>
>>> No, why? This is not allowed and should be diagnosed too. If somebody
>>> somehow tries to use a disallowed type for device variables or
>>> functions, it should be diagnosed.
>> Because this should be allowed. If I have:
>>
>> struct X {
>>   int a;
>>   __float128 b;
>> };
>>
>> and we have some function which does this:
>>
>> X *foo(X *x) {
>>   return x;
>> }
>>
>> We'll certainly want this function to compile for all targets, even if
>> there's no __float128 support on some accelerator. The whole model only
>> really makes sense if the accelerator shares the aggregate-layout rules
>> of the host, and this is a needless hassle for users if this causes an
>> error (especially in a unified-memory environment where configuration
>> data structures, etc. are shared between devices).
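The pointer pass-through case can be sketched in a few lines. This is only an illustration under stated assumptions: `OpaqueFloat128` is a hypothetical stand-in for `__float128` on a target that has no arithmetic support for it, and the 16-byte size and alignment are assumed to match the host (as on x86-64). The point is that `foo` needs only the layout of `X`, never the value of `b`.

```cpp
#include <cstddef>

// Hypothetical stand-in for __float128 on a target without support for it.
// Assumption: the host type is 16 bytes wide with 16-byte alignment, so the
// accelerator only has to reserve the same storage to share the layout.
struct alignas(16) OpaqueFloat128 {
  unsigned char bytes[16];
};

struct X {
  int a;
  OpaqueFloat128 b; // never read or written as a number in this sketch
};

// Only the pointer is handled; no 128-bit floating-point operation occurs.
X *foo(X *x) {
  return x;
}

// The layout is fixed by size and alignment alone.
static_assert(offsetof(X, b) == 16, "b starts at the next 16-byte boundary");
static_assert(sizeof(X) == 32, "X occupies two 16-byte slots");
```

Whether a compiler should model an unsupported type this way is a design question the thread leaves open; the sketch only shows that the function body itself does not depend on the type being usable.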
>>
>> Thanks again,
>>
>> Hal
>>
>>
>>>> Thanks again,
>>>>
>>>> Hal
>>>>
>>>>
>>>>> -------------
>>>>> Best regards,
>>>>> Alexey Bataev
>>>>> 15.01.2019 17:33, John McCall wrote:
>>>>>>> On 15 Jan 2019, at 17:20, Alexey Bataev wrote:
>>>>>>> This is not only for asm, we need to delay all target-specific
>>>>>>> diagnostics.
>>>>>>> I'm not saying that we need to move the host diagnostic, only the
>>>>>>> diagnostic for the device compilation.
>>>>>>> As for CUDA, it is a little bit different. In CUDA the programmer
>>>>>>> must explicitly mark the device functions, while in OpenMP it must
>>>>>>> be done implicitly. Thus, we cannot reuse the solution used for CUDA.
>>>>>> All it means is that you can't just use the solution used for CUDA
>>>>>> "off the shelf".  The basic idea of associating diagnostics with the
>>>>>> current function and then emitting those diagnostics later when you
>>>>>> realize that you have to emit that function is still completely
>>>>>> applicable.
>>>>>>
>>>>>> John.
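The basic idea described above, recording a diagnostic against the current function and reporting it only once that function is known to be emitted, can be sketched roughly as follows. This is a minimal illustration, not Clang's implementation: the class `DeferredDiagnostics`, its methods `defer` and `takeFor`, and the use of plain strings for function names and messages are all hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; these names are illustrative and are not Clang's API.
class DeferredDiagnostics {
public:
  // Called during semantic analysis: remember the diagnostic instead of
  // reporting it, keyed by the function in which it was found.
  void defer(const std::string &function, std::string message) {
    pending_[function].push_back(std::move(message));
  }

  // Called once code generation decides that `function` must be emitted for
  // the device. Returns the diagnostics recorded for it and forgets them, so
  // each one is reported at most once.
  std::vector<std::string> takeFor(const std::string &function) {
    auto it = pending_.find(function);
    if (it == pending_.end())
      return {};
    std::vector<std::string> result = std::move(it->second);
    pending_.erase(it);
    return result;
  }

private:
  std::map<std::string, std::vector<std::string>> pending_;
};
```

A function that is never emitted for the device never has `takeFor` called on it, so its recorded diagnostics are simply dropped. In the OpenMP case the difference from CUDA lies in how the set of emitted functions is discovered, not in this bookkeeping.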
>>>> -- 
>>>> Hal Finkel
>>>> Lead, Compiler Technology and Programming Languages
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>>