[cfe-dev] [RFC] Delayed target-specific diagnostic when compiling for the devices.

Finkel, Hal J. via cfe-dev cfe-dev at lists.llvm.org
Mon Jan 14 08:01:14 PST 2019


I have some questions about this... I'll write them up later today.

 -Hal

On 1/14/19 8:25 AM, Alexey Bataev wrote:
> Ping!
>
> -------------
> Best regards,
> Alexey Bataev
>
> 09.01.2019 10:02, Alexey.Bataev wrote:
>> Hi, many OpenMP users run into trouble when they try to compile
>> real-world OpenMP applications that use offloading constructs.
>>
>> Problem Description
>> ===============
>> For example, consider the following abstract code:
>> ```
>> void target_unused() {
>>   int a;
>>   __asm__("constraints"
>>           : "=a"(a)
>>           :         // None
>>   );
>> }
>>
>> void target_used() {
>>   int a;
>>   __asm__("constraints"
>>           : "=a"(a)
>>           :         // None
>>   );
>> }
>>
>> void foo() {
>>   target_unused();
>> #pragma omp target
>>   target_used();
>> }
>> ```
>> Assume we are going to compile this code on an X86_64 host to run on
>> an NVIDIA NVPTX64 device. When we compile this code for the host,
>> everything is fine. But when we compile the same code for the NVPTX64
>> target, we get the following error messages:
>> ```
>> 11:13: error: invalid output constraint '=a' in asm
>> 20:13: error: invalid output constraint '=a' in asm
>> ```
>> But we should actually see only one error message, the second one,
>> for the function `target_used()`, which is the one used in the target
>> region. The other function, `target_unused()`, is used only on the
>> host, and we should not produce an error message for it when we
>> compile the code for the device.
>>
>> The main problem with these functions is that they are not marked
>> explicitly as device functions, as is required in CUDA.
>> In OpenMP, it is not required to mark them explicitly as
>> device-only or device-and-host functions. They can be marked
>> implicitly, depending on whether they are used in target-based
>> constructs (possibly indirectly, through a chain of calls).
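
A minimal sketch of the implicit marking described above, assuming a hypothetical call graph keyed by function name. `CallGraph` and `implicitDeviceFunctions` are made up for illustration and are not Clang APIs. The idea is a plain worklist traversal starting from the functions called inside target regions:

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: caller name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function reachable from the functions called directly inside
// target regions, i.e. the functions that are implicitly device functions.
std::set<std::string> implicitDeviceFunctions(
    const CallGraph &graph,
    const std::vector<std::string> &targetRegionCallees) {
  std::set<std::string> reachable;
  std::vector<std::string> worklist(targetRegionCallees);
  while (!worklist.empty()) {
    std::string fn = worklist.back();
    worklist.pop_back();
    if (!reachable.insert(fn).second)
      continue; // already visited
    auto it = graph.find(fn);
    if (it == graph.end())
      continue; // no recorded callees
    for (const std::string &callee : it->second)
      worklist.push_back(callee);
  }
  return reachable;
}
```

For the quoted example, the only target-region callee is `target_used`, so `target_unused` never enters the returned set and its diagnostics could be suppressed for the device compilation.
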
>>
>> That’s why we need to postpone some of the target-related diagnostics
>> until the CodeGen phase.
>>
>> Possible solutions
>> ==================
>> 1. Move target-specific checks to the CodeGen.
>>
>> The best solution of all: it does not require a significant redesign
>> of the existing code base, only some copy-paste of the diagnostics
>> and the associated logic.
>>
>> 2. Add a special delayed-diagnostics class, associate it with the
>> declarations somehow, and check it for each declaration during CodeGen.
>>
>> Requires a redesign of the diagnostic subsystem and of the
>> codegen subsystem, at least.
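
To make option 2 concrete, here is a rough sketch under stated assumptions: `Decl` is a hypothetical stand-in (not Clang's `Decl`), and `DelayedDiagnostics` is an invented side table that records messages during semantic analysis and hands them back only when code is generated for the owning declaration.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a declaration; Clang's real Decl is not used.
struct Decl {
  std::string name;
};

// Hypothetical side table of diagnostics recorded during semantic analysis,
// keyed by the declaration they belong to.
class DelayedDiagnostics {
public:
  void record(const Decl *d, std::string message) {
    pending_[d].push_back(std::move(message));
  }

  // Called when code is actually generated for `d`: returns its pending
  // diagnostics and removes them so they are reported at most once.
  std::vector<std::string> takeFor(const Decl *d) {
    auto it = pending_.find(d);
    if (it == pending_.end())
      return {};
    std::vector<std::string> out = std::move(it->second);
    pending_.erase(it);
    return out;
  }

private:
  std::map<const Decl *, std::vector<std::string>> pending_;
};
```

Diagnostics recorded for a declaration that is never emitted for the device simply stay in the table and are never shown.
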
>>
>> 3. Introduce special expression, statement and declaration nodes.
>>
>> These nodes serve only as containers for the target-specific error
>> messages generated in Sema. During the codegen phase, if codegen
>> runs into one of these constructs, it just emits the diagnostic stored
>> in the node.
>>
>> Requires additional expression/statement/declaration nodes and looks
>> like a hack.
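
For comparison, option 3 might look roughly like the sketch below. It uses a miniature, hypothetical AST (`Stmt`, `PlainStmt`, `DeferredErrorStmt`, and `emitBody` are invented here and are not Clang classes), and "code generation" is reduced to collecting error strings.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature AST; these are not Clang's classes.
struct Stmt {
  virtual ~Stmt() = default;
  // "Code generation" in this sketch only appends diagnostics to `errors`.
  virtual void emit(std::vector<std::string> &errors) const = 0;
};

// An ordinary statement that lowers without problems.
struct PlainStmt : Stmt {
  void emit(std::vector<std::string> &) const override {}
};

// Container node created by semantic analysis in place of a statement that
// failed a target-specific check. It only carries the stored message.
struct DeferredErrorStmt : Stmt {
  explicit DeferredErrorStmt(std::string msg) : message(std::move(msg)) {}
  void emit(std::vector<std::string> &errors) const override {
    errors.push_back(message);
  }
  std::string message;
};

using Body = std::vector<std::unique_ptr<Stmt>>;

// Emits a function body; a stored error surfaces only if the body is emitted.
std::vector<std::string> emitBody(const Body &body) {
  std::vector<std::string> errors;
  for (const auto &stmt : body)
    stmt->emit(errors);
  return errors;
}
```

A function whose body is never passed to `emitBody` for the device, such as `target_unused()` in the quoted example, would never report its stored error.
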
>>
>> -- 
>> -------------
>> Best regards,
>> Alexey Bataev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


