[Openmp-commits] [openmp] r350333 - [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.

Alexey Bataev via Openmp-commits openmp-commits at lists.llvm.org
Thu Jan 3 10:49:36 PST 2019


Agreed. I spent about 4 weeks trying to make OpenMP code work with the
inlined runtime, and found that many of the problems were caused by this bug.

-------------
Best regards,
Alexey Bataev

03.01.2019 13:46, Finkel, Hal J. wrote:
> On 1/3/19 12:40 PM, Alexey Bataev wrote:
>> It is a bug in the LLVM optimizer, somewhere in the split critical
>> edges pass or something like this, I don't remember exactly.
>>
> Okay. We should definitely have a tracking bug for it. Doru, can you
> please take care of that? Is NVIDIA working on an upstream fix?
>
> Thanks again,
>
> Hal
>
>> -------------
>> Best regards,
>> Alexey Bataev
>> 03.01.2019 13:38, Finkel, Hal J. wrote:
>>> On 1/3/19 12:32 PM, Alexey Bataev wrote:
>>>> Hi Hal, this problem was reported to NVidia by Doru Bercea and was
>>>> confirmed, as far as I know. I don't know if he reported it in the
>>>> bugzilla. But the problem exists.
>>> So this is a bug in Clang's CUDA, in nvcc, in ptxas?
>>>
>>> Thanks again,
>>>
>>> Hal
>>>
>>>
>>>> -------------
>>>> Best regards,
>>>> Alexey Bataev
>>>>
>>>> 03.01.2019 13:30, Finkel, Hal J. wrote:
>>>>> Hi, Alexey,
>>>>>
>>>>> We shouldn't have a semantic problem with __syncthreads because of how
>>>>> inference works on the convergent attribute. Do we have an open bug
>>>>> describing the problem, and if not, can you please file one? It sounds
>>>>> like this tail cloning in critical-edge splitting needs to be updated to
>>>>> respect the convergent attribute?
>>>>>
>>>>> Thanks again,
>>>>>
>>>>> Hal
>>>>>
>>>>> On 1/3/19 11:43 AM, Alexey Bataev via Openmp-commits wrote:
>>>>>> Author: abataev
>>>>>> Date: Thu Jan  3 09:43:46 2019
>>>>>> New Revision: 350333
>>>>>>
>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=350333&view=rev
>>>>>> Log:
>>>>>> [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
>>>>>>
>>>>>> Summary:
>>>>>> One of the LLVM optimizations, split critical edges, also clones tail
>>>>>> instructions. This is a dangerous operation for calls to
>>>>>> __syncthreads(): the transformation leads to undefined behavior or
>>>>>> incorrect results. The patch fixes this problem by replacing the
>>>>>> __syncthreads() call with an inline assembler instruction, whose cost
>>>>>> is too high and which cannot be copied.
>>>>>>
>>>>>> Reviewers: grokos, gtbercea, kkwli0
>>>>>>
>>>>>> Subscribers: guansong, openmp-commits, caomhin
>>>>>>
>>>>>> Differential Revision: https://reviews.llvm.org/D56274
>>>>>>
>>>>>> Modified:
>>>>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>>>>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>>>>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>>>>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>>>>>
>>>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>>>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>>>> ==============================================================================
>>>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu (original)
>>>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu Thu Jan  3 09:43:46 2019
>>>>>> @@ -564,7 +564,8 @@ EXTERN void __kmpc_get_team_static_memor
>>>>>>      if (GetThreadIdInBlock() == 0) {
>>>>>>        *frame = omptarget_nvptx_simpleMemoryManager.Acquire(buf, size);
>>>>>>      }
>>>>>> -    __syncthreads();
>>>>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>>>> +    __SYNCTHREADS();
>>>>>>      return;
>>>>>>    }
>>>>>>    ASSERT0(LT_FUSSY, GetThreadIdInBlock() == getMasterThreadId(),
>>>>>> @@ -577,7 +578,8 @@ EXTERN void __kmpc_restore_team_static_m
>>>>>>    if (is_shared)
>>>>>>      return;
>>>>>>    if (isSPMDMode()) {
>>>>>> -    __syncthreads();
>>>>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>>>> +    __SYNCTHREADS();
>>>>>>      if (GetThreadIdInBlock() == 0) {
>>>>>>        omptarget_nvptx_simpleMemoryManager.Release();
>>>>>>      }
>>>>>>
>>>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>>>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>>>> ==============================================================================
>>>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu (original)
>>>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu Thu Jan  3 09:43:46 2019
>>>>>> @@ -105,7 +105,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>>>>>>        omptarget_nvptx_simpleThreadPrivateContext =
>>>>>>            omptarget_nvptx_device_simpleState[slot].Dequeue();
>>>>>>      }
>>>>>> -    __syncthreads();
>>>>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>>>> +    __SYNCTHREADS();
>>>>>>      omptarget_nvptx_simpleThreadPrivateContext->Init();
>>>>>>      return;
>>>>>>    }
>>>>>> @@ -129,7 +130,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>>>>>>      // init team context
>>>>>>      currTeamDescr.InitTeamDescr();
>>>>>>    }
>>>>>> -  __syncthreads();
>>>>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>>>> +  __SYNCTHREADS();
>>>>>>  
>>>>>>    omptarget_nvptx_TeamDescr &currTeamDescr = getMyTeamDescriptor();
>>>>>>    omptarget_nvptx_WorkDescr &workDescr = getMyWorkDescriptor();
>>>>>> @@ -170,7 +172,8 @@ EXTERN __attribute__((deprecated)) void
>>>>>>  EXTERN void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime) {
>>>>>>    // We're not going to pop the task descr stack of each thread since
>>>>>>    // there are no more parallel regions in SPMD mode.
>>>>>> -  __syncthreads();
>>>>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>>>> +  __SYNCTHREADS();
>>>>>>    int threadId = GetThreadIdInBlock();
>>>>>>    if (!RequiresOMPRuntime) {
>>>>>>      if (threadId == 0) {
>>>>>>
>>>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>>>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h?rev=350333&r1=350332&r2=350333&view=diff
>>>>>> ==============================================================================
>>>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h (original)
>>>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h Thu Jan  3 09:43:46 2019
>>>>>> @@ -63,6 +63,9 @@
>>>>>>  #define __ACTIVEMASK() __ballot(1)
>>>>>>  #endif
>>>>>>  
>>>>>> +#define __SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory");
>>>>>> +#define __SYNCTHREADS() __SYNCTHREADS_N(0)
>>>>>> +
>>>>>>  // arguments needed for L0 parallelism only.
>>>>>>  class omptarget_nvptx_SharedArgs {
>>>>>>  public:
>>>>>>
>>>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>>>> ==============================================================================
>>>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu (original)
>>>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu Thu Jan  3 09:43:46 2019
>>>>>> @@ -74,7 +74,8 @@ EXTERN void __kmpc_barrier(kmp_Ident *lo
>>>>>>  // parallel region and that all worker threads participate.
>>>>>>  EXTERN void __kmpc_barrier_simple_spmd(kmp_Ident *loc_ref, int32_t tid) {
>>>>>>    PRINT0(LD_SYNC, "call kmpc_barrier_simple_spmd\n");
>>>>>> -  __syncthreads();
>>>>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>>>> +  __SYNCTHREADS();
>>>>>>    PRINT0(LD_SYNC, "completed kmpc_barrier_simple_spmd\n");
>>>>>>  }
>>>>>>  
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Openmp-commits mailing list
>>>>>> Openmp-commits at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits
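
For readers who want to try the workaround outside the runtime, the macro
added to omptarget-nvptx.h can be exercised in a small standalone program.
The following is only a sketch under assumptions: the kernel name, the
macro names without the leading double underscore, the block size, and the
value 42 are all hypothetical and are not part of the patch. Only the
"bar.sync" inline assembly with a "memory" clobber is taken from the diff.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same inline assembly as the patch. The names are hypothetical (no leading
// double underscore, which is reserved), and the trailing semicolon is left
// to the caller so the macro expands to a single statement.
#define SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory")
#define SYNCTHREADS() SYNCTHREADS_N(0)

// Hypothetical kernel: thread 0 writes a shared value, every thread reads it
// after the barrier.
__global__ void broadcastFromThread0(int *out) {
  __shared__ int value;
  if (threadIdx.x == 0)
    value = 42;
  SYNCTHREADS(); // all threads of the block wait here on barrier 0
  out[threadIdx.x] = value;
}

int main() {
  const int threads = 64; // assumed block size, a multiple of the warp size
  int *dev = nullptr;
  if (cudaMalloc(&dev, threads * sizeof(int)) != cudaSuccess)
    return 1;

  broadcastFromThread0<<<1, threads>>>(dev);

  int host[threads];
  // cudaMemcpy waits for the kernel to finish before copying.
  if (cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost) !=
      cudaSuccess)
    return 1;
  cudaFree(dev);

  // Every thread should have observed the value written by thread 0.
  for (int i = 0; i < threads; ++i) {
    if (host[i] != 42) {
      printf("mismatch at thread %d\n", i);
      return 1;
    }
  }
  printf("ok\n");
  return 0;
}
```

As in the patch, every thread of the block has to reach the barrier. The
sketch differs from the committed macro in one respect: it omits the
trailing semicolon from the macro body, so that SYNCTHREADS(); can be used
as the body of an unbraced if/else without leaving an extra empty statement.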
