[Openmp-commits] [openmp] r350333 - [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.

Alexey Bataev via Openmp-commits openmp-commits at lists.llvm.org
Thu Jan 3 10:40:04 PST 2019


It is a bug in the LLVM optimizer, somewhere in the critical-edge
splitting pass or something similar; I don't remember exactly.
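
For illustration only, here is a minimal CUDA sketch of the workaround
pattern, assuming a toy kernel that is not part of the runtime. The names
SYNC_N, SYNC, broadcast and run_broadcast are hypothetical; only the inline
"bar.sync" form mirrors the macros added to omptarget-nvptx.h. It shows a
single barrier that every thread in the block is expected to reach; if that
barrier were cloned into both sides of the branch before it, threads could
end up executing different copies, which is the hazard the commit log
describes.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-ins shaped like the macros in this commit: the barrier
// is emitted as inline PTX instead of a call to __syncthreads().
#define SYNC_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory")
#define SYNC() SYNC_N(0)

// Toy kernel (an assumption, not runtime code): thread 0 publishes a value
// and every thread in the block reads it after the barrier.
__global__ void broadcast(int *out, int value) {
  __shared__ int slot;
  if (threadIdx.x == 0)
    slot = value;
  // One barrier, reached by all threads of the block. Duplicating it into
  // the branch above and the fall-through path is what must not happen.
  SYNC();
  out[threadIdx.x] = slot;
}

// Launches one block and checks that every thread saw the published value.
bool run_broadcast(int value) {
  const int n = 64;
  int *out = nullptr;
  if (cudaMallocManaged(&out, n * sizeof(int)) != cudaSuccess)
    return false;
  broadcast<<<1, n>>>(out, value);
  bool ok = cudaDeviceSynchronize() == cudaSuccess;
  for (int i = 0; ok && i < n; ++i)
    ok = (out[i] == value);
  cudaFree(out);
  return ok;
}

int main() {
  bool ok = run_broadcast(42);
  std::printf("%s\n", ok ? "all threads saw the value" : "mismatch or CUDA error");
  return ok ? 0 : 1;
}
```

The "memory" clobber and the volatile qualifier keep the compiler from
moving memory accesses across the asm statement or dropping it; whether the
optimizer also refuses to duplicate it is the behavior the commit relies on,
not something this sketch demonstrates.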

-------------
Best regards,
Alexey Bataev

03.01.2019 13:38, Finkel, Hal J. wrote:
> On 1/3/19 12:32 PM, Alexey Bataev wrote:
>> Hi Hal, this problem was reported to NVidia by Doru Bercea and was
>> confirmed, as far as I know. I don't know if he reported it in the
>> bugzilla. But the problem exists.
> So this is a bug in Clang's CUDA, in nvcc, in ptxas?
>
> Thanks again,
>
> Hal
>
>
>> -------------
>> Best regards,
>> Alexey Bataev
>>
>> 03.01.2019 13:30, Finkel, Hal J. wrote:
>>> Hi, Alexey,
>>>
>>> We shouldn't have a semantic problem with __syncthreads because of how
>>> inference works on the convergent attribute. Do we have an open bug
>>> describing the problem, and if not, can you please file one? It sounds
>>> like this tail cloning in critical-edge splitting needs to be updated to
>>> respect the convergent attribute?
>>>
>>> Thanks again,
>>>
>>> Hal
>>>
>>> On 1/3/19 11:43 AM, Alexey Bataev via Openmp-commits wrote:
>>>> Author: abataev
>>>> Date: Thu Jan  3 09:43:46 2019
>>>> New Revision: 350333
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=350333&view=rev
>>>> Log:
>>>> [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
>>>>
>>>> Summary:
>>>> One of the LLVM optimizations, split critical edges, also clones tail
>>>> instructions. This is a dangerous operation for __syncthreads()
>>>> functions and this transformation leads to undefined behavior or
>>>> incorrect results. The patch fixes this problem by replacing the
>>>> __syncthreads() function with an assembler instruction, whose cost is
>>>> too high and which cannot be copied.
>>>>
>>>> Reviewers: grokos, gtbercea, kkwli0
>>>>
>>>> Subscribers: guansong, openmp-commits, caomhin
>>>>
>>>> Differential Revision: https://reviews.llvm.org/D56274
>>>>
>>>> Modified:
>>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>>>
>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>> ==============================================================================
>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu (original)
>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu Thu Jan  3 09:43:46 2019
>>>> @@ -564,7 +564,8 @@ EXTERN void __kmpc_get_team_static_memor
>>>>      if (GetThreadIdInBlock() == 0) {
>>>>        *frame = omptarget_nvptx_simpleMemoryManager.Acquire(buf, size);
>>>>      }
>>>> -    __syncthreads();
>>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> +    __SYNCTHREADS();
>>>>      return;
>>>>    }
>>>>    ASSERT0(LT_FUSSY, GetThreadIdInBlock() == getMasterThreadId(),
>>>> @@ -577,7 +578,8 @@ EXTERN void __kmpc_restore_team_static_m
>>>>    if (is_shared)
>>>>      return;
>>>>    if (isSPMDMode()) {
>>>> -    __syncthreads();
>>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> +    __SYNCTHREADS();
>>>>      if (GetThreadIdInBlock() == 0) {
>>>>        omptarget_nvptx_simpleMemoryManager.Release();
>>>>      }
>>>>
>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>> ==============================================================================
>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu (original)
>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu Thu Jan  3 09:43:46 2019
>>>> @@ -105,7 +105,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>>>>        omptarget_nvptx_simpleThreadPrivateContext =
>>>>            omptarget_nvptx_device_simpleState[slot].Dequeue();
>>>>      }
>>>> -    __syncthreads();
>>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> +    __SYNCTHREADS();
>>>>      omptarget_nvptx_simpleThreadPrivateContext->Init();
>>>>      return;
>>>>    }
>>>> @@ -129,7 +130,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>>>>      // init team context
>>>>      currTeamDescr.InitTeamDescr();
>>>>    }
>>>> -  __syncthreads();
>>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> +  __SYNCTHREADS();
>>>>  
>>>>    omptarget_nvptx_TeamDescr &currTeamDescr = getMyTeamDescriptor();
>>>>    omptarget_nvptx_WorkDescr &workDescr = getMyWorkDescriptor();
>>>> @@ -170,7 +172,8 @@ EXTERN __attribute__((deprecated)) void
>>>>  EXTERN void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime) {
>>>>    // We're not going to pop the task descr stack of each thread since
>>>>    // there are no more parallel regions in SPMD mode.
>>>> -  __syncthreads();
>>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> +  __SYNCTHREADS();
>>>>    int threadId = GetThreadIdInBlock();
>>>>    if (!RequiresOMPRuntime) {
>>>>      if (threadId == 0) {
>>>>
>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h?rev=350333&r1=350332&r2=350333&view=diff
>>>> ==============================================================================
>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h (original)
>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h Thu Jan  3 09:43:46 2019
>>>> @@ -63,6 +63,9 @@
>>>>  #define __ACTIVEMASK() __ballot(1)
>>>>  #endif
>>>>  
>>>> +#define __SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory");
>>>> +#define __SYNCTHREADS() __SYNCTHREADS_N(0)
>>>> +
>>>>  // arguments needed for L0 parallelism only.
>>>>  class omptarget_nvptx_SharedArgs {
>>>>  public:
>>>>
>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>> ==============================================================================
>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu (original)
>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu Thu Jan  3 09:43:46 2019
>>>> @@ -74,7 +74,8 @@ EXTERN void __kmpc_barrier(kmp_Ident *lo
>>>>  // parallel region and that all worker threads participate.
>>>>  EXTERN void __kmpc_barrier_simple_spmd(kmp_Ident *loc_ref, int32_t tid) {
>>>>    PRINT0(LD_SYNC, "call kmpc_barrier_simple_spmd\n");
>>>> -  __syncthreads();
>>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> +  __SYNCTHREADS();
>>>>    PRINT0(LD_SYNC, "completed kmpc_barrier_simple_spmd\n");
>>>>  }
>>>>  
>>>>
>>>>
>>>> _______________________________________________
>>>> Openmp-commits mailing list
>>>> Openmp-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits