[Openmp-commits] [openmp] r350333 - [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.

Finkel, Hal J. via Openmp-commits openmp-commits at lists.llvm.org
Thu Jan 3 10:38:07 PST 2019


On 1/3/19 12:32 PM, Alexey Bataev wrote:
> Hi Hal, this problem was reported to NVIDIA by Doru Bercea and was
> confirmed, as far as I know. I don't know whether he also reported it
> in Bugzilla, but the problem exists.

So is this a bug in Clang's CUDA support, in nvcc, or in ptxas?

Thanks again,

Hal


>
> -------------
> Best regards,
> Alexey Bataev
>
> On 03.01.2019 13:30, Finkel, Hal J. wrote:
>> Hi, Alexey,
>>
>> We shouldn't have a semantic problem with __syncthreads because of how
>> inference works on the convergent attribute. Do we have an open bug
>> describing the problem, and if not, can you please file one? It sounds
>> like this tail cloning in critical-edge splitting needs to be updated to
>> respect the convergent attribute?
>>
>> Thanks again,
>>
>> Hal
>>
>> On 1/3/19 11:43 AM, Alexey Bataev via Openmp-commits wrote:
>>> Author: abataev
>>> Date: Thu Jan  3 09:43:46 2019
>>> New Revision: 350333
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=350333&view=rev
>>> Log:
>>> [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
>>>
>>> Summary:
>>> One of the LLVM optimizations, critical-edge splitting, also clones
>>> tail instructions. This is dangerous for calls to __syncthreads():
>>> the transformation leads to undefined behavior or incorrect results.
>>> The patch fixes this problem by replacing the __syncthreads() call
>>> with an inline assembler instruction, whose cost is too high and
>>> which cannot be copied.
>>>
>>> Reviewers: grokos, gtbercea, kkwli0
>>>
>>> Subscribers: guansong, openmp-commits, caomhin
>>>
>>> Differential Revision: https://reviews.llvm.org/D56274
>>>
>>> Modified:
>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>>>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>>
>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu?rev=350333&r1=350332&r2=350333&view=diff
>>> ==============================================================================
>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu (original)
>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu Thu Jan  3 09:43:46 2019
>>> @@ -564,7 +564,8 @@ EXTERN void __kmpc_get_team_static_memor
>>>      if (GetThreadIdInBlock() == 0) {
>>>        *frame = omptarget_nvptx_simpleMemoryManager.Acquire(buf, size);
>>>      }
>>> -    __syncthreads();
>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>> +    __SYNCTHREADS();
>>>      return;
>>>    }
>>>    ASSERT0(LT_FUSSY, GetThreadIdInBlock() == getMasterThreadId(),
>>> @@ -577,7 +578,8 @@ EXTERN void __kmpc_restore_team_static_m
>>>    if (is_shared)
>>>      return;
>>>    if (isSPMDMode()) {
>>> -    __syncthreads();
>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>> +    __SYNCTHREADS();
>>>      if (GetThreadIdInBlock() == 0) {
>>>        omptarget_nvptx_simpleMemoryManager.Release();
>>>      }
>>>
>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu?rev=350333&r1=350332&r2=350333&view=diff
>>> ==============================================================================
>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu (original)
>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu Thu Jan  3 09:43:46 2019
>>> @@ -105,7 +105,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>>>        omptarget_nvptx_simpleThreadPrivateContext =
>>>            omptarget_nvptx_device_simpleState[slot].Dequeue();
>>>      }
>>> -    __syncthreads();
>>> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>> +    __SYNCTHREADS();
>>>      omptarget_nvptx_simpleThreadPrivateContext->Init();
>>>      return;
>>>    }
>>> @@ -129,7 +130,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>>>      // init team context
>>>      currTeamDescr.InitTeamDescr();
>>>    }
>>> -  __syncthreads();
>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>> +  __SYNCTHREADS();
>>>  
>>>    omptarget_nvptx_TeamDescr &currTeamDescr = getMyTeamDescriptor();
>>>    omptarget_nvptx_WorkDescr &workDescr = getMyWorkDescriptor();
>>> @@ -170,7 +172,8 @@ EXTERN __attribute__((deprecated)) void
>>>  EXTERN void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime) {
>>>    // We're not going to pop the task descr stack of each thread since
>>>    // there are no more parallel regions in SPMD mode.
>>> -  __syncthreads();
>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>> +  __SYNCTHREADS();
>>>    int threadId = GetThreadIdInBlock();
>>>    if (!RequiresOMPRuntime) {
>>>      if (threadId == 0) {
>>>
>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h?rev=350333&r1=350332&r2=350333&view=diff
>>> ==============================================================================
>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h (original)
>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h Thu Jan  3 09:43:46 2019
>>> @@ -63,6 +63,9 @@
>>>  #define __ACTIVEMASK() __ballot(1)
>>>  #endif
>>>  
>>> +#define __SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory");
>>> +#define __SYNCTHREADS() __SYNCTHREADS_N(0)
>>> +
>>>  // arguments needed for L0 parallelism only.
>>>  class omptarget_nvptx_SharedArgs {
>>>  public:
>>>
>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu?rev=350333&r1=350332&r2=350333&view=diff
>>> ==============================================================================
>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu (original)
>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu Thu Jan  3 09:43:46 2019
>>> @@ -74,7 +74,8 @@ EXTERN void __kmpc_barrier(kmp_Ident *lo
>>>  // parallel region and that all worker threads participate.
>>>  EXTERN void __kmpc_barrier_simple_spmd(kmp_Ident *loc_ref, int32_t tid) {
>>>    PRINT0(LD_SYNC, "call kmpc_barrier_simple_spmd\n");
>>> -  __syncthreads();
>>> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>> +  __SYNCTHREADS();
>>>    PRINT0(LD_SYNC, "completed kmpc_barrier_simple_spmd\n");
>>>  }
>>>  
>>>
>>>
>>> _______________________________________________
>>> Openmp-commits mailing list
>>> Openmp-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


