[Openmp-commits] [openmp] r350333 - [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.

Finkel, Hal J. via Openmp-commits openmp-commits at lists.llvm.org
Thu Jan 3 10:30:07 PST 2019


Hi, Alexey,

We shouldn't have a semantic problem with __syncthreads because of how
inference works on the convergent attribute. Do we have an open bug
describing the problem, and if not, can you please file one? It sounds
like this tail cloning in critical-edge splitting needs to be updated to
respect the convergent attribute?

Thanks again,

Hal

On 1/3/19 11:43 AM, Alexey Bataev via Openmp-commits wrote:
> Author: abataev
> Date: Thu Jan  3 09:43:46 2019
> New Revision: 350333
>
> URL: http://llvm.org/viewvc/llvm-project?rev=350333&view=rev
> Log:
> [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
>
> Summary:
> One of the LLVM optimizations, split critical edges, also clones tail
> instructions. This is a dangerous operation for __syncthreads()
> functions and this transformation leads to undefined behavior or
> incorrect results. Patch fixes this problem by replacing __syncthreads()
> function with the assembler instruction, which cost is too high and
> wich cannot be copied.
>
> Reviewers: grokos, gtbercea, kkwli0
>
> Subscribers: guansong, openmp-commits, caomhin
>
> Differential Revision: https://reviews.llvm.org/D56274
>
> Modified:
>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>     openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>
> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu?rev=350333&r1=350332&r2=350333&view=diff
> ==============================================================================
> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu (original)
> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu Thu Jan  3 09:43:46 2019
> @@ -564,7 +564,8 @@ EXTERN void __kmpc_get_team_static_memor
>      if (GetThreadIdInBlock() == 0) {
>        *frame = omptarget_nvptx_simpleMemoryManager.Acquire(buf, size);
>      }
> -    __syncthreads();
> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
> +    __SYNCTHREADS();
>      return;
>    }
>    ASSERT0(LT_FUSSY, GetThreadIdInBlock() == getMasterThreadId(),
> @@ -577,7 +578,8 @@ EXTERN void __kmpc_restore_team_static_m
>    if (is_shared)
>      return;
>    if (isSPMDMode()) {
> -    __syncthreads();
> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
> +    __SYNCTHREADS();
>      if (GetThreadIdInBlock() == 0) {
>        omptarget_nvptx_simpleMemoryManager.Release();
>      }
>
> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu?rev=350333&r1=350332&r2=350333&view=diff
> ==============================================================================
> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu (original)
> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu Thu Jan  3 09:43:46 2019
> @@ -105,7 +105,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>        omptarget_nvptx_simpleThreadPrivateContext =
>            omptarget_nvptx_device_simpleState[slot].Dequeue();
>      }
> -    __syncthreads();
> +    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
> +    __SYNCTHREADS();
>      omptarget_nvptx_simpleThreadPrivateContext->Init();
>      return;
>    }
> @@ -129,7 +130,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>      // init team context
>      currTeamDescr.InitTeamDescr();
>    }
> -  __syncthreads();
> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
> +  __SYNCTHREADS();
>  
>    omptarget_nvptx_TeamDescr &currTeamDescr = getMyTeamDescriptor();
>    omptarget_nvptx_WorkDescr &workDescr = getMyWorkDescriptor();
> @@ -170,7 +172,8 @@ EXTERN __attribute__((deprecated)) void
>  EXTERN void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime) {
>    // We're not going to pop the task descr stack of each thread since
>    // there are no more parallel regions in SPMD mode.
> -  __syncthreads();
> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
> +  __SYNCTHREADS();
>    int threadId = GetThreadIdInBlock();
>    if (!RequiresOMPRuntime) {
>      if (threadId == 0) {
>
> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h?rev=350333&r1=350332&r2=350333&view=diff
> ==============================================================================
> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h (original)
> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h Thu Jan  3 09:43:46 2019
> @@ -63,6 +63,9 @@
>  #define __ACTIVEMASK() __ballot(1)
>  #endif
>  
> +#define __SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory");
> +#define __SYNCTHREADS() __SYNCTHREADS_N(0)
> +
>  // arguments needed for L0 parallelism only.
>  class omptarget_nvptx_SharedArgs {
>  public:
>
> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu?rev=350333&r1=350332&r2=350333&view=diff
> ==============================================================================
> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu (original)
> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu Thu Jan  3 09:43:46 2019
> @@ -74,7 +74,8 @@ EXTERN void __kmpc_barrier(kmp_Ident *lo
>  // parallel region and that all worker threads participate.
>  EXTERN void __kmpc_barrier_simple_spmd(kmp_Ident *loc_ref, int32_t tid) {
>    PRINT0(LD_SYNC, "call kmpc_barrier_simple_spmd\n");
> -  __syncthreads();
> +  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
> +  __SYNCTHREADS();
>    PRINT0(LD_SYNC, "completed kmpc_barrier_simple_spmd\n");
>  }
>  
>
>
> _______________________________________________
> Openmp-commits mailing list
> Openmp-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



More information about the Openmp-commits mailing list