[Openmp-commits] [openmp] r350333 - [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
Alexey Bataev via Openmp-commits
openmp-commits at lists.llvm.org
Thu Jan 3 10:32:51 PST 2019
Hi Hal, this problem was reported to NVIDIA by Doru Bercea and, as far
as I know, was confirmed. I don't know whether he also reported it in
Bugzilla, but the problem exists.
-------------
Best regards,
Alexey Bataev
03.01.2019 13:30, Finkel, Hal J. wrote:
> Hi, Alexey,
>
> We shouldn't have a semantic problem with __syncthreads because of how
> inference works on the convergent attribute. Do we have an open bug
> describing the problem, and if not, can you please file one? It sounds
> like this tail cloning in critical-edge splitting needs to be updated to
> respect the convergent attribute?
>
> Thanks again,
>
> Hal
>
> On 1/3/19 11:43 AM, Alexey Bataev via Openmp-commits wrote:
>> Author: abataev
>> Date: Thu Jan 3 09:43:46 2019
>> New Revision: 350333
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=350333&view=rev
>> Log:
>> [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
>>
>> Summary:
>> One of the LLVM optimizations, split critical edges, also clones tail
>> instructions. This is a dangerous operation for __syncthreads()
>> functions and this transformation leads to undefined behavior or
>> incorrect results. Patch fixes this problem by replacing __syncthreads()
>> function with the assembler instruction, whose cost is too high and
>> which cannot be copied.
>>
>> Reviewers: grokos, gtbercea, kkwli0
>>
>> Subscribers: guansong, openmp-commits, caomhin
>>
>> Differential Revision: https://reviews.llvm.org/D56274
>>
>> Modified:
>> openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>> openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>> openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>> openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>
>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu?rev=350333&r1=350332&r2=350333&view=diff
>> ==============================================================================
>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu (original)
>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu Thu Jan 3 09:43:46 2019
>> @@ -564,7 +564,8 @@ EXTERN void __kmpc_get_team_static_memor
>> if (GetThreadIdInBlock() == 0) {
>> *frame = omptarget_nvptx_simpleMemoryManager.Acquire(buf, size);
>> }
>> - __syncthreads();
>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>> + __SYNCTHREADS();
>> return;
>> }
>> ASSERT0(LT_FUSSY, GetThreadIdInBlock() == getMasterThreadId(),
>> @@ -577,7 +578,8 @@ EXTERN void __kmpc_restore_team_static_m
>> if (is_shared)
>> return;
>> if (isSPMDMode()) {
>> - __syncthreads();
>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>> + __SYNCTHREADS();
>> if (GetThreadIdInBlock() == 0) {
>> omptarget_nvptx_simpleMemoryManager.Release();
>> }
>>
>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu?rev=350333&r1=350332&r2=350333&view=diff
>> ==============================================================================
>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu (original)
>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu Thu Jan 3 09:43:46 2019
>> @@ -105,7 +105,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>> omptarget_nvptx_simpleThreadPrivateContext =
>> omptarget_nvptx_device_simpleState[slot].Dequeue();
>> }
>> - __syncthreads();
>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>> + __SYNCTHREADS();
>> omptarget_nvptx_simpleThreadPrivateContext->Init();
>> return;
>> }
>> @@ -129,7 +130,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>> // init team context
>> currTeamDescr.InitTeamDescr();
>> }
>> - __syncthreads();
>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>> + __SYNCTHREADS();
>>
>> omptarget_nvptx_TeamDescr &currTeamDescr = getMyTeamDescriptor();
>> omptarget_nvptx_WorkDescr &workDescr = getMyWorkDescriptor();
>> @@ -170,7 +172,8 @@ EXTERN __attribute__((deprecated)) void
>> EXTERN void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime) {
>> // We're not going to pop the task descr stack of each thread since
>> // there are no more parallel regions in SPMD mode.
>> - __syncthreads();
>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>> + __SYNCTHREADS();
>> int threadId = GetThreadIdInBlock();
>> if (!RequiresOMPRuntime) {
>> if (threadId == 0) {
>>
>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h?rev=350333&r1=350332&r2=350333&view=diff
>> ==============================================================================
>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h (original)
>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h Thu Jan 3 09:43:46 2019
>> @@ -63,6 +63,9 @@
>> #define __ACTIVEMASK() __ballot(1)
>> #endif
>>
>> +#define __SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory");
>> +#define __SYNCTHREADS() __SYNCTHREADS_N(0)
>> +
>> // arguments needed for L0 parallelism only.
>> class omptarget_nvptx_SharedArgs {
>> public:
>>
>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu?rev=350333&r1=350332&r2=350333&view=diff
>> ==============================================================================
>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu (original)
>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu Thu Jan 3 09:43:46 2019
>> @@ -74,7 +74,8 @@ EXTERN void __kmpc_barrier(kmp_Ident *lo
>> // parallel region and that all worker threads participate.
>> EXTERN void __kmpc_barrier_simple_spmd(kmp_Ident *loc_ref, int32_t tid) {
>> PRINT0(LD_SYNC, "call kmpc_barrier_simple_spmd\n");
>> - __syncthreads();
>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>> + __SYNCTHREADS();
>> PRINT0(LD_SYNC, "completed kmpc_barrier_simple_spmd\n");
>> }
>>
>>
>>
>> _______________________________________________
>> Openmp-commits mailing list
>> Openmp-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits
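For readers who want to try the workaround outside the device runtime, here is a minimal standalone sketch of the pattern the patch applies: thread 0 writes a shared value, the block synchronizes through an inline-asm `bar.sync`, and every thread then reads the value. The macro names `SYNCTHREADS_N` and `SYNCTHREADS`, the kernel `broadcast`, and the host driver are hypothetical and not part of the commit. The macro body mirrors the one added to omptarget-nvptx.h, except that the trailing semicolon is left to the call site. This is an illustration under those assumptions, not the runtime's actual code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical names; same asm statement as the macro added in this commit,
// but without the trailing semicolon so call sites supply their own.
#define SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory")
#define SYNCTHREADS() SYNCTHREADS_N(0)

// Hypothetical kernel: thread 0 publishes a value through shared memory and
// every thread of the block reads it after the barrier.
__global__ void broadcast(int *out, int value) {
  __shared__ int slot;
  if (threadIdx.x == 0)
    slot = value;
  SYNCTHREADS(); // every thread of the block must reach this barrier
  out[threadIdx.x] = slot;
}

int main() {
  const int kThreads = 64;
  int *dev = nullptr;
  if (cudaMalloc(&dev, kThreads * sizeof(int)) != cudaSuccess)
    return 1;

  // Launch a single block so that one barrier covers all threads.
  broadcast<<<1, kThreads>>>(dev, 42);

  int host[kThreads];
  // cudaMemcpy waits for the kernel to finish before copying.
  if (cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost) !=
      cudaSuccess)
    return 1;
  cudaFree(dev);

  // Count threads that did not observe the value written by thread 0.
  int mismatches = 0;
  for (int i = 0; i < kThreads; ++i)
    mismatches += (host[i] != 42);
  printf("mismatches: %d\n", mismatches);
  return mismatches != 0;
}
```

The `volatile` qualifier and the `"memory"` clobber stop the compiler from deleting the statement or moving memory accesses across it. Whether that is enough to stop every LLVM pass from duplicating the barrier is the question raised in this thread.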