[Openmp-commits] [openmp] r350333 - [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
Alexey Bataev via Openmp-commits
openmp-commits at lists.llvm.org
Thu Jan 3 10:40:04 PST 2019
It is a bug in the LLVM optimizer, somewhere in the critical-edge
splitting pass or something similar; I don't remember exactly.
-------------
Best regards,
Alexey Bataev
On 03.01.2019 13:38, Finkel, Hal J. wrote:
> On 1/3/19 12:32 PM, Alexey Bataev wrote:
>> Hi Hal, this problem was reported to NVIDIA by Doru Bercea and was
>> confirmed, as far as I know. I don't know whether he also reported it
>> in Bugzilla. But the problem exists.
> So this is a bug in Clang's CUDA, in nvcc, in ptxas?
>
> Thanks again,
>
> Hal
>
>
>> -------------
>> Best regards,
>> Alexey Bataev
>>
>> On 03.01.2019 13:30, Finkel, Hal J. wrote:
>>> Hi, Alexey,
>>>
>>> We shouldn't have a semantic problem with __syncthreads because of how
>>> inference works on the convergent attribute. Do we have an open bug
>>> describing the problem, and if not, can you please file one? It sounds
>>> like this tail cloning in critical-edge splitting needs to be updated to
>>> respect the convergent attribute?
>>>
>>> Thanks again,
>>>
>>> Hal
>>>
>>> On 1/3/19 11:43 AM, Alexey Bataev via Openmp-commits wrote:
>>>> Author: abataev
>>>> Date: Thu Jan 3 09:43:46 2019
>>>> New Revision: 350333
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=350333&view=rev
>>>> Log:
>>>> [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
>>>>
>>>> Summary:
>>>> One of the LLVM optimizations, critical-edge splitting, also clones
>>>> tail instructions. This is dangerous for calls to __syncthreads():
>>>> the transformation leads to undefined behavior or incorrect results.
>>>> The patch fixes this problem by replacing the __syncthreads() call
>>>> with an inline assembler instruction, whose cost is too high and
>>>> which cannot be copied.
>>>>
>>>> Reviewers: grokos, gtbercea, kkwli0
>>>>
>>>> Subscribers: guansong, openmp-commits, caomhin
>>>>
>>>> Differential Revision: https://reviews.llvm.org/D56274
>>>>
>>>> Modified:
>>>> openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>>>> openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>>>> openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>>>> openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>>>
>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>> ==============================================================================
>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu (original)
>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu Thu Jan 3 09:43:46 2019
>>>> @@ -564,7 +564,8 @@ EXTERN void __kmpc_get_team_static_memor
>>>> if (GetThreadIdInBlock() == 0) {
>>>> *frame = omptarget_nvptx_simpleMemoryManager.Acquire(buf, size);
>>>> }
>>>> - __syncthreads();
>>>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> + __SYNCTHREADS();
>>>> return;
>>>> }
>>>> ASSERT0(LT_FUSSY, GetThreadIdInBlock() == getMasterThreadId(),
>>>> @@ -577,7 +578,8 @@ EXTERN void __kmpc_restore_team_static_m
>>>> if (is_shared)
>>>> return;
>>>> if (isSPMDMode()) {
>>>> - __syncthreads();
>>>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> + __SYNCTHREADS();
>>>> if (GetThreadIdInBlock() == 0) {
>>>> omptarget_nvptx_simpleMemoryManager.Release();
>>>> }
>>>>
>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>> ==============================================================================
>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu (original)
>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu Thu Jan 3 09:43:46 2019
>>>> @@ -105,7 +105,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>>>> omptarget_nvptx_simpleThreadPrivateContext =
>>>> omptarget_nvptx_device_simpleState[slot].Dequeue();
>>>> }
>>>> - __syncthreads();
>>>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> + __SYNCTHREADS();
>>>> omptarget_nvptx_simpleThreadPrivateContext->Init();
>>>> return;
>>>> }
>>>> @@ -129,7 +130,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
>>>> // init team context
>>>> currTeamDescr.InitTeamDescr();
>>>> }
>>>> - __syncthreads();
>>>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> + __SYNCTHREADS();
>>>>
>>>> omptarget_nvptx_TeamDescr &currTeamDescr = getMyTeamDescriptor();
>>>> omptarget_nvptx_WorkDescr &workDescr = getMyWorkDescriptor();
>>>> @@ -170,7 +172,8 @@ EXTERN __attribute__((deprecated)) void
>>>> EXTERN void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime) {
>>>> // We're not going to pop the task descr stack of each thread since
>>>> // there are no more parallel regions in SPMD mode.
>>>> - __syncthreads();
>>>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> + __SYNCTHREADS();
>>>> int threadId = GetThreadIdInBlock();
>>>> if (!RequiresOMPRuntime) {
>>>> if (threadId == 0) {
>>>>
>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h?rev=350333&r1=350332&r2=350333&view=diff
>>>> ==============================================================================
>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h (original)
>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h Thu Jan 3 09:43:46 2019
>>>> @@ -63,6 +63,9 @@
>>>> #define __ACTIVEMASK() __ballot(1)
>>>> #endif
>>>>
>>>> +#define __SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory");
>>>> +#define __SYNCTHREADS() __SYNCTHREADS_N(0)
>>>> +
>>>> // arguments needed for L0 parallelism only.
>>>> class omptarget_nvptx_SharedArgs {
>>>> public:
>>>>
>>>> Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
>>>> URL: http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu?rev=350333&r1=350332&r2=350333&view=diff
>>>> ==============================================================================
>>>> --- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu (original)
>>>> +++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu Thu Jan 3 09:43:46 2019
>>>> @@ -74,7 +74,8 @@ EXTERN void __kmpc_barrier(kmp_Ident *lo
>>>> // parallel region and that all worker threads participate.
>>>> EXTERN void __kmpc_barrier_simple_spmd(kmp_Ident *loc_ref, int32_t tid) {
>>>> PRINT0(LD_SYNC, "call kmpc_barrier_simple_spmd\n");
>>>> - __syncthreads();
>>>> + // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
>>>> + __SYNCTHREADS();
>>>> PRINT0(LD_SYNC, "completed kmpc_barrier_simple_spmd\n");
>>>> }
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Openmp-commits mailing list
>>>> Openmp-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits
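
For readers following the thread, the barrier that the patch relies on can
be sketched as a standalone CUDA file. This is only an illustration under
stated assumptions, not code from the commit: the macro names
SKETCH_SYNCTHREADS_N and SKETCH_SYNCTHREADS, the kernel
broadcast_from_thread0, and the host helper run_broadcast_check are all
hypothetical. Only the bar.sync inline assembly is taken from the
omptarget-nvptx.h hunk in the quoted diff.

```cuda
#include <cuda_runtime.h>

// Hypothetical names; the asm statement mirrors the one added by the patch.
// The barrier is emitted as opaque inline PTX rather than as a call to
// __syncthreads().
#define SKETCH_SYNCTHREADS_N(n) \
  asm volatile("bar.sync %0;" : : "r"(n) : "memory")
#define SKETCH_SYNCTHREADS() SKETCH_SYNCTHREADS_N(0)

// Thread 0 publishes a value in shared memory; every thread in the block
// then reads it back. All threads of the block must reach the one barrier
// between the write and the reads.
__global__ void broadcast_from_thread0(int *out, int value) {
  __shared__ int slot;
  if (threadIdx.x == 0) {
    slot = value;
  }
  SKETCH_SYNCTHREADS();
  out[blockIdx.x * blockDim.x + threadIdx.x] = slot;
}

// Host-side check: launches one block and verifies that every thread
// observed the value written by thread 0. Returns false on any CUDA error.
bool run_broadcast_check() {
  const int threads = 128;
  const int value = 42;
  int *d_out = nullptr;
  if (cudaMalloc(&d_out, threads * sizeof(int)) != cudaSuccess) {
    return false;
  }
  broadcast_from_thread0<<<1, threads>>>(d_out, value);
  int h_out[threads];
  cudaError_t err =
      cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  if (err != cudaSuccess) {
    return false;
  }
  for (int i = 0; i < threads; ++i) {
    if (h_out[i] != value) {
      return false;
    }
  }
  return true;
}
```

The sketch is meant to be built with nvcc and driven from a small main()
that calls run_broadcast_check(). It shows the intended use of the barrier
only; it does not reproduce the miscompilation discussed in this thread.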