<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>It is a bug in the LLVM optimizer, somewhere in the split
      critical edges pass or something like this, I don't remember
      exactly.<br>
    </p>
    <pre class="moz-signature" cols="72">-------------
Best regards,
Alexey Bataev</pre>
    <div class="moz-cite-prefix">03.01.2019 13:38, Finkel, Hal J. пишет:<br>
    </div>
    <blockquote type="cite"
      cite="mid:9f970ed1-aabd-081e-fead-238361f20426@anl.gov">
      <pre class="moz-quote-pre" wrap="">
On 1/3/19 12:32 PM, Alexey Bataev wrote:
</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">Hi Hal, this problem was reported to NVidia by Doru Bercea and was
confirmed, as far as I know. I don't know if he reported it in the
bugzilla. But the problem exists.
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
So this is a bug in Clang's CUDA, in nvcc, in ptxas?

Thanks again,

Hal


</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">
-------------
Best regards,
Alexey Bataev

03.01.2019 13:30, Finkel, Hal J. пишет:
</pre>
        <blockquote type="cite">
          <pre class="moz-quote-pre" wrap="">Hi, Alexey,

We shouldn't have a semantic problem with __syncthreads because of how
inference works on the convergent attribute. Do we have an open bug
describing the problem, and if not, can you please file one? It sounds
like this tail cloning in critical-edge splitting needs to be updated to
respect the convergent attribute?

Thanks again,

Hal

On 1/3/19 11:43 AM, Alexey Bataev via Openmp-commits wrote:
</pre>
          <blockquote type="cite">
            <pre class="moz-quote-pre" wrap="">Author: abataev
Date: Thu Jan  3 09:43:46 2019
New Revision: 350333

URL: <a class="moz-txt-link-freetext" href="http://llvm.org/viewvc/llvm-project?rev=350333&view=rev">http://llvm.org/viewvc/llvm-project?rev=350333&view=rev</a>
Log:
[OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.

Summary:
One of the LLVM optimizations, split critical edges, also clones tail
instructions. This is a dangerous operation for __syncthreads()
functions and this transformation leads to undefined behavior or
incorrect results. Patch fixes this problem by replacing __syncthreads()
function with the assembler instruction, which cost is too high and
wich cannot be copied.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, openmp-commits, caomhin

Differential Revision: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D56274">https://reviews.llvm.org/D56274</a>

Modified:
    openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
    openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
    openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
    openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu

Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
URL: <a class="moz-txt-link-freetext" href="http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu?rev=350333&r1=350332&r2=350333&view=diff">http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu?rev=350333&r1=350332&r2=350333&view=diff</a>
==============================================================================
--- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu (original)
+++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu Thu Jan  3 09:43:46 2019
@@ -564,7 +564,8 @@ EXTERN void __kmpc_get_team_static_memor
     if (GetThreadIdInBlock() == 0) {
       *frame = omptarget_nvptx_simpleMemoryManager.Acquire(buf, size);
     }
-    __syncthreads();
+    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
+    __SYNCTHREADS();
     return;
   }
   ASSERT0(LT_FUSSY, GetThreadIdInBlock() == getMasterThreadId(),
@@ -577,7 +578,8 @@ EXTERN void __kmpc_restore_team_static_m
   if (is_shared)
     return;
   if (isSPMDMode()) {
-    __syncthreads();
+    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
+    __SYNCTHREADS();
     if (GetThreadIdInBlock() == 0) {
       omptarget_nvptx_simpleMemoryManager.Release();
     }

Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu
URL: <a class="moz-txt-link-freetext" href="http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu?rev=350333&r1=350332&r2=350333&view=diff">http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu?rev=350333&r1=350332&r2=350333&view=diff</a>
==============================================================================
--- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu (original)
+++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu Thu Jan  3 09:43:46 2019
@@ -105,7 +105,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
       omptarget_nvptx_simpleThreadPrivateContext =
           omptarget_nvptx_device_simpleState[slot].Dequeue();
     }
-    __syncthreads();
+    // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
+    __SYNCTHREADS();
     omptarget_nvptx_simpleThreadPrivateContext->Init();
     return;
   }
@@ -129,7 +130,8 @@ EXTERN void __kmpc_spmd_kernel_init(int
     // init team context
     currTeamDescr.InitTeamDescr();
   }
-  __syncthreads();
+  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
+  __SYNCTHREADS();
 
   omptarget_nvptx_TeamDescr &currTeamDescr = getMyTeamDescriptor();
   omptarget_nvptx_WorkDescr &workDescr = getMyWorkDescriptor();
@@ -170,7 +172,8 @@ EXTERN __attribute__((deprecated)) void
 EXTERN void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime) {
   // We're not going to pop the task descr stack of each thread since
   // there are no more parallel regions in SPMD mode.
-  __syncthreads();
+  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
+  __SYNCTHREADS();
   int threadId = GetThreadIdInBlock();
   if (!RequiresOMPRuntime) {
     if (threadId == 0) {

Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h
URL: <a class="moz-txt-link-freetext" href="http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h?rev=350333&r1=350332&r2=350333&view=diff">http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h?rev=350333&r1=350332&r2=350333&view=diff</a>
==============================================================================
--- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h (original)
+++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h Thu Jan  3 09:43:46 2019
@@ -63,6 +63,9 @@
 #define __ACTIVEMASK() __ballot(1)
 #endif
 
+#define __SYNCTHREADS_N(n) asm volatile("bar.sync %0;" : : "r"(n) : "memory");
+#define __SYNCTHREADS() __SYNCTHREADS_N(0)
+
 // arguments needed for L0 parallelism only.
 class omptarget_nvptx_SharedArgs {
 public:

Modified: openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu
URL: <a class="moz-txt-link-freetext" href="http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu?rev=350333&r1=350332&r2=350333&view=diff">http://llvm.org/viewvc/llvm-project/openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu?rev=350333&r1=350332&r2=350333&view=diff</a>
==============================================================================
--- openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu (original)
+++ openmp/trunk/libomptarget/deviceRTLs/nvptx/src/sync.cu Thu Jan  3 09:43:46 2019
@@ -74,7 +74,8 @@ EXTERN void __kmpc_barrier(kmp_Ident *lo
 // parallel region and that all worker threads participate.
 EXTERN void __kmpc_barrier_simple_spmd(kmp_Ident *loc_ref, int32_t tid) {
   PRINT0(LD_SYNC, "call kmpc_barrier_simple_spmd\n");
-  __syncthreads();
+  // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
+  __SYNCTHREADS();
   PRINT0(LD_SYNC, "completed kmpc_barrier_simple_spmd\n");
 }
 


_______________________________________________
Openmp-commits mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Openmp-commits@lists.llvm.org">Openmp-commits@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits">http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits</a>
</pre>
          </blockquote>
        </blockquote>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
</pre>
    </blockquote>
  </body>
</html>