[Openmp-commits] [openmp] [Libomptarget] Fix Nvidia offloading hanging on dataRetrieve using RPC (PR #66817)

Sat Sep 23 07:55:47 PDT 2023

================
@@ -0,0 +1,22 @@
+// RUN: %libomptarget-compile-run-and-check-generic
+
+// REQUIRES: libc
+
+#include <assert.h>
+#include <stdio.h>
+
+#pragma omp declare target to(stdout)
+
+int main() {
+  int r = 0;
+// CHECK: PASS
+#pragma omp target map(from : r)
+  { r = fwrite("PASS\n", 1, sizeof("PASS\n") - 1, stdout); }
+  assert(r == sizeof("PASS\n") - 1 && "Incorrect number of bytes written");
----------------
jdoerfert wrote:

> we schedule both the kernel and the copy-back on the same stream.
Yes, that is correct.

> i.e. we cannot issue the memcpy until the kernel is done so we wait inside of the CUDA runtime.
That I don't get.

> This is bad because it means that we aren't periodically checking the RPC server while waiting.
The checking is done by the thread that issued the kernel, right? It does not wait for the stream with the kernel to finish before it checks the RPC stream, correct? I don't get why this interferes. I would assume it checks the RPC, then does a non-blocking check on the kernel, and so on?
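
To make the two behaviours being discussed concrete, here is a minimal host-only sketch in C. It is a toy model, not the plugin code: `sim_kernel`, `sim_kernel_step`, `sim_rpc_server_poll`, `wait_with_rpc_polling`, and `wait_blocking` are all hypothetical names, and the "device" is a simple state machine. It assumes the kernel stalls on each RPC request (such as the `fwrite` in the test) until the host services it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device kernel that issues RPC requests to the host. */
typedef struct {
  int rpc_calls_left; /* RPC requests the kernel still has to make */
  bool rpc_pending;   /* true while the kernel is stalled on a request */
} sim_kernel;

/* Advance the simulated kernel: post the next RPC request if there is one.
   A kernel with a pending request cannot make progress. */
static void sim_kernel_step(sim_kernel *k) {
  if (k->rpc_pending)
    return;
  if (k->rpc_calls_left > 0) {
    k->rpc_calls_left--;
    k->rpc_pending = true;
  }
}

/* The kernel is done once every request has been posted and serviced. */
static bool sim_kernel_done(const sim_kernel *k) {
  return !k->rpc_pending && k->rpc_calls_left == 0;
}

/* Host side: service one pending request, if any. */
static bool sim_rpc_server_poll(sim_kernel *k) {
  if (!k->rpc_pending)
    return false;
  k->rpc_pending = false;
  return true;
}

/* Non-blocking wait: query the kernel, then check the RPC server, repeat.
   Returns true if the kernel finished within max_iters iterations. */
bool wait_with_rpc_polling(sim_kernel *k, int max_iters) {
  assert(k != NULL);
  for (int i = 0; i < max_iters; ++i) {
    sim_kernel_step(k);
    if (sim_kernel_done(k))
      return true;
    sim_rpc_server_poll(k);
  }
  return false;
}

/* Blocking wait: the host only waits for the kernel and never polls the
   RPC server, modelling a wait that happens inside the runtime. */
bool wait_blocking(sim_kernel *k, int max_iters) {
  assert(k != NULL);
  for (int i = 0; i < max_iters; ++i) {
    sim_kernel_step(k);
    if (sim_kernel_done(k))
      return true;
  }
  return false;
}
```

With a kernel that makes one RPC call (`sim_kernel k = {1, false};`), `wait_with_rpc_polling(&k, 100)` returns true, while `wait_blocking(&k, 100)` on a fresh copy returns false, because nothing ever clears the pending request. Whether the real copy-back ends up in the second situation is exactly the question above.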



https://github.com/llvm/llvm-project/pull/66817