[Openmp-commits] [openmp] [OpenMP][libomptarget][RFC] extend libomptarget with mechanism to execute fill memory on the target (PR #73801)

Johannes Doerfert via Openmp-commits openmp-commits at lists.llvm.org
Wed Nov 29 08:55:37 PST 2023


================
@@ -320,21 +320,31 @@ EXTERN void *omp_target_memset(void *Ptr, int ByteVal, size_t NumBytes,
     // That will require the ability to execute a kernel from within
     // libomptarget.so (which we do not have at the moment).
 
-    // This is a very slow path: create a filled array on the host and upload
-    // it to the GPU device.
-    int InitialDevice = omp_get_initial_device();
-    void *Shadow = omp_target_alloc(NumBytes, InitialDevice);
-    if (Shadow) {
-      (void)memset(Shadow, ByteVal, NumBytes);
-      (void)omp_target_memcpy(Ptr, Shadow, NumBytes, 0, 0, DeviceNum,
-                              InitialDevice);
-      (void)omp_target_free(Shadow, InitialDevice);
+    if (NumBytes % sizeof(int32_t) == 0) {
+      DeviceTy &Dev = *PM->Devices[DeviceNum];
+      AsyncInfoTy AsyncInfo(Dev);
+      int32_t Val =
+          ByteVal + (ByteVal << 8) + (ByteVal << 16) + (ByteVal << 24);
+      uint64_t NumValues = NumBytes / sizeof(int32_t);
+      int Rc = Dev.fillMemory(Ptr, Val, NumValues, AsyncInfo);
+      printf("--> Rc=%d\n", Rc);
----------------
jdoerfert wrote:

Leftover debug `printf`; please remove.

https://github.com/llvm/llvm-project/pull/73801
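
Separately from the leftover `printf`, the hunk builds the 32-bit fill pattern with signed shifts and additions on `ByteVal`. That can overflow `int` for values of 128 and above, and it does not follow `memset` semantics for negative or out-of-range inputs. A minimal sketch of a more defensive pattern computation follows. The helper name `replicateByte` is hypothetical and not part of libomptarget or of this PR.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only: build a 32-bit fill pattern
// whose four bytes all equal the low byte of ByteVal.
constexpr uint32_t replicateByte(int ByteVal) {
  // memset converts its value argument to unsigned char, so do the same here;
  // negative or >255 inputs then cannot spill into neighbouring bytes.
  // Multiplying by 0x01010101 copies that byte into every byte position
  // using unsigned arithmetic only, with no signed shifts.
  return static_cast<uint32_t>(static_cast<unsigned char>(ByteVal)) *
         0x01010101u;
}

// Compile-time checks of the pattern for a few representative inputs.
static_assert(replicateByte(0xAB) == 0xABABABABu, "plain byte value");
static_assert(replicateByte(-1) == 0xFFFFFFFFu, "negative input wraps to 0xFF");
static_assert(replicateByte(0x1FF) == 0xFFFFFFFFu, "only the low byte is used");
```

Under that assumption, the value passed to the fill call could be computed as `replicateByte(ByteVal)` and converted to whatever type the device interface expects.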

