[polly] r309808 - [GPUJIT] Add GPUJIT APIs for allocating and freeing managed memory.

Wed Aug 2 05:23:22 PDT 2017

Author: bollu
Date: Wed Aug  2 05:23:22 2017
New Revision: 309808

URL: http://llvm.org/viewvc/llvm-project?rev=309808&view=rev
Log:
[GPUJIT] Add GPUJIT APIs for allocating and freeing managed memory.

We introduce `polly_mallocManaged` and `polly_freeManaged` as
proxies for `cudaMallocManaged` / `cudaFree`. This is currently not
used by Polly. It is auxiliary code that is used in `COSMO`.

This is useful because `polly_mallocManaged` matches the signature of `malloc`,
while `cudaMallocManaged` does not. We introduce `polly_freeManaged` for
symmetry.

We use this in COSMO to use the unified memory feature of the newer
CUDA APIs (>= 6).

Differential Revision: https://reviews.llvm.org/D35991

Modified:
    polly/trunk/tools/GPURuntime/GPUJIT.c
    polly/trunk/tools/GPURuntime/GPUJIT.h

Modified: polly/trunk/tools/GPURuntime/GPUJIT.c
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/tools/GPURuntime/GPUJIT.c?rev=309808&r1=309807&r2=309808&view=diff
==============================================================================

--- polly/trunk/tools/GPURuntime/GPUJIT.c (original)
+++ polly/trunk/tools/GPURuntime/GPUJIT.c Wed Aug  2 05:23:22 2017
@@ -26,6 +26,7 @@
 #endif /* __APPLE__ */
 #endif /* HAS_LIBOPENCL */
 
+#include <assert.h>
 #include <dlfcn.h>
 #include <stdarg.h>
 #include <stdio.h>
@@ -1409,6 +1410,61 @@ static void launchKernelCUDA(PollyGPUFun
   }
 }
 
+// Maximum number of managed memory pointers.
+#define MAX_POINTERS 4000
+// For the rationale behing a list of free pointers, see `polly_freeManaged`.
+void *g_managedptrs[MAX_POINTERS];
+int g_nmanagedptrs = 0;
+
+// Add a pointer as being allocated by cuMallocManaged
+void addManagedPtr(void *mem) {
+  assert(g_nmanagedptrs < MAX_POINTERS && "We have hit the maximum number of "
+                                          "managed pointers allowed. Increase "
+                                          "MAX_POINTERS");
+  g_managedptrs[g_nmanagedptrs++] = mem;
+}
+
+int isManagedPtr(void *mem) {
+  for (int i = 0; i < g_nmanagedptrs; i++) {
+    if (g_managedptrs[i] == mem)
+      return 1;
+  }
+  return 0;
+}
+
+void polly_freeManaged(void *mem) {
+  dump_function();
+
+  // In a real-world program this was used (COSMO), there were more `free`
+  // calls in the original source than `malloc` calls. Hence, replacing all
+  // `free`s with `cudaFree` does not work, since we would try to free
+  // 'illegal' memory.
+  // As a quick fix, we keep a free list and check if `mem` is a managed memory
+  // pointer. If it is, we call `cudaFree`.
+  // If not, we pass it along to the underlying allocator.
+  // This is a hack, and can be removed if the underlying issue is fixed.
+  if (isManagedPtr(mem)) {
+    if (cudaFree(mem) != cudaSuccess) {
+      fprintf(stderr, "cudaFree failed.\n");
+      exit(-1);
+    }
+    return;
+  } else {
+    free(mem);
+  }
+}
+
+void *polly_mallocManaged(size_t size) {
+  dump_function();
+  void *a;
+  if (cudaMallocManaged(&a, size, cudaMemAttachGlobal) != cudaSuccess) {
+    fprintf(stderr, "cudaMallocManaged failed for size: %zu\n", size);
+    exit(-1);
+  }
+  addManagedPtr(a);
+  return a;
+}
+
 static void freeDeviceMemoryCUDA(PollyGPUDevicePtr *Allocation) {
   dump_function();
   CUDADevicePtr *DevPtr = (CUDADevicePtr *)Allocation->DevicePtr;

Modified: polly/trunk/tools/GPURuntime/GPUJIT.h
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/tools/GPURuntime/GPUJIT.h?rev=309808&r1=309807&r2=309808&view=diff
==============================================================================
--- polly/trunk/tools/GPURuntime/GPUJIT.h (original)
+++ polly/trunk/tools/GPURuntime/GPUJIT.h Wed Aug  2 05:23:22 2017
@@ -13,6 +13,7 @@
 
 #ifndef GPUJIT_H_
 #define GPUJIT_H_
+#include "stddef.h"
 
 /*
  * The following demostrates how we can use the GPURuntime library to
@@ -110,4 +111,13 @@ void polly_launchKernel(PollyGPUFunction
                         void **Parameters);
 void polly_freeDeviceMemory(PollyGPUDevicePtr *Allocation);
 void polly_freeContext(PollyGPUContext *Context);
+
+// Note that polly_{malloc/free}Managed are currently not used by Polly.
+// We use them in COSMO by replacing all malloc with polly_mallocManaged and all
+// frees with cudaFree, so we can get managed memory "automatically".
+// Needless to say, this is a hack.
+// Please make sure that this code is not present in Polly when 2018 rolls in.
+// If this is still present, ping Siddharth Bhat <siddu.druid at gmail.com>
+void *polly_mallocManaged(size_t size);
+void polly_freeManaged(void *mem);
 #endif /* GPUJIT_H_ */