[Mlir-commits] [mlir] 4edc9e2 - [MLIR][GPU] Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support

Uday Bondhugula llvmlistbot at llvm.org
Fri Aug 27 23:14:24 PDT 2021


Author: Uday Bondhugula
Date: 2021-08-28T11:37:55+05:30
New Revision: 4edc9e2acf1d9350ce4da77c93e8869f774a24e2

URL: https://github.com/llvm/llvm-project/commit/4edc9e2acf1d9350ce4da77c93e8869f774a24e2
DIFF: https://github.com/llvm/llvm-project/commit/4edc9e2acf1d9350ce4da77c93e8869f774a24e2.diff

LOG: [MLIR][GPU] Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support

Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support. This
method is the only one in the CUDA runtime wrappers library that creates
a dependence on libLLVMSupport, due to its use of SmallVector and
ArrayRef. The code can be written just as easily and compactly without
those ADTs. The dependence on LLVMSupport adds significant complexity
for external projects that want to link this library in (either
statically or as a shared object), since libLLVMSupport includes numerous
other objects that are sensitive to C++ compiler version and ABI.
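
For illustration only, the dense-stride computation described above can be
sketched using just the C++ standard library. The helper names
computeDenseStrides and denseSizeBytes are hypothetical and do not appear in
CudaRuntimeWrappers.cpp; the sketch assumes row-major, densely packed data
and treats a rank-0 shape as a single element, which is an assumption of
this example rather than something the commit states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the MLIR runtime wrappers): returns the
// row-major dense strides, in elements, for the given dimension sizes.
std::vector<int64_t> computeDenseStrides(const std::vector<int64_t> &sizes) {
  std::vector<int64_t> strides(sizes.size());
  int64_t runningStride = 1;
  // Walk from the innermost dimension outwards, as the patched loop does.
  for (std::size_t i = sizes.size(); i-- > 0;) {
    strides[i] = runningStride;
    runningStride *= sizes[i];
  }
  return strides;
}

// Hypothetical helper: number of bytes spanned by a densely packed buffer
// with the given sizes. An empty shape is treated as one element.
uint64_t denseSizeBytes(const std::vector<int64_t> &sizes,
                        int64_t elementSizeBytes) {
  uint64_t numElements = 1;
  for (int64_t size : sizes)
    numElements *= static_cast<uint64_t>(size);
  return numElements * static_cast<uint64_t>(elementSizeBytes);
}
```

With sizes {2, 3, 4} and 4-byte elements, this sketch yields strides
{12, 4, 1} and a total of 96 bytes, matching sizes[0] * denseStrides[0] *
elementSizeBytes in the patch.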

Differential Revision: https://reviews.llvm.org/D108684

Added: 
    

Modified: 
    mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp

Removed: 
    


################################################################################
diff  --git a/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp b/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
index 9122e34ba4537..cbaa3dfe1f793 100644
--- a/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
+++ b/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
@@ -12,11 +12,7 @@
 //
 //===----------------------------------------------------------------------===//
 
-#include <cassert>
-#include <numeric>
-
 #include "mlir/ExecutionEngine/CRunnerUtils.h"
-#include "llvm/ADT/ArrayRef.h"
 
 #include "cuda.h"
 
@@ -160,26 +156,25 @@ mgpuMemHostRegister(void *ptr, uint64_t sizeBytes) {
   CUDA_REPORT_IF_ERROR(cuMemHostRegister(ptr, sizeBytes, /*flags=*/0));
 }
 
-// Allows to register a MemRef with the CUDA runtime. Helpful until we have
-// transfer functions implemented.
+/// Registers a memref with the CUDA runtime. `descriptor` is a pointer to a
+/// ranked memref descriptor struct of rank `rank`. Helpful until we have
+/// transfer functions implemented.
 extern "C" MLIR_CUDA_WRAPPERS_EXPORT void
 mgpuMemHostRegisterMemRef(int64_t rank, StridedMemRefType<char, 1> *descriptor,
                           int64_t elementSizeBytes) {
-
-  llvm::SmallVector<int64_t, 4> denseStrides(rank);
-  llvm::ArrayRef<int64_t> sizes(descriptor->sizes, rank);
-  llvm::ArrayRef<int64_t> strides(sizes.end(), rank);
-
-  std::partial_sum(sizes.rbegin(), sizes.rend(), denseStrides.rbegin(),
-                   std::multiplies<int64_t>());
-  auto sizeBytes = denseStrides.front() * elementSizeBytes;
-
   // Only densely packed tensors are currently supported.
-  std::rotate(denseStrides.begin(), denseStrides.begin() + 1,
-              denseStrides.end());
-  denseStrides.back() = 1;
-  assert(strides == llvm::makeArrayRef(denseStrides));
+  int64_t *denseStrides = (int64_t *)alloca(rank * sizeof(int64_t));
+  int64_t *sizes = descriptor->sizes;
+  for (int64_t i = rank - 1, runningStride = 1; i >= 0; i--) {
+    denseStrides[i] = runningStride;
+    runningStride *= sizes[i];
+  }
+  uint64_t sizeBytes = sizes[0] * denseStrides[0] * elementSizeBytes;
+  int64_t *strides = &sizes[rank];
+  for (unsigned i = 0; i < rank; ++i)
+    assert(strides[i] == denseStrides[i] &&
+           "Mismatch in computed dense strides");
 
-  auto ptr = descriptor->data + descriptor->offset * elementSizeBytes;
+  auto *ptr = descriptor->data + descriptor->offset * elementSizeBytes;
   mgpuMemHostRegister(ptr, sizeBytes);
 }


        

