[Mlir-commits] [mlir] 4edc9e2 - [MLIR][GPU] Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support
Uday Bondhugula
llvmlistbot at llvm.org
Fri Aug 27 23:14:24 PDT 2021
Author: Uday Bondhugula
Date: 2021-08-28T11:37:55+05:30
New Revision: 4edc9e2acf1d9350ce4da77c93e8869f774a24e2
URL: https://github.com/llvm/llvm-project/commit/4edc9e2acf1d9350ce4da77c93e8869f774a24e2
DIFF: https://github.com/llvm/llvm-project/commit/4edc9e2acf1d9350ce4da77c93e8869f774a24e2.diff
LOG: [MLIR][GPU] Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support
Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support. This
method is the only one in the CUDA runtime wrappers library that creates
a dependence on libLLVMSupport, due to its use of SmallVector and
ArrayRef. The code can be written just as easily and compactly without
those ADTs.
The dependence on LLVMSupport adds significant complexity for external
projects that want to link this library in (either statically or as a
shared object), since libLLVMSupport includes numerous other objects
that are sensitive to C++ compiler version and ABI.
Differential Revision: https://reviews.llvm.org/D108684
Added:
Modified:
mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
Removed:
################################################################################
diff --git a/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp b/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
index 9122e34ba4537..cbaa3dfe1f793 100644
--- a/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
+++ b/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
@@ -12,11 +12,7 @@
//
//===----------------------------------------------------------------------===//
-#include <cassert>
-#include <numeric>
-
#include "mlir/ExecutionEngine/CRunnerUtils.h"
-#include "llvm/ADT/ArrayRef.h"
#include "cuda.h"
@@ -160,26 +156,25 @@ mgpuMemHostRegister(void *ptr, uint64_t sizeBytes) {
CUDA_REPORT_IF_ERROR(cuMemHostRegister(ptr, sizeBytes, /*flags=*/0));
}
-// Allows to register a MemRef with the CUDA runtime. Helpful until we have
-// transfer functions implemented.
+/// Registers a memref with the CUDA runtime. `descriptor` is a pointer to a
+/// ranked memref descriptor struct of rank `rank`. Helpful until we have
+/// transfer functions implemented.
extern "C" MLIR_CUDA_WRAPPERS_EXPORT void
mgpuMemHostRegisterMemRef(int64_t rank, StridedMemRefType<char, 1> *descriptor,
int64_t elementSizeBytes) {
-
- llvm::SmallVector<int64_t, 4> denseStrides(rank);
- llvm::ArrayRef<int64_t> sizes(descriptor->sizes, rank);
- llvm::ArrayRef<int64_t> strides(sizes.end(), rank);
-
- std::partial_sum(sizes.rbegin(), sizes.rend(), denseStrides.rbegin(),
- std::multiplies<int64_t>());
- auto sizeBytes = denseStrides.front() * elementSizeBytes;
-
// Only densely packed tensors are currently supported.
- std::rotate(denseStrides.begin(), denseStrides.begin() + 1,
- denseStrides.end());
- denseStrides.back() = 1;
- assert(strides == llvm::makeArrayRef(denseStrides));
+ int64_t *denseStrides = (int64_t *)alloca(rank * sizeof(int64_t));
+ int64_t *sizes = descriptor->sizes;
+ for (int64_t i = rank - 1, runningStride = 1; i >= 0; i--) {
+ denseStrides[i] = runningStride;
+ runningStride *= sizes[i];
+ }
+ uint64_t sizeBytes = sizes[0] * denseStrides[0] * elementSizeBytes;
+ int64_t *strides = &sizes[rank];
+ for (unsigned i = 0; i < rank; ++i)
+ assert(strides[i] == denseStrides[i] &&
+ "Mismatch in computed dense strides");
- auto ptr = descriptor->data + descriptor->offset * elementSizeBytes;
+ auto *ptr = descriptor->data + descriptor->offset * elementSizeBytes;
mgpuMemHostRegister(ptr, sizeBytes);
}
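For readers who want to try the stride logic outside the CUDA wrappers, the loop in the patch can be sketched as a standalone C++ snippet. This is only an illustration under stated assumptions: the helper names `computeDenseStrides`, `denseSizeBytes`, and `isDenselyPacked` are hypothetical and do not appear in the commit, and `std::vector` stands in for the `alloca` buffer used in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the commit): computes the row-major
// strides a densely packed buffer with the given sizes would have, using the
// same right-to-left running product as the loop in the patch.
std::vector<int64_t> computeDenseStrides(const std::vector<int64_t> &sizes) {
  std::vector<int64_t> denseStrides(sizes.size());
  int64_t runningStride = 1;
  for (int64_t i = static_cast<int64_t>(sizes.size()) - 1; i >= 0; --i) {
    denseStrides[i] = runningStride;
    runningStride *= sizes[i];
  }
  return denseStrides;
}

// Hypothetical helper: number of bytes covered by a densely packed buffer.
// It multiplies all sizes, which equals sizes[0] * denseStrides[0] whenever
// the rank is at least 1.
uint64_t denseSizeBytes(const std::vector<int64_t> &sizes,
                        int64_t elementSizeBytes) {
  int64_t numElements = 1;
  for (int64_t size : sizes)
    numElements *= size;
  return static_cast<uint64_t>(numElements * elementSizeBytes);
}

// Hypothetical helper: returns true when the given strides match the dense
// strides, i.e. the condition the patch checks with assert.
bool isDenselyPacked(const std::vector<int64_t> &sizes,
                     const std::vector<int64_t> &strides) {
  if (sizes.size() != strides.size())
    return false;
  std::vector<int64_t> denseStrides = computeDenseStrides(sizes);
  for (std::size_t i = 0; i < sizes.size(); ++i)
    if (strides[i] != denseStrides[i])
      return false;
  return true;
}
```

For a 2x3x4 shape the running product yields strides 12, 4, 1, so with 4-byte elements the registered range would span 2 * 12 * 4 = 96 bytes. The sketch returns a boolean instead of asserting so that the check stays active in builds that define NDEBUG; that is a choice made for this illustration, not something the patch does.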