[all-commits] [llvm/llvm-project] 86888e: [mlir][sparse][gpu] generate proper memcpy in/out ...
Aart Bik via All-commits
all-commits at lists.llvm.org
Fri Apr 21 09:30:59 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 86888e420c41ebb07fa1a8818ea9af218b015fe3
https://github.com/llvm/llvm-project/commit/86888e420c41ebb07fa1a8818ea9af218b015fe3
Author: Aart Bik <ajcbik at google.com>
Date: 2023-04-21 (Fri, 21 Apr 2023)
Changed paths:
M mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
M mlir/test/Dialect/SparseTensor/GPU/gpu_combi.mlir
M mlir/test/Dialect/SparseTensor/GPU/gpu_matmul.mlir
M mlir/test/Dialect/SparseTensor/GPU/gpu_matvec.mlir
A mlir/test/Integration/Dialect/SparseTensor/GPU/CUDA/sparse-matvec-const.mlir
M mlir/test/Integration/Dialect/SparseTensor/GPU/CUDA/sparse-mma-2-4-f16.mlir
Log Message:
-----------
[mlir][sparse][gpu] generate proper memcpy in/out host and device
The host registration is a convenient way to get CUDA kernels
running, but it may be slow and does not work for all buffers
(such as global constants). This revision instead uses proper
alloc/copy/dealloc chains for buffers, using asynchronous chains
to increase overlap. The host registration mechanism is
kept under a flag for the output, just for experimentation
purposes while this project ramps up.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D148682
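
For illustration only, an alloc/copy/dealloc chain of the kind the log
message describes could look roughly like the following in the MLIR gpu
dialect. This is a hand-written sketch, not output of the pass: the
function name, the value names (%h_x, %d_x, %n), and the f64 element type
are assumptions made for the example, and the kernel launch is elided.

```mlir
// Hypothetical sketch: copy a host buffer to the device and back using
// an asynchronous token chain instead of gpu.host_register.
func.func @copy_in_out(%h_x: memref<?xf64>, %n: index) {
  // Start an asynchronous chain.
  %t0 = gpu.wait async
  // Allocate the device buffer as part of the chain.
  %d_x, %t1 = gpu.alloc async [%t0] (%n) : memref<?xf64>
  // Copy host -> device (destination first, then source).
  %t2 = gpu.memcpy async [%t1] %d_x, %h_x : memref<?xf64>, memref<?xf64>
  // ... a kernel launch depending on %t2 would go here ...
  // Copy device -> host once the device work is done.
  %t3 = gpu.memcpy async [%t2] %h_x, %d_x : memref<?xf64>, memref<?xf64>
  // Release the device buffer, then block the host until the chain ends.
  %t4 = gpu.dealloc async [%t3] %d_x : memref<?xf64>
  gpu.wait [%t4]
  return
}
```

Each operation takes the token of its predecessor and yields a new one, so
the ordering within a chain is explicit while independent chains remain
free to overlap. The host registration path would instead expose the host
buffer to the device with gpu.host_register and skip the copies.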