[all-commits] [llvm/llvm-project] 371366: [mlir][nvgpu] add simple pipelining for shared mem...
ftynse via All-commits
all-commits at lists.llvm.org
Mon Jul 17 07:29:28 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 371366ce27303e0b949aeb643b973a1a110da469
https://github.com/llvm/llvm-project/commit/371366ce27303e0b949aeb643b973a1a110da469
Author: Alex Zinenko <zinenko at google.com>
Date: 2023-07-17 (Mon, 17 Jul 2023)
Changed paths:
M mlir/include/mlir/Dialect/NVGPU/TransformOps/NVGPUTransformOps.h
M mlir/include/mlir/Dialect/NVGPU/TransformOps/NVGPUTransformOps.td
M mlir/include/mlir/Dialect/SCF/Transforms/Patterns.h
M mlir/include/mlir/Dialect/SCF/Transforms/Transforms.h
M mlir/lib/Dialect/NVGPU/TransformOps/CMakeLists.txt
M mlir/lib/Dialect/NVGPU/TransformOps/NVGPUTransformOps.cpp
M mlir/lib/Dialect/SCF/Transforms/LoopPipelining.cpp
A mlir/test/Dialect/NVGPU/transform-pipeline-shared.mlir
M utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
Log Message:
-----------
[mlir][nvgpu] add simple pipelining for shared memory copies
Add a simple transform operation to the NVGPU extension that performs
software pipelining of copies to shared memory. The functionality is
extremely minimalistic in this version and only supports copies from
global to shared memory inside an `scf.for` loop with either
`vector.transfer` or `nvgpu.device_async_copy` operations when
pipelining preconditions are already satisfied in the IR. This is the
minimally useful version that uses the more general loop pipeliner in an
NVGPU-specific way. Further extensions and orthogonalizations will be
necessary.
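As a rough illustration of what software pipelining of shared memory copies means, here is a minimal Python sketch, not the MLIR implementation. The function names (`run_sequential`, `run_pipelined`), the `depth` parameter, and the list-based model of "global" and "shared" memory are all assumptions made for this example. The copy for a later iteration is issued ahead of the compute for the current one, with a predicate guarding copies that would run past the end of the loop.

```python
def run_sequential(global_mem, compute):
    """Reference loop: copy one tile to 'shared' memory, then consume it."""
    results = []
    for i in range(len(global_mem)):
        shared = list(global_mem[i])      # copy global -> shared
        results.append(compute(shared))   # use the shared tile
    return results


def run_pipelined(global_mem, compute, depth=2):
    """Same loop, software-pipelined with copies issued depth-1 iterations ahead.

    `depth` (hypothetical parameter) is the number of 'shared' buffers.
    """
    n = len(global_mem)
    buffers = [None] * depth              # multi-buffered 'shared' memory

    # Prologue: issue the first depth-1 copies before entering the main loop.
    for i in range(min(depth - 1, n)):
        buffers[i % depth] = list(global_mem[i])

    results = []
    for i in range(n):
        ahead = i + depth - 1
        if ahead < n:                     # predicate: only copy tiles that exist
            buffers[ahead % depth] = list(global_mem[ahead])
        results.append(compute(buffers[i % depth]))
    return results
```

Both functions should produce the same results for any `depth >= 1`; the pipelined form only changes when each copy is issued relative to the compute that consumes it.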
This required a change to the loop pipeliner itself to properly
propagate errors should the predicate generator fail.
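The error propagation described here can be sketched in a similar toy form. This is an assumption-laden illustration, not the pipeliner's actual interface: `predicate_ops` and `make_predicated` are hypothetical names, operations are modeled as strings, and a `None` return stands in for a failed predicate generator. The point is only that a failure on one operation causes the whole rewrite to report failure rather than continue with a partial result.

```python
def predicate_ops(ops, make_predicated):
    """Apply a predicate generator to each op, failing as a whole on any error.

    `make_predicated` (hypothetical callback) returns the predicated op,
    or None when it cannot predicate the given op.
    """
    rewritten = []
    for op in ops:
        new_op = make_predicated(op)
        if new_op is None:
            # Propagate the failure instead of silently keeping the op.
            return None
        rewritten.append(new_op)
    return rewritten


def only_copies(op):
    """Toy predicate generator: knows how to guard copies and nothing else."""
    return f"if (pred) {op}" if op.startswith("copy") else None
```

With these definitions, `predicate_ops(["copy a", "copy b"], only_copies)` yields the guarded ops, while a list containing an unsupported op yields `None` so the caller can abandon the transformation.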
This is loosely inspired by the version in IREE, but makes fewer unsafe
assumptions and has a more principled way of communicating decisions.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155223