[all-commits] [llvm/llvm-project] e1da62: [MLIR][GPU] Define gpu.printf op and its lowerings
Krzysztof Drewniak via All-commits
all-commits at lists.llvm.org
Thu Dec 9 07:54:43 PST 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: e1da62910e140cf45eafec64193c813e79796f05
https://github.com/llvm/llvm-project/commit/e1da62910e140cf45eafec64193c813e79796f05
Author: Krzysztof Drewniak <Krzysztof.Drewniak at amd.com>
Date: 2021-12-09 (Thu, 09 Dec 2021)
Changed paths:
M mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h
A mlir/include/mlir/Conversion/GPUToROCDL/Runtimes.h
M mlir/include/mlir/Conversion/Passes.td
M mlir/include/mlir/Dialect/GPU/GPUOps.td
M mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
M mlir/lib/Conversion/GPUCommon/GPUOpsLowering.h
M mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
M mlir/lib/Conversion/PassDetail.h
M mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp
M mlir/lib/Target/LLVMIR/Dialect/ROCDL/ROCDLToLLVMIRTranslation.cpp
A mlir/test/Conversion/GPUToROCDL/gpu-to-rocdl-hip.mlir
A mlir/test/Conversion/GPUToROCDL/gpu-to-rocdl-opencl.mlir
M mlir/test/Dialect/GPU/ops.mlir
A mlir/test/Integration/GPU/ROCM/printf.mlir
Log Message:
-----------
[MLIR][GPU] Define gpu.printf op and its lowerings
- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments.
- Define the lowering of gpu.printf to a call to printf() (which is what AMD GPUs require when using OpenCL) as well as to the hostcall interface in the AMD Open Compute device library, which is the interface available when kernels run under HIP.
- Add a "runtime" enum that specifies which of the possible runtimes a ROCDL kernel will be executed under, or that the runtime is unknown. This enum controls how gpu.printf is lowered.
This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.
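As a rough illustration of how a runtime enum could select between the two lowerings described above, here is a minimal standalone C++ sketch. It is not the code from this commit: the names Runtime, PrintfLowering, and selectPrintfLowering are hypothetical, and the choice to select no lowering when the runtime is unknown is an assumption made for the example.

```cpp
#include <optional>

// Hypothetical stand-in for the "runtime" enum described in the log message.
// The real definition is in the commit's Runtimes.h and may differ.
enum class Runtime { Unknown, OpenCL, HIP };

// The two lowering strategies named in the log message.
enum class PrintfLowering {
  DirectPrintfCall, // emit a call to printf() (OpenCL on AMD GPUs)
  Hostcall          // go through the device library's hostcall interface (HIP)
};

// Pick a lowering strategy for gpu.printf based on the runtime.
// Assumption for this sketch: an unknown runtime selects no lowering.
constexpr std::optional<PrintfLowering> selectPrintfLowering(Runtime runtime) {
  switch (runtime) {
  case Runtime::OpenCL:
    return PrintfLowering::DirectPrintfCall;
  case Runtime::HIP:
    return PrintfLowering::Hostcall;
  case Runtime::Unknown:
    return std::nullopt;
  }
  return std::nullopt; // unreachable for valid enum values
}
```

A caller would check the returned optional before registering a conversion pattern, so the unknown-runtime case is handled explicitly rather than by falling through to one of the two lowerings.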
And:
[MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels
This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.
In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110448