[all-commits] [llvm/llvm-project] 7aa22f: [mlir][gpu] Add 'cluster_size' attribute to gpu.subgroup_reduce
Andrea Faulds via All-commits
all-commits at lists.llvm.org
Tue Aug 20 10:37:24 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 7aa22f013e24d20291aad745368ff907baa9dfa4
https://github.com/llvm/llvm-project/commit/7aa22f013e24d20291aad745368ff907baa9dfa4
Author: Andrea Faulds <andrea.faulds at amd.com>
Date: 2024-08-20 (Tue, 20 Aug 2024)
Changed paths:
M mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
M mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
M mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRV.cpp
M mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
M mlir/lib/Dialect/GPU/Transforms/SubgroupReduceLowering.cpp
M mlir/test/Dialect/GPU/canonicalize.mlir
M mlir/test/Dialect/GPU/invalid.mlir
M mlir/test/Dialect/GPU/subgroup-reduce-lowering.mlir
Log Message:
-----------
[mlir][gpu] Add 'cluster_size' attribute to gpu.subgroup_reduce (#104851)
This enables performing several reductions in parallel, each over a
cluster of lanes smaller than the full subgroup.
One potential application is flash attention, where a subgroup-wide
matrix multiplication and a reduction are combined in one kernel. The
multiplication requires a 2D matrix to be distributed over the lanes of
the subgroup, which in turn constrains the shape the following
reduction can take if the data is to stay in registers.
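As a rough illustration (a sketch only; %x, the 32-lane subgroup, and
the exact 'cluster_size(...)' assembly spelling are assumptions rather
than text taken verbatim from the patch), the new attribute lets a
single op express many small reductions side by side:

    // Existing behaviour: reduce %x across all lanes of the subgroup.
    %sum_all = gpu.subgroup_reduce add %x : (f32) -> f32

    // With the new attribute: lanes are partitioned into clusters of
    // 4, and each cluster is reduced independently, so a 32-lane
    // subgroup performs 8 reductions in parallel. Every lane receives
    // the result of its own cluster.
    %sum_c4 = gpu.subgroup_reduce add %x cluster_size(4) : (f32) -> f32

Because each lane ends up with its own cluster's result, it becomes
possible to reduce, for example, individual rows of a matrix that has
been distributed across the subgroup's lanes while keeping the data in
registers.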