[Mlir-commits] [mlir] [AMDGPU] Implement gpu.subgroup_reduce with DPP intrinsics on AMD GPUs (PR #133204)
Andrea Faulds
llvmlistbot at llvm.org
Wed Apr 16 00:49:50 PDT 2025
================
@@ -372,6 +500,14 @@ void mlir::populateGpuBreakDownSubgroupReducePatterns(
   patterns.add<ScalarizeSingleElementReduce>(patterns.getContext(), benefit);
 }
+void mlir::populateGpuLowerSubgroupReduceToDPPPatterns(
+    RewritePatternSet &patterns, unsigned subgroupSize, amdgpu::Chipset chipset,
+    PatternBenefit benefit) {
+  patterns.add<ScalarSubgroupReduceToDPP>(patterns.getContext(), subgroupSize,
+                                          /*matchClustered=*/true, chipset,
+                                          benefit);
+}
+
----------------
andfau-amd wrote:
Thanks for tagging me! I described the motivation in the commit message of https://github.com/llvm/llvm-project/commit/a800ffac4115259a76d803512eda31e4de787570. Basically, for certain backends, you might want, or need, to apply different lowering strategies to the clustered and non-clustered forms. Off the top of my head, I'm pretty sure I had Vulkan SPIR-V in mind here, because there's a native SPIR-V op for doing a non-clustered reduction, whereas the clustered form would need to use the lowering to shuffles.
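To make the clustered/non-clustered distinction concrete, here is a minimal sketch in plain C++ (it is not MLIR code and does not use any MLIR API). It models a subgroup as a vector of per-lane values and assumes contiguous clusters with an add reduction. The names `subgroupReduceAdd` and `patternApplies` are hypothetical, and `patternApplies` only illustrates how a `matchClustered`-style flag could split ops between two lowerings; it is an assumption, not the actual pattern implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical reference model of an add reduction across a subgroup.
// Without a cluster size, every lane receives the sum over all lanes.
// With a cluster size, lanes are grouped into contiguous clusters and each
// lane receives the sum over its own cluster only.
std::vector<int> subgroupReduceAdd(const std::vector<int> &lanes,
                                   std::optional<std::size_t> clusterSize) {
  std::vector<int> result(lanes.size());
  if (lanes.empty())
    return result;

  const std::size_t width = clusterSize.value_or(lanes.size());
  assert(width > 0 && "cluster size must be non-zero");

  for (std::size_t start = 0; start < lanes.size(); start += width) {
    const std::size_t end = std::min(start + width, lanes.size());
    const int sum =
        std::accumulate(lanes.begin() + start, lanes.begin() + end, 0);
    std::fill(result.begin() + start, result.begin() + end, sum);
  }
  return result;
}

// Hypothetical gate: a pattern configured with matchClustered=true accepts
// only ops that carry a cluster size, and one configured with
// matchClustered=false accepts only ops without one. This lets a backend
// register a different lowering for each form.
bool patternApplies(bool matchClustered,
                    std::optional<std::size_t> clusterSize) {
  return clusterSize.has_value() == matchClustered;
}
```

For example, with lane values 1 through 8, the non-clustered form gives every lane 36, while a cluster size of 4 gives the first four lanes 10 and the last four lanes 26.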
https://github.com/llvm/llvm-project/pull/133204