[Mlir-commits] [mlir] [AMDGPU] Implement gpu.subgroup_reduce with DPP intrinsics on AMD GPUs (PR #133204)

Wed Apr 16 01:02:05 PDT 2025

================
@@ -372,6 +500,14 @@ void mlir::populateGpuBreakDownSubgroupReducePatterns(
   patterns.add<ScalarizeSingleElementReduce>(patterns.getContext(), benefit);
 }
 
+void mlir::populateGpuLowerSubgroupReduceToDPPPatterns(
+    RewritePatternSet &patterns, unsigned subgroupSize, amdgpu::Chipset chipset,
+    PatternBenefit benefit) {
+  patterns.add<ScalarSubgroupReduceToDPP>(patterns.getContext(), subgroupSize,
+                                          /*matchClustered=*/true, chipset,
+                                          benefit);
+}
+
----------------
Muzammiluddin-Syed-ECE wrote:

Ah, I see. Thanks, @andfau-amd!

So, as a sanity check: since our use case in IREE doesn't rely on these native SPIR-V ops, could we collapse the strategies for the clustered and non-clustered forms into one? Unlike with SPIR-V, neither form has an obvious benefit over the other in our pipeline.
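
A minimal host-side sketch of what "collapsing" could mean, assuming the non-clustered form is treated as a clustered reduction whose cluster spans the whole subgroup. This is plain C++ that models lanes as vector elements, not the MLIR pattern or the DPP lowering from this PR. The names `clusteredReduceAdd` and `subgroupReduceAdd` are hypothetical, and both the lane count and the cluster size are assumed to be powers of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: lanes[i] is the value held by lane i of one subgroup.
// Reduces each aligned cluster of `clusterSize` lanes with a butterfly (XOR)
// exchange, so every lane ends up holding its cluster's sum.
// Assumes lanes.size() and clusterSize are powers of two and
// clusterSize <= lanes.size().
std::vector<int> clusteredReduceAdd(const std::vector<int> &lanes,
                                    std::size_t clusterSize) {
  std::vector<int> cur = lanes;
  for (std::size_t stride = 1; stride < clusterSize; stride *= 2) {
    std::vector<int> next(cur.size());
    for (std::size_t lane = 0; lane < cur.size(); ++lane)
      next[lane] = cur[lane] + cur[lane ^ stride]; // partner stays in-cluster
    cur = next;
  }
  return cur;
}

// Non-clustered form expressed through the same path: a single cluster that
// covers the entire subgroup.
std::vector<int> subgroupReduceAdd(const std::vector<int> &lanes) {
  return clusteredReduceAdd(lanes, lanes.size());
}
```

With eight lanes holding 1 through 8, a cluster size of 4 leaves 10 in lanes 0-3 and 26 in lanes 4-7, while the non-clustered call leaves 36 in every lane. Both go through the same code path.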

https://github.com/llvm/llvm-project/pull/133204