[Mlir-commits] [mlir] [AMDGPU] Implement gpu.subgroup_reduce with DPP intrinsics on AMD GPUs (PR #133204)
Jakub Kuderski
llvmlistbot at llvm.org
Thu Apr 3 18:22:13 PDT 2025
================
@@ -362,6 +364,106 @@ struct VectorSubgroupReduceToShuffles final
   unsigned shuffleBitwidth = 0;
   bool matchClustered = false;
 };
+
+Value createSubgroupDPPReduction(OpBuilder &b, Location loc, Value input,
+                                 gpu::AllReduceOperation mode,
+                                 const ClusterInfo &ci) {
+  Value result = input;
+  if (ci.clusterSize >= 2) {
+    auto permArg = b.getIntegerAttr(b.getIntegerType(32), 1);
----------------
kuhar wrote:
You can use `b.getI32IntegerAttr(1)` here. The same applies to the similar calls below.
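
For illustration, the change to the quoted line would look roughly like this. It is a sketch that covers only the line shown above and assumes the variable name `permArg` stays the same; the other call sites are not reproduced here.

```diff
-    auto permArg = b.getIntegerAttr(b.getIntegerType(32), 1);
+    auto permArg = b.getI32IntegerAttr(1);
```

Both forms build a 32-bit `IntegerAttr` with value 1; the helper just avoids spelling out the integer type.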
https://github.com/llvm/llvm-project/pull/133204