[Mlir-commits] [mlir] [mlir][AMDGPU] Implement gpu.subgroup_reduce with DPP intrinsics on AMD GPUs (PR #133204)
Krzysztof Drewniak
llvmlistbot at llvm.org
Wed Apr 16 13:53:13 PDT 2025
================
@@ -362,6 +366,164 @@ struct VectorSubgroupReduceToShuffles final
unsigned shuffleBitwidth = 0;
bool matchClustered = false;
};
+
+std::optional<Value> createSubgroupDPPReduction(OpBuilder &b, Location loc,
+ Value input,
+ gpu::AllReduceOperation mode,
+ const ClusterInfo &ci,
+ amdgpu::Chipset chipset) {
+ Value result = input;
+ constexpr int allRows = 0xf;
+ constexpr int allBanks = 0xf;
+ const bool boundCtrl = true;
+ Value lane0 =
+ b.create<arith::ConstantOp>(loc, b.getI32Type(), b.getI32IntegerAttr(0));
+ Value lane32 =
----------------
krzysz00 wrote:
Let's only create these on the branches where they get used, to avoid polluting the IR with random constants.
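
To illustrate the suggestion, here is a minimal, self-contained sketch. It does not use the MLIR API: `ToyBuilder`, `emitEager`, `emitLazy`, the op names, and the cluster-size threshold of 64 are all hypothetical stand-ins chosen for illustration, not code from the PR. It only contrasts materializing constants up front with materializing them inside the branch that consumes them.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an IR builder: it only records which ops were created.
struct ToyBuilder {
  std::vector<std::string> ops;
  // Records the op and returns its index as a toy "Value" handle.
  int create(const std::string &name) {
    ops.push_back(name);
    return static_cast<int>(ops.size()) - 1;
  }
};

// Eager style: both constants are emitted even when no branch reads them.
void emitEager(ToyBuilder &b, unsigned clusterSize) {
  int lane0 = b.create("constant 0");
  int lane32 = b.create("constant 32");
  (void)lane0; // never consumed in this toy example
  if (clusterSize >= 64) { // assumed threshold, for illustration only
    (void)lane32;
    b.create("readlane");
  }
}

// Lazy style: the constant is emitted only on the branch that uses it.
void emitLazy(ToyBuilder &b, unsigned clusterSize) {
  if (clusterSize >= 64) { // assumed threshold, for illustration only
    int lane32 = b.create("constant 32");
    (void)lane32;
    b.create("readlane");
  }
}
```

With a cluster size of 16, the eager version in this sketch records two unused constants while the lazy version records nothing; with 64, the lazy version records just the one constant and the op that consumes it.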
https://github.com/llvm/llvm-project/pull/133204