[Mlir-commits] [mlir] [mlir][gpu] Add `subgroup_broadcast` op (PR #152808)

Thu Sep 4 09:57:31 PDT 2025

fabianmcg wrote:

I'll preface this comment by saying that I'm not trying to be pedantic. I'm trying to reach a common understanding and move forward.

> And it's not a "random" lane - it's an _arbitrary_ lane. Any lowering to any kind of broadcast operation is a valid lowering for a broadcast from `any_lane` is valid - possibly including the one where it's a nop, but you're supposed to do a broadcast from _something_.

Arbitrary is a synonym for random. https://www.merriam-webster.com/dictionary/arbitrary
My point is that `any_lane` implies randomness because it doesn't describe which lane to choose. As such, it is not a good abstraction. That's why I think a `uniform_value` op is better. And on targets like AMD, this can be promoted to a VGPR-to-SGPR conversion.

> And re the pattern match on `first_active_lane_id` - because first_active_lane_id isn't really a primitive or a common abstraction, I'm not sure we should add such a thing. And also, I don't trust that pattern match. Having "read first active lane", which is a thing that really exists, as an operation (or a mode of an operation) feels quite reasonable to me.

I agree that it's not a common primitive; I'm just saying it's one option for modeling this. I don't see why a pattern match would present any issues. But how about:

- Add the `gpu.uniform_value`.
- Remove the mode flag from the broadcast op. If no lane is passed, the op broadcasts from the first active lane. If a lane is passed, it broadcasts from that lane.
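
To make the proposed semantics concrete, here is a toy Python model of a single subgroup. It is only a sketch, not MLIR and not part of the PR: the function names, the list-based lane representation, and the assertion in `uniform_value` are all illustrative assumptions. What happens when the requested lane is inactive, or when no lane is active, is deliberately left out.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def subgroup_broadcast(values: Sequence[T], active: Sequence[bool],
                       lane: Optional[int] = None) -> List[Optional[T]]:
    """Toy model: values[i] is lane i's operand, active[i] its execution mask."""
    if lane is None:
        # No lane passed: read from the first active lane.
        # (Assumes at least one lane is active.)
        lane = next(i for i, is_on in enumerate(active) if is_on)
    src = values[lane]
    # Every active lane receives the source value; inactive lanes yield None.
    return [src if is_on else None for is_on in active]


def uniform_value(values: Sequence[T], active: Sequence[bool]) -> T:
    """Toy model of a hypothetical `gpu.uniform_value`.

    The caller promises that all active lanes already hold the same value,
    so no particular source lane has to be named. The assert only exists to
    check that promise in this model.
    """
    live = [v for v, is_on in zip(values, active) if is_on]
    assert all(v == live[0] for v in live), "operand is not subgroup-uniform"
    return live[0]


values = [10, 11, 12, 13]
active = [False, True, True, False]

first_active = subgroup_broadcast(values, active)       # source is lane 1
explicit = subgroup_broadcast(values, active, lane=2)   # source is lane 2
uniform = uniform_value([7, 7, 7, 9], active=[True, True, True, False])
```

In this model the broadcast op always has a well-defined source lane, while `uniform_value` carries the "already uniform" guarantee separately, which is the split suggested above.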

https://github.com/llvm/llvm-project/pull/152808