[Mlir-commits] [mlir] [mlir][gpu] Add `subgroup_broadcast` op (PR #152808)

Thu Sep 4 09:39:58 PDT 2025

krzysz00 wrote:

> That's not the point. Tell me how the compiler chooses which lane to take? The only way to do it is to take a random lane, but that would mean all lanes hold the same value, at which point what you have is gpu.uniform_value. I'm arguing that's a better abstraction than an overloaded broadcast.

I don't have strong opinions on whether `gpu.uniform_value` is a better abstraction than sneaking an extra case into a broadcast op.

The reason I'd think you'd put it here is that, while an `any_lane` broadcast *could* be a nop, it has hinting semantics: I *want* this value to be treated as uniform / promoted to uniform registers / ... where that's a thing on my target. So there is a sense in which you will perform _some_ sort of broadcast to implement this, where such an operation exists.

And it's not a "random" lane, it's an *arbitrary* lane. Any lowering to any kind of broadcast operation is a valid lowering for a broadcast from `any_lane`, possibly including the one where it's a nop, but you're supposed to do a broadcast from _something_.
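
To make the "arbitrary, not random" distinction concrete, here is a toy Python model of a subgroup. It is only a sketch, not MLIR and not the op's actual specification. Every name in it (`first_active`, `last_active`, `nop`, `is_valid_any_lane_lowering`) is hypothetical, and it assumes that a valid `any_lane` result is one where every active lane ends up with the same value and that value came from some active lane.

```python
from typing import Callable, List, Optional, Sequence

# Toy model: one entry per lane; None marks a lane that is inactive.
Lanes = List[Optional[int]]


def active_lanes(mask: Sequence[bool]) -> List[int]:
    """Indices of the lanes that are currently executing."""
    return [i for i, on in enumerate(mask) if on]


def broadcast_from(values: Sequence[int], mask: Sequence[bool], src: int) -> Lanes:
    """Every active lane receives the value held by lane `src`."""
    return [values[src] if on else None for on in mask]


def first_active(values: Sequence[int], mask: Sequence[bool]) -> Lanes:
    """Hypothetical lowering: read the first active lane."""
    return broadcast_from(values, mask, active_lanes(mask)[0])


def last_active(values: Sequence[int], mask: Sequence[bool]) -> Lanes:
    """Hypothetical lowering: read the last active lane."""
    return broadcast_from(values, mask, active_lanes(mask)[-1])


def nop(values: Sequence[int], mask: Sequence[bool]) -> Lanes:
    """Hypothetical lowering: leave every active lane's value untouched."""
    return [v if on else None for v, on in zip(values, mask)]


def is_valid_any_lane_lowering(
    lowering: Callable[[Sequence[int], Sequence[bool]], Lanes],
    values: Sequence[int],
    mask: Sequence[bool],
) -> bool:
    """Assumed rule: all active lanes agree, and the agreed value
    was held by some active lane before the broadcast."""
    result = lowering(values, mask)
    active = active_lanes(mask)
    seen = {result[i] for i in active}
    return len(seen) == 1 and seen <= {values[i] for i in active}
```

Under these assumptions, both `first_active` and `last_active` pass the check for any input, since each one broadcasts from _something_. The `nop` only passes when the active lanes already hold the same value.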

And re the pattern match on `first_active_lane_id`: because `first_active_lane_id` isn't really a primitive or a common abstraction, I'm not sure we should add such a thing. And also, I don't trust that pattern match. Having "read first active lane", which is a thing that really exists, as an operation (or a mode of an operation) feels quite reasonable to me.

https://github.com/llvm/llvm-project/pull/152808