[Mlir-commits] [mlir] [mlir][gpu] Add `broadcast_lane` op (PR #152808)
Ivan Butygin
llvmlistbot at llvm.org
Sat Aug 9 02:37:18 PDT 2025
Hardcode84 wrote:
> How is the broadcast first_lane semantics different than gpu.shuffle?
Unlike `gpu.shuffle`, the result of the broadcast is guaranteed to be uniform across the subgroup, which can enable a more efficient lowering (e.g. using scalar registers instead of vector registers on AMDGPU). Regarding the `any_lane` option, it follows the same broadcast logic (take the value from some lane and make it uniform across the whole subgroup); the only difference is that the user guarantees the input value is already uniform, so the compiler can choose any lane to read from and still put the result into a scalar register. And unlike `first_lane`, `any_lane` provides more relaxed speculation guarantees: `first_lane` cannot be speculated across control flow because that can change the set of active lanes, but `any_lane` can, since all of its inputs are known to be uniform already.
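To make the distinction concrete, here is a rough sketch of how the two modes might be written. The op and mode names follow the PR title and this discussion; the exact assembly format is whatever the patch itself defines, so treat the syntax below as illustrative only:

```mlir
// Hypothetical usage sketch, not the authoritative syntax from the PR.
func.func @example(%val: f32, %uniform_val: f32) {
  // first_lane: take the value from the first active lane; the result is
  // uniform across the subgroup and can live in a scalar register on AMDGPU.
  %r0 = gpu.broadcast_lane %val first_lane : f32
  // any_lane: the input is already known to be uniform, so the compiler may
  // read it from any lane, which also permits speculation across control flow.
  %r1 = gpu.broadcast_lane %uniform_val any_lane : f32
  return
}
```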
`any_lane` and speculation were one of the original motivations for this op (see https://github.com/llvm/llvm-project/pull/152740#issuecomment-3168692683 for the technical details).
https://github.com/llvm/llvm-project/pull/152808