[llvm] [WIP][AMDGPU] combine uniform AMDGPU lane Intrinsics (PR #116953)

Sameer Sahasrabuddhe via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 5 23:31:13 PST 2024


ssahasra wrote:

> I am concerned this does not have enough context to be definitively correct after later transformations. Consider a case like this:
> 
> ```
>   if (ballot_all(x)) {
>     uniform_x = readfirstlane x
>     speculatable_use(uniform_x)
>     ...
>   }
> ```

If `speculatable_use()` depends on the uniformity of its argument, then shouldn't it be marked convergent? Clearly it has cross-thread semantics and should not be moved across control flow.
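
To make the scenario concrete, here is a toy Python model of a four-lane wave. It is only a sketch under simplifying assumptions, not LLVM or hardware semantics, and every helper name in it is made up for illustration. It shows that folding `readfirstlane x` to `x` changes nothing while the `ballot_all(x)` guard holds, but that the folded use would see divergent values if it were later speculated to a point where the guard does not hold.

```python
# Toy model of one wave: a list of per-lane values plus an exec mask.
# All names here are illustrative assumptions, not real LLVM APIs.

def readfirstlane(values, exec_mask):
    """Return the value held by the lowest-numbered active lane."""
    for value, active in zip(values, exec_mask):
        if active:
            return value
    raise ValueError("no active lanes")


def ballot_all(pred, exec_mask):
    """True when the predicate holds in every active lane."""
    return all(p for p, active in zip(pred, exec_mask) if active)


def use_with_readfirstlane(x, exec_mask):
    """Per-lane argument of the use when readfirstlane is kept."""
    first = readfirstlane(x, exec_mask)
    return [first if active else None for active in exec_mask]


def use_after_fold(x, exec_mask):
    """Per-lane argument of the use once readfirstlane(x) is folded to x."""
    return [value if active else None for value, active in zip(x, exec_mask)]


exec_mask = [True, True, True, True]

# Guard holds: x is uniform, so the fold is invisible inside the `if`.
x_uniform = [True, True, True, True]
guard_taken = ballot_all(x_uniform, exec_mask)
inside_kept = use_with_readfirstlane(x_uniform, exec_mask)
inside_folded = use_after_fold(x_uniform, exec_mask)

# Guard fails: x is divergent. This models the use being evaluated at a
# point above the guard, which is what speculation would allow.
x_divergent = [True, False, True, True]
guard_skipped = not ballot_all(x_divergent, exec_mask)
hoisted_kept = use_with_readfirstlane(x_divergent, exec_mask)
hoisted_folded = use_after_fold(x_divergent, exec_mask)
```

In this model `inside_kept` and `inside_folded` are identical, while `hoisted_folded` differs across lanes and `hoisted_kept` does not. Only a use that actually relies on a uniform argument would notice the difference, which is the case the question above is about.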

Alternatively, @jayfoad, is this a use case for `readanylane`? Perhaps this pass can replace `readfirstlane` with `readanylane`, which eventually becomes a no-op, sufficiently far down the codegen flow, if `x` is in an sreg?

https://github.com/llvm/llvm-project/pull/116953

