[llvm] [WIP][AMDGPU] combine uniform AMDGPU lane Intrinsics (PR #116953)
Sameer Sahasrabuddhe via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 5 23:31:13 PST 2024
ssahasra wrote:
> I am concerned this does not have enough context to be definitively correct after later transformations. Consider a case like this:
>
> ```
> if (ballot_all(x)) {
>   uniform_x = readfirstlane x
>   speculatable_use(uniform_x)
>   ...
> }
> ```
If `speculatable_use()` depends on the uniformity of its argument, then shouldn't it be marked convergent? Clearly it has cross-thread semantics and should not be moved across control flow.
Alternatively, @jayfoad, is this a use case for `readanylane`? Perhaps this pass could replace `readfirstlane` with `readanylane`, which eventually becomes a nop, sufficiently far down in the codegen flow, if `x` is in an sreg?
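To make the hazard in the quoted example concrete, here is a toy lane model in Python. It is only a sketch of how I read the concern, not the actual intrinsic semantics: the helpers `readfirstlane` and `is_uniform` below are hypothetical stand-ins that operate on plain lists, with one list element per lane.

```python
def readfirstlane(values, active):
    """Toy model: every active lane receives the first active lane's value."""
    # Assumes at least one lane is active.
    first = active.index(True)
    return [values[first] if a else None for a in active]


def is_uniform(values, active):
    """True if all active lanes hold the same value."""
    return len({v for v, a in zip(values, active) if a}) <= 1


all_lanes = [True, True, True, True]

uniform_x = [7, 7, 7, 7]
divergent_x = [7, 3, 7, 7]

# Uniform input: readfirstlane is the identity, so folding it away is harmless.
assert readfirstlane(uniform_x, all_lanes) == uniform_x

# Divergent input: readfirstlane changes what the lanes see, so a fold that was
# justified only by the guard stops being valid if the use moves above the guard.
assert not is_uniform(divergent_x, all_lanes)
assert readfirstlane(divergent_x, all_lanes) != divergent_x
```

In this model the fold `readfirstlane x` to `x` is safe only where `x` is known to be uniform, which is why a use that cares about uniformity should not be free to move out of the guarded region.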
https://github.com/llvm/llvm-project/pull/116953