[llvm] [WIP][AMDGPU] combine uniform AMDGPU lane Intrinsics (PR #116953)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 6 02:08:38 PST 2024
jayfoad wrote:
> If speculatable_use() depends on the uniformity of its argument, then shouldn't it be marked convergent? Clearly it has cross-thread semantics and should not be moved in the control flow.
Agreed. The only kind of `speculatable_use` I can think of that _might_ be affected by this is a simple SALU-only logical operation like this one, where we choose to legalize VGPR inputs with readfirstlane instead of a waterfall loop:
```
// Lowers to S_BITREPLICATE_B64_B32.
// The argument must be uniform; otherwise, the result is undefined.
def int_amdgcn_s_bitreplicate :
  DefaultAttrsIntrinsic<[llvm_i64_ty], [llvm_i32_ty], [IntrNoMem, IntrConvergent]>;
```
But it is already marked as convergent, so I don't think there's a problem here after all.
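
To make the concern concrete, here is a rough host-side C++ sketch of why a readfirstlane-based legalization only gives a meaningful result for uniform inputs. It is not LLVM code: `bitReplicate`, `readFirstLane`, and `isUniform` are hypothetical helper names, the wave is modelled as a plain vector plus an exec mask, and the bit-doubling behaviour is my reading of S_BITREPLICATE_B64_B32 (each input bit `i` becomes result bits `2i` and `2i+1`).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed semantics of S_BITREPLICATE_B64_B32: every bit of the 32-bit
// input is doubled, so input bit i lands in result bits 2i and 2i+1.
uint64_t bitReplicate(uint32_t x) {
  uint64_t result = 0;
  for (int i = 0; i < 32; ++i) {
    uint64_t bit = (x >> i) & 1u;
    result |= (bit * 3ull) << (2 * i); // 3 == 0b11: two copies of the bit
  }
  return result;
}

// Hypothetical model of readfirstlane: returns the value held by the
// lowest-numbered lane whose bit is set in execMask.
uint32_t readFirstLane(const std::vector<uint32_t> &lanes, uint64_t execMask) {
  for (std::size_t i = 0; i < lanes.size() && i < 64; ++i)
    if ((execMask >> i) & 1)
      return lanes[i];
  return 0; // no active lanes; arbitrary choice for this sketch
}

// True if every active lane holds the same value.
bool isUniform(const std::vector<uint32_t> &lanes, uint64_t execMask) {
  uint32_t first = readFirstLane(lanes, execMask);
  for (std::size_t i = 0; i < lanes.size() && i < 64; ++i)
    if (((execMask >> i) & 1) && lanes[i] != first)
      return false;
  return true;
}
```

In this model, `bitReplicate(readFirstLane(lanes, exec))` matches the per-lane answer only when `isUniform(lanes, exec)` holds. Since which lanes are active depends on where the call sits in the control flow, moving the call can change both the uniformity and the value that readfirstlane picks, which is the cross-thread dependence that `IntrConvergent` guards against.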
https://github.com/llvm/llvm-project/pull/116953