[llvm] [AMDGPU] - Generate s_bitreplicate_b64_b32 (PR #69209)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 23 01:12:43 PDT 2023
jayfoad wrote:
> The intrinsic is marked `convergent` precisely so that the waterfall loop isn't needed.
I don't understand. What is the definition of the intrinsic? What result are you supposed to get if you give it a non-uniform input? (Maybe we should actually start documenting these instruction-like intrinsics, so we're not just arguing about the definition in a vacuum.)
https://github.com/llvm/llvm-project/pull/69209