[llvm] [AMDGPU] - Generate s_bitreplicate_b64_b32 (PR #69209)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 23 01:12:43 PDT 2023


jayfoad wrote:

> The intrinsic is marked `convergent` precisely so that the waterfall loop isn't needed.

I don't understand. What is the definition of the intrinsic? What result are you supposed to get if you give it a non-uniform input? (Maybe we should actually start documenting these instruction-like intrinsics, so we're not just arguing about the definition in a vacuum.)
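
For context, here is a minimal sketch of the scalar operation being discussed. It assumes the usual description of `s_bitreplicate_b64_b32`: each bit of the 32-bit source is duplicated into two adjacent bits of the 64-bit result. The function name `bitreplicate_b64_b32` is a hypothetical reference model, not the LLVM intrinsic or its lowering. It models a single scalar value only, so it does not answer the question above about what a non-uniform input should produce.

```cpp
#include <cstdint>

// Hypothetical reference model (an assumption for illustration):
// bit i of the 32-bit input is copied to bits 2*i and 2*i+1 of the result.
uint64_t bitreplicate_b64_b32(uint32_t src) {
  uint64_t result = 0;
  for (unsigned i = 0; i < 32; ++i) {
    uint64_t bit = (src >> i) & 1u;    // extract input bit i
    result |= (bit * 3ull) << (2 * i); // 0b11 if the bit is set, else 0
  }
  return result;
}
```

Under this model, an input of `0b101` would yield `0b110011`.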

https://github.com/llvm/llvm-project/pull/69209

