[llvm] [AMDGPU] - Generate s_bitreplicate_b64_b32 (PR #69209)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 16 07:37:09 PDT 2023


jayfoad wrote:

> Support VGPR arguments by inserting a v_readfirstlane.

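Unfortunately this is not correct behaviour. v_readfirstlane only reads the value held by the first active lane, so if the argument is divergent, every lane whose value differs from that one gets the wrong result. You would need to either generate a waterfall loop, or find some other way to implement the bitreplicate operation using VALU instructions.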
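To make the difference concrete, here is a rough Python model of the two lowerings. It is only a sketch: a wave is modelled as a list of per-lane values plus a list of active flags, and the names `readfirstlane_lowering` and `waterfall_lowering` are hypothetical, not anything in the backend. It assumes bitreplicate copies bit i of the 32-bit source into bits 2i and 2i+1 of the 64-bit result.

```python
def bitreplicate_b64_b32(x):
    """Reference model: bit i of the 32-bit input goes to bits 2i and 2i+1."""
    x &= 0xFFFFFFFF
    result = 0
    for i in range(32):
        if (x >> i) & 1:
            result |= 0b11 << (2 * i)
    return result


def readfirstlane_lowering(values, exec_mask):
    """Model of the single v_readfirstlane approach.

    Every active lane receives the result computed from the first active
    lane's value, whatever its own value was.
    """
    first = exec_mask.index(True)
    scalar = bitreplicate_b64_b32(values[first])
    return [scalar if active else None for active in exec_mask]


def waterfall_lowering(values, exec_mask):
    """Model of a waterfall loop.

    Each iteration picks the first remaining lane's value, does the scalar
    operation once, and retires every lane that holds the same value.
    """
    results = [None] * len(values)
    remaining = list(exec_mask)
    while any(remaining):
        current = values[remaining.index(True)]   # models v_readfirstlane
        scalar = bitreplicate_b64_b32(current)    # models the scalar op
        for lane, active in enumerate(remaining):
            if active and values[lane] == current:
                results[lane] = scalar
                remaining[lane] = False
    return results


if __name__ == "__main__":
    values = [1, 2, 1, 2]
    exec_mask = [True, True, True, True]
    print(readfirstlane_lowering(values, exec_mask))
    print(waterfall_lowering(values, exec_mask))
```

With a uniform argument the two models agree and the waterfall loop runs once. With a divergent argument only the waterfall version gives each lane the result for its own value, at the cost of one iteration per distinct value.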

The problem is that, even if the input program only uses bitreplicate on uniform arguments, the compiler could theoretically transform input like this:
```
  result = divergent_condition ? bitreplicate(uniform_input) : bitreplicate(another_uniform_input);
```
into this:
```
  result = bitreplicate(divergent_condition ? uniform_input : another_uniform_input);
```
so the backend needs to be prepared to handle divergent arguments.
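As a small illustration of the transformation above, the following Python sketch evaluates both forms per lane. The four-lane wave, the concrete input values, and the helper `bitreplicate` are all made up for the example; it assumes bit i of the source is copied into bits 2i and 2i+1 of the result.

```python
def bitreplicate(x):
    """Per-lane reference model: duplicate each of the low 32 bits of x."""
    return sum(0b11 << (2 * i) for i in range(32) if (x >> i) & 1)


# Hypothetical inputs: two uniform values and a condition that differs per lane.
uniform_input = 0b01
another_uniform_input = 0b10
divergent_condition = [True, False, True, False]

# Original form: bitreplicate is only ever applied to uniform arguments.
before = [
    bitreplicate(uniform_input) if c else bitreplicate(another_uniform_input)
    for c in divergent_condition
]

# Transformed form: the select is hoisted, so the argument is now divergent.
selected = [
    uniform_input if c else another_uniform_input for c in divergent_condition
]
after = [bitreplicate(v) for v in selected]

# Lowering the transformed form through the first lane's value only.
first_lane_only = [bitreplicate(selected[0])] * len(selected)

if __name__ == "__main__":
    print(before == after)           # the transformation preserves semantics
    print(first_lane_only == after)  # the first-lane lowering does not
```

The two source forms compute the same per-lane results, so the transformation is legal, but after it the single bitreplicate sees an argument that varies across lanes.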

@arsenm @nhaehnle please correct me if I am wrong about this.

https://github.com/llvm/llvm-project/pull/69209

