[llvm] [AMDGPU] Cleanup hasUnwantedEffectsWhenEXECEmpty function (PR #70206)

Carl Ritson via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 26 02:20:05 PDT 2023


perlfu wrote:

Out of caution, I think this should be reverted.
`V_READLANE` and `V_WRITELANE` always execute (as if they were scalar instructions), regardless of the EXEC mask.

The lack of test coverage in the code base is probably part of the reason this looks like an OK change.
I will see if I can find some test cases where this matters and add them -- if there are none then maybe this is fine.

The pre-existing comment is misleading, which probably added to the confusion.
```
// These are like SALU instructions in terms of effects, so it's questionable
// whether we should return true for those.
//
// However, executing them with EXEC = 0 causes them to operate on undefined
// data, which we avoid by returning true here.
```
The comment first suggests the check could be removed, but then says the instructions operate on undefined data.
Undefined behaviour at the instruction level only exists for `V_READFIRSTLANE` with EXEC=0.
However you can imagine a sequence like this:
```
v_cmp_eq_u32 s0, v0, 0.0
v_writelane_b32 v1, s0, 0
```
Here the result of `v_cmp` is affected by EXEC=0, but `v_writelane` still uses the value.
As @ruiling stated, any write to an SGPR could be an issue.
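
To make the hazard concrete, here is a toy C++ model of that two-instruction sequence. It is only a sketch, not LLVM or hardware code: the names `toyCmpEq`, `toyWriteLane`, and `toySequence` are made up for illustration, it assumes wave32, and it assumes the compare produces a 0 bit for every lane that is inactive in EXEC.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 32; // assumption: wave32

// Toy model of a compare that writes an SGPR lane mask.
// Assumption: a result bit is set only if the lane is active in EXEC
// and the comparison holds for that lane.
uint32_t toyCmpEq(const std::array<uint32_t, kLanes> &v0, uint32_t value,
                  uint32_t exec) {
  uint32_t mask = 0;
  for (int lane = 0; lane < kLanes; ++lane) {
    bool active = (exec >> lane) & 1u;
    if (active && v0[lane] == value)
      mask |= (1u << lane);
  }
  return mask;
}

// Toy model of v_writelane_b32: the write happens regardless of EXEC,
// so the function does not even take an EXEC argument.
void toyWriteLane(std::array<uint32_t, kLanes> &vdst, uint32_t sval,
                  int lane) {
  vdst[lane] = sval;
}

// Runs the two-instruction sequence and returns lane 0 of v1 afterwards.
uint32_t toySequence(uint32_t exec) {
  std::array<uint32_t, kLanes> v0{}; // every lane holds 0
  std::array<uint32_t, kLanes> v1{};
  v1.fill(0xDEADBEEFu); // sentinel, to make the write visible
  uint32_t s0 = toyCmpEq(v0, 0u, exec);
  toyWriteLane(v1, s0, 0);
  return v1[0];
}
```

In this model `toySequence(0)` overwrites lane 0 of `v1` with a mask that only reflects the empty EXEC, which is the kind of silent data dependence the check has to guard against.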

Taking a step back, the problem exists because we peephole remove `s_cbranch_execz` irrespective of whether it is part of uniform or non-uniform control flow.
We would only expect it to occur as part of non-uniform control flow, and executing scalar operations as part of non-uniform control flow should be fine, because it is just the same as executing with any number of lanes active.
However, we do not know for sure the purpose or origin of the `s_cbranch_execz` instructions, which is why we must be cautious.
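
As a rough sketch of what "being cautious" means for the peephole, the toy C++ below only allows the branch to be removed when nothing in the skipped block is flagged as unsafe with EXEC=0. This is not the actual LLVM implementation: `ToyOpcode`, `toyUnsafeWhenExecEmpty`, and `toyCanRemoveExeczBranch` are hypothetical names, and the opcode list is deliberately reduced to the instructions discussed in this comment.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily reduced opcode set for illustration only.
enum class ToyOpcode { VectorAlu, ReadLane, WriteLane, ReadFirstLane };

// Returns true if running the instruction with EXEC=0 could be a problem.
// The lane-access instructions are flagged because they do not respect
// the EXEC mask (or, for readfirstlane, are undefined with EXEC=0).
bool toyUnsafeWhenExecEmpty(ToyOpcode op) {
  switch (op) {
  case ToyOpcode::ReadLane:
  case ToyOpcode::WriteLane:
  case ToyOpcode::ReadFirstLane:
    return true;
  case ToyOpcode::VectorAlu:
    return false; // masked by EXEC, so a no-op when EXEC=0
  }
  return true; // anything unknown is treated conservatively
}

// The branch over a block may be dropped only if every instruction
// in the skipped block is harmless when executed with EXEC=0.
bool toyCanRemoveExeczBranch(const std::vector<ToyOpcode> &skipped) {
  for (ToyOpcode op : skipped)
    if (toyUnsafeWhenExecEmpty(op))
      return false;
  return true;
}
```

With this shape, a block containing a single `WriteLane` keeps its branch, which matches the conservative behaviour argued for above.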


https://github.com/llvm/llvm-project/pull/70206

