[llvm] [AMDGPU] Add intrinsic readanylane (PR #115696)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 13 03:17:43 PST 2024
jayfoad wrote:
> If the goal is to allow certain defs to be sunk into divergent control flow, then maybe this particular situation is sufficient justification to add a new attribute that allows sinking. This can be combined with a builtin that declares "uniform at use". A combination of these two will allow the A --> B transformation.
Would this new attribute be in addition to `convergent`, or instead of it? I still think `readanylane` should be `convergent` because it depends on the set of active threads, so I see this as a special case where we know the rules can be relaxed: in general `readanylane` should not be moved past divergent control flow, but in the specific case of sinking into an `if` it is OK.
FYI previously, probably before most of your work on convergence, someone proposed splitting `convergent` into two separate attributes, one to prevent sinking (adding additional control dependencies) and one to prevent hoisting (removing control dependencies).
Let me add one more example:
```
A:                           B:                           C:
x = def();                   x = def();                   x = def();
y = read*lane(x);            if (divergentcondition) {    y = read*lane(x);
if (divergentcondition) {      y = read*lane(x);          if (divergentcondition) {
  use(y);                      use(y);                      z = read*lane(x);
}                            }                              use(z);
                                                          }
```
CSE will transform C-->A if we don't prevent it in some way. That's one of the reasons for marking `readanylane` as `convergent`.
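To make the C-->A concern concrete, here is a rough sketch in Python rather than LLVM code. It assumes first-active-lane semantics (as `readfirstlane` has) and a deliberately divergent `x`, so that the dependence on the set of active lanes is observable. The names `read_first_lane`, `variant_a` and `variant_c` are hypothetical helpers made up for this illustration.

```python
def read_first_lane(values, exec_mask):
    """Toy model: return the value held by the lowest-numbered active lane."""
    for lane, active in enumerate(exec_mask):
        if active:
            return values[lane]
    raise ValueError("no active lanes")


def variant_c(x, cond):
    """Variant C: z is read inside the divergent branch."""
    all_lanes = [True] * len(x)
    y = read_first_lane(x, all_lanes)  # executed by every lane; unused, as in C
    if not any(cond):
        return None                    # no lane enters the branch
    z = read_first_lane(x, list(cond))  # only lanes where cond holds are active
    return z                           # the value passed to use(z)


def variant_a(x, cond):
    """Variant A: what C becomes if CSE replaces z with y."""
    all_lanes = [True] * len(x)
    y = read_first_lane(x, all_lanes)  # read with all lanes active
    if not any(cond):
        return None
    return y                           # the value passed to use(y)


# Four lanes, x differs per lane, only lanes 2 and 3 take the branch.
x = [10, 20, 30, 40]
cond = [False, False, True, True]

result_c = variant_c(x, cond)  # first active lane inside the branch is lane 2
result_a = variant_a(x, cond)  # first active lane outside the branch is lane 0

# With a value that is the same in every lane the two variants agree.
uniform = [7, 7, 7, 7]
uniform_c = variant_c(uniform, cond)
uniform_a = variant_a(uniform, cond)
```

In this model the two variants disagree only when `x` differs between lanes, so whether the rewrite is observable depends on what is known about `x` at the point of the read.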
https://github.com/llvm/llvm-project/pull/115696