[llvm] [AMDGPU] Add intrinsic readanylane (PR #115696)

Wed Nov 13 03:17:43 PST 2024

jayfoad wrote:

> If the goal is to allow certain defs to be sunk into divergent control flow, then maybe this particular situation is sufficient justification to add a new attribute that allows sinking. This can be combined with a builtin that declares "uniform at use". A combination of these two will allow the A --> B transformation.

Would this new attribute be in addition to `convergent`, or instead of it? I still think `readanylane` should be `convergent` because it depends on the set of active threads. I see this as a special case where we know the rules can be relaxed: in general `readanylane` should not be moved past divergent control flow, but in the specific case of sinking into an `if` it is OK.

FYI previously, probably before most of your work on convergence, someone proposed splitting `convergent` into two separate attributes, one to prevent sinking (adding additional control dependencies) and one to prevent hoisting (removing control dependencies).

Let me add one more example:
```
A: x = def();                           B: x = def();                           C: x = def();
   y = read*lane(x);                       if (divergentcondition) {               y = read*lane(x);
   if (divergentcondition) {                 y = read*lane(x);                     if (divergentcondition) {
     use(y);                                 use(y);                                 z = read*lane(x);
   }                                       }                                         use(z);
                                                                                   }
```
CSE will transform C-->A if we don't prevent it in some way. That's one of the reasons for marking `readanylane` as `convergent`.
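
To make the C-->A concern concrete, here is a toy Python model of a single wave. It is only a sketch: the helper names (`read_first_lane`, `variant_a`, `variant_b`) are made up for illustration, and modelling `read*lane` as "take the value from the lowest active lane" is an assumption (roughly `readfirstlane` behaviour, and just one choice a `readanylane` lowering might make). The inner read in C sees the same active set as the read in B, so C is observably the same as B; if CSE folds `z` into `y`, C turns into A.

```python
def read_first_lane(x, active):
    """Toy read*lane: return the value held by the lowest active lane.

    Assumption for illustration only; a real readanylane may pick any lane.
    """
    return x[min(active)]


def variant_a(x, cond):
    """A: read while all lanes are active, then use inside the divergent if."""
    all_lanes = set(range(len(x)))
    y = read_first_lane(x, all_lanes)
    taken = {lane for lane in all_lanes if cond[lane]}
    return {lane: y for lane in sorted(taken)}  # value seen by use() per lane


def variant_b(x, cond):
    """B (and C before CSE): read inside the if, with only the taken lanes active."""
    all_lanes = set(range(len(x)))
    taken = {lane for lane in all_lanes if cond[lane]}
    if not taken:
        return {}  # no lane enters the if, so nothing is read or used
    y = read_first_lane(x, taken)
    return {lane: y for lane in sorted(taken)}


if __name__ == "__main__":
    cond = [False, True, True, False]  # divergent condition

    divergent_x = [10, 11, 12, 13]
    print(variant_a(divergent_x, cond))  # {1: 10, 2: 10}
    print(variant_b(divergent_x, cond))  # {1: 11, 2: 11}

    uniform_x = [7, 7, 7, 7]
    print(variant_a(uniform_x, cond))    # {1: 7, 2: 7}
    print(variant_b(uniform_x, cond))    # {1: 7, 2: 7}
```

In this model the two forms only agree when `x` is uniform, which is the situation where relaxing the rule for sinking into an `if` looks safe; with a divergent `x`, merging the inner read into the outer one changes what `use` sees.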

https://github.com/llvm/llvm-project/pull/115696