[llvm] [AMDGPU] Add intrinsic readanylane (PR #115696)

Wed Nov 13 04:54:33 PST 2024

jayfoad wrote:

> > Let me add one more example:
> > ```
> > A: x = def();                           B: x = def();                           C: x = def();
> >    y = read*lane(x);                       if (divergentcondition) {               y = read*lane(x);
> >    if (divergentcondition) {                 y = read*lane(x);                     if (divergentcondition) {
> >      use(y);                                 use(y);                                 z = read*lane(x);
> >    }                                       }                                         use(z);
> >                                                                                    }
> > ```
> > CSE will transform C-->A if we don't prevent it in some way. That's one of the reasons for marking `readanylane` as `convergent`.
> 
> > This example (`C --> A`) may be fine, if I understand correctly, when `readanylane` is used. If we assume `x` is uniform in the if-block (y) and also uniform in the then-block (z), we can safely do CSE in this case.

The problem is when `x` is uniform inside the `if` but divergent outside the `if`. Then `z` is defined but `y` is undefined, so it is not OK to replace a use of `z` with a use of `y`.
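
To make that concrete, here is a toy lane-level model of example C. It is only a sketch: `read_any_lane`, the four-lane wave, and the use of `None` to stand for "undefined" are assumptions made for illustration, not the intrinsic's actual definition.

```python
from typing import Optional, Sequence


def read_any_lane(x: Sequence[int], active: Sequence[bool]) -> Optional[int]:
    """Toy model (hypothetical): return x's value if all active lanes agree.

    None stands for "undefined", i.e. x is not uniform across the active lanes
    (or no lane is active).
    """
    values = {v for v, is_active in zip(x, active) if is_active}
    return values.pop() if len(values) == 1 else None


# x is divergent across the whole wave ...
x = [7, 7, 3, 3]
all_lanes = [True, True, True, True]
# ... but uniform across the lanes that take the divergent branch.
cond = [True, True, False, False]

# Example C: y is read outside the `if`, z is read inside it.
y = read_any_lane(x, all_lanes)  # undefined: active lanes disagree
z = read_any_lane(x, cond)       # defined: every active lane holds 7

# Rewriting use(z) as use(y), which is what C --> A amounts to,
# would replace a defined value with an undefined one.
print("y =", y, "z =", z)
```

In this model the two reads differ only in the set of active lanes, which is the part CSE does not see unless the call is marked `convergent`.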

https://github.com/llvm/llvm-project/pull/115696