[llvm] [AMDGPU] Add intrinsic readanylane (PR #115696)

Wed Nov 13 04:41:43 PST 2024

GinShio wrote:

Sorry, there may be a gap in my understanding of `readanylane`. But I would prefer `@llvm.assume` if we want to mark the value as uniform when defined.

> Let me add one more example:
> 
> ```
> A: x = def();                           B: x = def();                           C: x = def();
>    y = read*lane(x);                       if (divergentcondition) {               y = read*lane(x);
>    if (divergentcondition) {                 y = read*lane(x);                     if (divergentcondition) {
>      use(y);                                 use(y);                                 z = read*lane(x);
>    }                                       }                                         use(z);
>                                                                                    }
> ```
> 
> CSE will transform C-->A if we don't prevent it in some way. That's one of the reasons for marking `readanylane` as `convergent`.

If I understand correctly how `readanylane` is used, this transformation (`C --> A`) may be fine. If we consider `x` uniform at the point where `y` is read and also in the then-block where `z` is read, we can safely do CSE in this case.

I thought what we must prevent is GVN hoisting (the following case, `B --> A`, under `uniform when used` semantics). That is one of the reasons.
```
A: x = def();                            B: x = def();
   y = readanylane(x);                     if (divergentcondition) {
   if (divergentcondition) {                  y = readanylane(x);
     use(y);                                  dosomething0(y);
   }                                        } else {
                                              z = readanylane(x);
                                              dosomething1(z);
                                            }
```
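
To make the `B --> A` concern concrete, here is a toy Python model of a 4-lane wave. It is only a sketch, not LLVM or AMDGPU code: `read_any_lane`, the `pick` parameter, and the lane values are all made up for illustration, and it assumes `x` is uniform only within each side of the divergent branch (`uniform when used`).

```python
def read_any_lane(values, active, pick=min):
    """Toy model: return the value held by some active lane.

    `pick` selects which active lane index is read. Any choice is
    allowed, which is the point of the illustration.
    """
    lanes = [i for i, on in enumerate(active) if on]
    return values[pick(lanes)]


# Hypothetical per-lane values: uniform within each branch side only.
x = [7, 7, 3, 3]
cond = [True, True, False, False]      # divergent condition
not_cond = [not c for c in cond]
all_lanes = [True] * len(x)

# Form B: each read happens under the branch's own set of active lanes.
y = read_any_lane(x, cond)             # then-side lanes all hold 7
z = read_any_lane(x, not_cond)         # else-side lanes all hold 3

# Form A (after hoisting): a single read with every lane active.
hoisted_lo = read_any_lane(x, all_lanes, pick=min)   # reads lane 0
hoisted_hi = read_any_lane(x, all_lanes, pick=max)   # reads lane 3

print(y, z, hoisted_lo, hoisted_hi)
```

In this model the reads in form B are well defined on each side, while the hoisted read in form A depends on which lane happens to be picked, so one side of the branch would see the other side's value.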

Finally, I updated the code. Currently, `readanylane` hints that the value is uniform when used.

https://github.com/llvm/llvm-project/pull/115696