[llvm] [AMDGPU] Add intrinsic readanylane (PR #115696)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 13 04:54:33 PST 2024
jayfoad wrote:
> > Let me add one more example:
> > ```
> > A:                          B:                          C:
> > x = def();                  x = def();                  x = def();
> > y = read*lane(x);           if (divergentcondition) {   y = read*lane(x);
> > if (divergentcondition) {     y = read*lane(x);         if (divergentcondition) {
> >   use(y);                     use(y);                     z = read*lane(x);
> > }                           }                             use(z);
> >                                                         }
> > ```
> > CSE will transform C-->A if we don't prevent it in some way. That's one of the reasons for marking `readanylane` as `convergent`.
>
> This example (`C --> A`) may be fine, if I understand correctly when `readanylane` is used. If we assume `x` is uniform in the if-block (`y`) and also uniform in the then-block (`z`), we can safely do CSE in this case.
The problem is when `x` is uniform inside the `if` but divergent outside the `if`. Then `z` is defined but `y` is undefined, so it is not OK to replace a use of `z` with a use of `y`.
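To make that concrete, here is a toy Python model of program C. It is only a sketch and not the intrinsic's actual definition: `read_any_lane`, `UNDEF`, and the lane lists are hypothetical names, and it assumes the result is defined only when `x` holds the same value in every active lane.

```python
UNDEF = object()  # hypothetical stand-in for an undefined result


def read_any_lane(x, active):
    """Toy model: the common value of x over the active lanes, else UNDEF."""
    values = {x[lane] for lane in active}
    return values.pop() if len(values) == 1 else UNDEF


x = [7, 7, 3, 3]        # per-lane values: divergent across the whole wave
all_lanes = [0, 1, 2, 3]
inside_if = [0, 1]      # lanes where divergentcondition is true

y = read_any_lane(x, all_lanes)  # read before the branch: x is divergent here
z = read_any_lane(x, inside_if)  # read inside the if: x is uniform here

print(y is UNDEF, z)
```

In this model `z` is a usable value while `y` is not, so rewriting `use(z)` to `use(y)` changes the program.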
https://github.com/llvm/llvm-project/pull/115696