[PATCH] D60834: [AMDGPU] Uniform values being used outside loop marked non-divergent

Wed Apr 17 15:36:30 PDT 2019

rtaylor added a comment.

In D60834#1470764 <https://reviews.llvm.org/D60834#1470764>, @arsenm wrote:

> In D60834#1470763 <https://reviews.llvm.org/D60834#1470763>, @arsenm wrote:
>
> > It sort of intuitively makes sense to me that the control flow lowering would like LCSSA. However, this should not be handled by adding it directly to the pass pipeline. You can add this as a dependency, e.g. AU.addRequiredID(LCSSAID);
> >
> > I would also like to see an IR->IR testcase showing LCSSA was implicitly run
>
>
> Actually, what really requires LCSSA? Is it DivergenceAnalysis or StructurizeCFG directly?

As I understand it, the problem is that DA does not look across blocks, so it won't see that tmp62 is actually divergent in the loop exit block (though it is uniform inside the loop). LCSSA provides a phi node in the loop exit block, where the value is divergent, which allows DA to mark it divergent so that the s_buffer_load can be lowered to a buffer_load.
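
To illustrate how a value can be uniform inside a loop yet divergent at the loop exit, here is a minimal C++ sketch. It is only a toy lane-by-lane simulation under my own assumptions, not LLVM code and not how DA is implemented; kLanes, LaneValues, isUniform and valueAfterLoop are made-up names for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical wave width, chosen only to keep the example small.
constexpr std::size_t kLanes = 4;

using LaneValues = std::array<int, kLanes>;

// True if every lane holds the same value, i.e. the value is uniform.
bool isUniform(const LaneValues& v) {
    for (std::size_t lane = 1; lane < kLanes; ++lane)
        if (v[lane] != v[0]) return false;
    return true;
}

// Simulates a wave running: i = 0; do { ++i; } while (i < tripCount[lane]);
// and returns the value of i that each lane observes after the loop.
LaneValues valueAfterLoop(const LaneValues& tripCount) {
    std::array<bool, kLanes> active;
    active.fill(true);
    LaneValues exitValue{};

    int i = 0;  // one counter for all active lanes: uniform inside the loop
    bool anyActive = true;
    while (anyActive) {
        ++i;
        anyActive = false;
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            if (!active[lane]) continue;
            if (i >= tripCount[lane]) {
                // This lane leaves now and keeps the value it saw last,
                // which is what a phi in the exit block would capture.
                active[lane] = false;
                exitValue[lane] = i;
            } else {
                anyActive = true;
            }
        }
    }
    return exitValue;
}
```

Within any single iteration all active lanes see the same i, but lanes that exit on different iterations carry different values out of the loop. In this sketch exitValue plays the role of the LCSSA phi in the exit block: it is the one place where the per-lane difference becomes visible.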

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60834/new/

https://reviews.llvm.org/D60834