[PATCH] D40547: AMDGPU: Fix copying i1 value out of loop with non-uniform exit

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 29 03:56:09 PST 2018


alex-t added a comment.

In https://reviews.llvm.org/D40547#989137, @nhaehnle wrote:

> In https://reviews.llvm.org/D40547#938894, @alex-t wrote:
>
> > If I understand everything correctly...
> >  The problem you're trying to solve is well known.
> >  You have a divergent loop exit and a value that is uniformly defined inside the loop but used outside the loop.
>
>
> More or less. However, whether the value is uniform or not doesn't really make a difference: I can change the test case so that %cc is non-uniform, and the same issue occurs. So this isn't really about DivergenceAnalysis.
>
> > Could you please look here: https://reviews.llvm.org/D40556
> > 
> > Could you use same approach?
> > 
> > You have 2 blocks: defBlock and useBlock and you want to know:
> > 
> > 1. Is useBlock control dependent on defBlock?
> > 2. If 1 is true, is defBlock's terminating branch uniform?
> > 
> > The set of control dependencies for defBlock is its post-dominance frontier set.
> > The set of control dependencies for useBlock is its post-dominance frontier set.
> > We need to check the branches that are NOT common to the two sets above.
>
> I don't think this works, but perhaps I'm misunderstanding you. In the test case which I've added, the defBlock is %for.body, and the useBlock is %for.end.
>
> %for.end post-dominates the entire loop, so its post-dominance frontier is empty.
>
> %for.body post-dominates %entry and %end.loop, so its PDF is only %mid.loop.
>
> None of that information seems to help?


for.body:
  %i = phi i32 [0, %entry], [%i.inc, %end.loop]
  %cc = icmp ult i32 %i, 4                     ; <-- definition
  br i1 %cc, label %mid.loop, label %for.end

mid.loop:
  %v = call float @llvm.amdgcn.buffer.load.f32(<4 x i32> %rsrc, i32 %tid, i32 %i, i1 false, i1 false)
  %cc2 = fcmp oge float %v, 0.0
  br i1 %cc2, label %end.loop, label %for.end  ; <-- divergent branch condition

end.loop:
  %i.inc = add i32 %i, 1
  br label %for.body

for.end:
  br i1 %cc, label %if, label %end             ; <-- use

Since the use block's PDF is empty and the def block's PDF contains only one block, "mid.loop", we only need to check whether the terminating branch of "mid.loop" is divergent.
Here it is immediately clear that "cc2" is divergent, so the branch in "mid.loop" is divergent as well.
So the use in "for.end" is divergent through its control dependence on the divergent branch in "mid.loop".
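
To make the check concrete, here is a rough Python sketch of the reasoning above, run on the CFG of this test case. It is only an illustration, not the code from either patch. The edges out of %if and %end are assumptions, since the excerpt does not show those blocks, and the set of divergent branches is hard-coded where a real implementation would ask the divergence analysis. All function names are made up for the sketch.

```python
# CFG of the test case. The edges for "if" and "end" are assumed:
# the excerpt above does not show those blocks.
SUCCS = {
    "entry":    ["for.body"],
    "for.body": ["mid.loop", "for.end"],
    "mid.loop": ["end.loop", "for.end"],
    "end.loop": ["for.body"],
    "for.end":  ["if", "end"],
    "if":       ["end"],   # assumption
    "end":      [],        # assumption: the single exit block
}

def post_dominators(succs, exit_block):
    """Iterative dataflow: pdom[b] is the set of blocks that post-dominate b."""
    blocks = set(succs)
    pdom = {b: set(blocks) for b in blocks}
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:
        changed = False
        for b in blocks - {exit_block}:
            new = {b} | set.intersection(*(pdom[s] for s in succs[b]))
            if new != pdom[b]:
                pdom[b] = new
                changed = True
    return pdom

def post_dominance_frontier(succs, pdom):
    """y is in PDF(x) if x post-dominates a successor of y
    but does not strictly post-dominate y itself."""
    pdf = {b: set() for b in succs}
    for y, y_succs in succs.items():
        for x in succs:
            reaches = any(x in pdom[s] for s in y_succs)
            strictly_pdoms_y = x != y and x in pdom[y]
            if reaches and not strictly_pdoms_y:
                pdf[x].add(y)
    return pdf

def use_is_divergent(def_block, use_block, pdf, divergent_branches):
    """Check the branches that are NOT common to the two PDF sets."""
    differing = pdf[def_block] ^ pdf[use_block]
    return any(b in divergent_branches for b in differing)

PDOM = post_dominators(SUCCS, "end")
PDF = post_dominance_frontier(SUCCS, PDOM)

# Assumption: only the branch on %cc2 in mid.loop is reported as divergent.
DIVERGENT_BRANCHES = {"mid.loop"}

RESULT = use_is_divergent("for.body", "for.end", PDF, DIVERGENT_BRANCHES)
```

With these inputs the PDF of "for.end" is empty and the PDF of "for.body" is just "mid.loop", which matches the sets quoted above, so the only branch left to check is the one in "mid.loop".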


https://reviews.llvm.org/D40547




