[PATCH] D40547: AMDGPU: Fix copying i1 value out of loop with non-uniform exit

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 29 04:05:18 PST 2017


alex-t added a comment.

If I understand everything correctly...
The problem you're trying to solve is well known.
You have a divergent loop exit and a value that is defined uniformly inside the loop but used outside the loop.
In this case different threads can hold different values at the use, because they leave the loop on different iterations.
Traditional divergence analysis cannot handle this: because the definition inside the loop body is uniform, it concludes that the use is uniform as well.
Since the value has no explicit data dependency on the loop index, the PHI node in the loop header (which is divergent if the loop exit is) does not formally affect its divergence.
In fact the value does have a loop-carried dependency. For example:

%tid = call i32 @llvm.amdgcn.workitem.id.x()

for.body:
  %val = add i32 %val, 1                          ; definition of %val is uniform (PHI for %val omitted for brevity)
  %cmp = icmp sgt i32 %tid, %arg1
  br i1 %cmp, label %for.end, label %for.body     ; loop exit condition is divergent

for.end:
  store i32 %val, i32 addrspace(1)* %out          ; threads may hold different %val here
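
To make the effect concrete, here is a small lockstep simulation of the loop above in Python. It is only a sketch under assumptions: run_wavefront, num_lanes and max_iters are hypothetical names, and max_iters is an artificial cap standing in for lanes that never take the exit in the schematic IR.

```python
def run_wavefront(num_lanes, arg1, max_iters):
    """Simulate the lanes of one wavefront executing the loop in lockstep.

    Every active lane performs the same 'val += 1' (the uniform definition),
    but each lane leaves the loop based on its own tid (the divergent exit).
    max_iters is a hypothetical cap so the simulation terminates.
    """
    val = [0] * num_lanes          # per-lane copy of %val
    active = [True] * num_lanes    # execution mask
    for _ in range(max_iters):
        if not any(active):
            break
        for tid in range(num_lanes):
            if not active[tid]:
                continue
            val[tid] += 1          # %val = add i32 %val, 1
            if tid > arg1:         # %cmp: lane-dependent exit condition
                active[tid] = False
    return val


# Value each lane would store in for.end.
print(run_wavefront(num_lanes=4, arg1=1, max_iters=3))
```

Lanes 2 and 3 leave after the first iteration while lanes 0 and 1 keep incrementing, so the value seen in for.end is not the same across lanes even though every individual add was uniform.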

Fortunately, you are interested in one concrete def and one concrete use: AMDGPU::laneDominates(DefInst->getParent(), &MBB)

Could you please look here: https://reviews.llvm.org/D40556

Could you use the same approach?

You have two blocks, defBlock and useBlock, and you want to know:

1. Is useBlock control dependent on defBlock?
2. If 1 is true, is defBlock's terminating branch uniform?

The set of control dependencies for defBlock is its post-dominance frontier set.
The set of control dependencies for useBlock is its post-dominance frontier set.
We need to check the branches that are NOT common to the two sets above.
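
A rough Python sketch of that check, on the CFG of the example above. This is my reading of the idea, not the code in D40556: the function names are hypothetical, the CFG is a plain successor map, and a single exit block is assumed.

```python
def post_dominators(succ, exit_block):
    """Iterative data-flow: pdom(b) = {b} | intersection of pdom(s) over successors s."""
    blocks = set(succ)
    pdom = {b: set(blocks) for b in blocks}
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:
        changed = False
        for b in blocks - {exit_block}:
            new = set(blocks)
            for s in succ[b]:
                new &= pdom[s]
            new |= {b}
            if new != pdom[b]:
                pdom[b] = new
                changed = True
    return pdom


def control_dependences(succ, exit_block):
    """cd[y] = blocks whose branch decides whether y executes.

    y is control dependent on x if y post-dominates some successor of x
    but does not strictly post-dominate x itself.
    """
    pdom = post_dominators(succ, exit_block)
    cd = {b: set() for b in succ}
    for x, targets in succ.items():
        for s in targets:
            for y in pdom[s]:
                strictly_post_dominates_x = y != x and y in pdom[x]
                if not strictly_post_dominates_x:
                    cd[y].add(x)
    return cd


def branches_to_check(succ, exit_block, def_block, use_block):
    """Branches that control only one of the two blocks (symmetric difference)."""
    cd = control_dependences(succ, exit_block)
    return cd[def_block] ^ cd[use_block]


# CFG of the example: entry -> for.body, for.body -> {for.body, for.end}.
cfg = {
    "entry": ["for.body"],
    "for.body": ["for.body", "for.end"],
    "for.end": [],
}
print(branches_to_check(cfg, "for.end", "for.body", "for.end"))
```

For the example this leaves only the branch in for.body, which is exactly the divergent loop exit, so the def in for.body cannot be treated as the same value for all lanes at the use in for.end.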


https://reviews.llvm.org/D40547




