[PATCH] D40547: AMDGPU: Fix copying i1 value out of loop with non-uniform exit

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 26 09:27:05 PST 2018


nhaehnle added a comment.

In https://reviews.llvm.org/D40547#938894, @alex-t wrote:

> If I understand everything correctly...
> The problem you're trying to solve is well known.
> You have a divergent loop exit and a value that is uniformly defined inside the loop but used outside the loop.


More or less. However, whether the value is uniform or not doesn't really make a difference: I can change the test case so that %cc is non-uniform, and the same issue occurs. So this isn't really about DivergenceAnalysis.

> Could you please look here: https://reviews.llvm.org/D40556
> 
> Could you use the same approach?
> 
> You have 2 blocks: defBlock and useBlock and you want to know:
> 
> 1. Is useBlock control dependent on defBlock?
> 2. If 1 is true, is defBlock's termination branch uniform?
>
> The set of control dependencies for defBlock is its post-dominance frontier set.
> The set of control dependencies for useBlock is its post-dominance frontier set.
> We need to check the branches that are NOT common to the two sets above.

I don't think this works, but perhaps I'm misunderstanding you. In the test case which I've added, the defBlock is %for.body, and the useBlock is %for.end.

%for.end post-dominates the entire loop, so its post-dominance frontier is empty.

%for.body post-dominates %entry and %end.loop, so its post-dominance frontier is only %mid.loop.

None of that information seems to help?
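
To make the post-dominance argument concrete, here is a minimal Python sketch that computes post-dominators and post-dominance frontiers for a small CFG. The edge list is an assumption: it is one loop shape consistent with the statements above (a loop header %for.body with an exit, and a second exit from %mid.loop), not a copy of the test case in the patch. The helper names `post_dominators` and `post_dominance_frontier` are hypothetical and are not LLVM APIs.

```python
from typing import Dict, List, Set

# ASSUMED control-flow graph: block name -> successor blocks.
# This shape is only chosen to be consistent with the discussion.
CFG: Dict[str, List[str]] = {
    "entry": ["for.body"],
    "for.body": ["mid.loop", "for.end"],   # loop header with an exit
    "mid.loop": ["end.loop", "for.end"],   # second exit from the loop
    "end.loop": ["for.body"],              # back edge
    "for.end": [],                         # single exit block
}


def post_dominators(cfg: Dict[str, List[str]], exit_block: str) -> Dict[str, Set[str]]:
    """Return, for each block b, the set of blocks that post-dominate b.

    Simple iterative dataflow; assumes every non-exit block has at least
    one successor and that exit_block is the only block without any.
    """
    blocks = set(cfg)
    pdom = {b: set(blocks) for b in blocks}
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:
        changed = False
        for b, succs in cfg.items():
            if b == exit_block:
                continue
            new = {b} | set.intersection(*(pdom[s] for s in succs))
            if new != pdom[b]:
                pdom[b] = new
                changed = True
    return pdom


def post_dominance_frontier(cfg: Dict[str, List[str]],
                            pdom: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Y is in the frontier of X if X post-dominates some successor of Y
    but does not strictly post-dominate Y itself."""
    pdf: Dict[str, Set[str]] = {b: set() for b in cfg}
    for y, succs in cfg.items():
        for x in cfg:
            strictly_post_dominates = x != y and x in pdom[y]
            if not strictly_post_dominates and any(x in pdom[s] for s in succs):
                pdf[x].add(y)
    return pdf


pdom = post_dominators(CFG, "for.end")
pdf = post_dominance_frontier(CFG, pdom)
print(sorted(pdf["for.end"]))   # []
print(sorted(pdf["for.body"]))  # ['mid.loop']
```

Under this assumed CFG, the frontier of %for.end is empty and the frontier of %for.body contains only %mid.loop, matching the two statements above.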


https://reviews.llvm.org/D40547




