[PATCH] D40547: AMDGPU: Fix copying i1 value out of loop with non-uniform exit
Nicolai Hähnle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 26 09:27:05 PST 2018
nhaehnle added a comment.
In https://reviews.llvm.org/D40547#938894, @alex-t wrote:
> If I understand everything correctly...
> The problem you're trying to solve is well known.
> You have divergent loop-exit and a value that is uniformly defined inside the loop but used outside the loop.
More or less. However, whether the value is uniform or not doesn't really make a difference: I can change the test case so that %cc is non-uniform, and the same issue occurs. So this isn't really about DivergenceAnalysis.
> Could you please look here: https://reviews.llvm.org/D40556
>
> Could you use same approach?
>
> You have 2 blocks: defBlock and useBlock and you want to know:
>
> 1. is useBlock control dependent on defBlock?
> 2. if 1 is true, is defBlock's termination branch uniform?
>
> The set of control dependencies for defBlock is its post-dominance frontier set.
> The set of control dependencies for useBlock is its post-dominance frontier set.
> We need to check the branches that are NOT common to the 2 sets above.
I don't think this works, but perhaps I'm misunderstanding you. In the test case which I've added, the defBlock is %for.body, and the useBlock is %for.end.
%for.end post-dominates the entire loop, so its post-dominance frontier is empty.
%for.body post-dominates %entry and %end.loop, so its PDF is only %mid.loop.
None of that information seems to help?
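
To make the post-dominance argument above concrete, here is a minimal sketch in Python. The edge list is an assumption: it is a hypothetical CFG chosen only to be consistent with the statements above (the actual test case's branches are not quoted in this message), and the helper names `post_dominators` and `post_dominance_frontier` are illustrative, not LLVM APIs.

```python
# Hypothetical CFG (assumption): edges chosen to match the description above,
# not copied from the actual test case.
CFG = {
    "entry": ["for.body"],
    "for.body": ["mid.loop"],
    "mid.loop": ["end.loop", "for.end"],  # the loop exit branch
    "end.loop": ["for.body"],             # back edge
    "for.end": [],
}


def post_dominators(cfg, exit_node):
    """Iterative data-flow computation of post-dominator sets."""
    nodes = set(cfg)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            # A node is post-dominated by itself plus whatever
            # post-dominates all of its successors.
            new = {n} | set.intersection(*(pdom[s] for s in cfg[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def post_dominance_frontier(cfg, pdom, x):
    """Blocks Y where x post-dominates a successor of Y but does not
    strictly post-dominate Y itself."""
    frontier = set()
    for y, succs in cfg.items():
        strictly = x != y and x in pdom[y]
        if not strictly and any(x in pdom[s] for s in succs):
            frontier.add(y)
    return frontier


pdom = post_dominators(CFG, "for.end")
pdf_use = post_dominance_frontier(CFG, pdom, "for.end")   # useBlock
pdf_def = post_dominance_frontier(CFG, pdom, "for.body")  # defBlock
print(pdf_use, pdf_def)
```

Under these assumed edges, the PDF of %for.end comes out empty and the PDF of %for.body contains only %mid.loop, matching the reasoning above.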
https://reviews.llvm.org/D40547