[PATCH] D50433: A New Divergence Analysis for LLVM

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 9 07:08:23 PDT 2018


alex-t added a comment.

A few comments based on my experience of implementing the DA for the AMD GPU legacy compiler :)

You handle the divergence induced by divergent branches by mapping each branch to a set of PHIs. In other words: you compute the PHIs that are control dependent on a branch when you encounter a branch that is divergent.
There could be another way. As you know, all BasicBlocks on which a given block B is control dependent belong to B's post-dominance frontier. So, for a given PHI node, we can easily find the set of branches on which this PHI is control dependent.
Also, there is one more observation: the DA itself is the canonical iterative solver over the trivial lattice {divergent, unknown, uniform}. An instruction is divergent immediately if it has a divergent operand: the "bottom" element of the lattice (divergent) joined with any other element produces "bottom" again. So we have a bounded ordered set and a descending function, and as a result a fixed point. Sorry for repeating trivial things, just to explain the idea better...
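To make the lattice concrete, here is a minimal C++ sketch. The names `Div`, `meet` and `meetAll` are made up for illustration, and the position of "unknown" as the optimistic top element (divergent < uniform < unknown) is my assumption; only "divergent is bottom" is stated above.

```cpp
#include <initializer_list>

// Three-point lattice. ASSUMED order: Divergent < Uniform < Unknown,
// i.e. Unknown is the optimistic starting value and Divergent is bottom.
enum class Div { Divergent = 0, Uniform = 1, Unknown = 2 };

// Combining two elements takes the lower one, so Divergent absorbs everything.
constexpr Div meet(Div a, Div b) {
  return static_cast<int>(a) < static_cast<int>(b) ? a : b;
}

// Fold meet over any number of elements; an empty list stays at the top.
inline Div meetAll(std::initializer_list<Div> xs) {
  Div result = Div::Unknown;
  for (Div x : xs)
    result = meet(result, x);
  return result;
}
```

Because a value can only move down this three-element chain, each value changes at most twice, which is what bounds the number of iterations.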

Let's consider the PHI as an operation which has extra operands: its operand set is the usual PHI operands together with the set of all branches on which this PHI is control dependent.
Now we can process the PHI in the usual solving loop like any other instruction, computing its divergence as the minimum over all operands.

Usual op:   D = MIN(Opnd0, Opnd1, ..., OpndN)
PHI:        D = MIN(Opnd0, Opnd1, ..., OpndN, ControlDepBranch0, ControlDepBranch1, ..., ControlDepBranchM)

This algorithm assumes:

1. SSA form
2. We operate on both instructions and operands as Values, and we have a mapping Value => Divergence, i.e. divergence[V] = {1|0}
3. We have the post-dominance frontier sets pre-computed for all BasicBlocks in the Function.
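A rough sketch of the solving loop under these assumptions, in self-contained C++ rather than against the real LLVM API. Everything here is hypothetical: `Value`, `controlBranches`, `divergentSource` and `solveDivergence` are illustration-only names, values are plain indices, and the per-PHI set of controlling branches is assumed to be already derived from the post-dominance frontiers and passed in. I also assume the encoding 1 = uniform, 0 = divergent so that MIN gives the right answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical, simplified SSA value; operands refer to other values by index.
struct Value {
  std::vector<std::size_t> operands;        // ordinary SSA operands
  std::vector<std::size_t> controlBranches; // PHI only: branch conditions it is
                                            // control dependent on (precomputed)
  bool divergentSource = false;             // e.g. a thread-id read
};

// Returns divergence[V] for every value. ASSUMED encoding: 1 = uniform, 0 = divergent.
std::vector<int> solveDivergence(const std::vector<Value> &values) {
  const std::size_t n = values.size();
  std::vector<int> divergence(n, 1); // optimistic start: everything uniform

  // users[v] lists every value that reads v as an operand or controlling branch.
  std::vector<std::vector<std::size_t>> users(n);
  for (std::size_t u = 0; u < n; ++u) {
    for (std::size_t o : values[u].operands) users[o].push_back(u);
    for (std::size_t b : values[u].controlBranches) users[b].push_back(u);
  }

  std::deque<std::size_t> worklist;
  for (std::size_t v = 0; v < n; ++v) {
    if (values[v].divergentSource) {
      divergence[v] = 0;
      worklist.push_back(v);
    }
  }

  while (!worklist.empty()) {
    const std::size_t v = worklist.front();
    worklist.pop_front();
    for (std::size_t u : users[v]) {
      if (divergence[u] == 0) continue; // already at bottom
      // D = MIN(Opnd0..OpndN, ControlDepBranch0..ControlDepBranchM)
      int d = 1;
      for (std::size_t o : values[u].operands) d = std::min(d, divergence[o]);
      for (std::size_t b : values[u].controlBranches) d = std::min(d, divergence[b]);
      if (d < divergence[u]) { // values only ever descend
        divergence[u] = d;
        worklist.push_back(u);
      }
    }
  }
  return divergence;
}
```

With this shape, a PHI whose data operands are all uniform still becomes divergent as soon as one of its controlling branch conditions does, with no special case for PHIs in the loop itself.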

This approach works pretty well in the AMD HSAIL compiler.
Since it employs iterative analysis, it works even if the reversed CFG is irreducible, but it takes more iterations to reach the fixed point.


Repository:
  rL LLVM

https://reviews.llvm.org/D50433




