[LLVMdev] : Predication on SIMD architectures and LLVM
atrick at apple.com
Fri Nov 9 18:50:33 PST 2012
On Oct 31, 2012, at 1:13 PM, Bjorn De Sutter <bjorn.desutter at elis.ugent.be> wrote:
> Hi all,
> I am working on a CGRA backend (something like a 2D VLIW), and we also absolutely need predication. I extended the IfConversion pass to allow it to be executed multiple times and to predicate already predicated code. This is necessary to predicate code with nested conditional statements. At this point, we support or, and, and conditional predicates (see Scott Mahlke's papers on this issue) as supported by the OpenIMPACT compiler (=Trimaran). If anyone is interested, I can show some of the code. It is rather ad-hoc, however, so it is not at all ready for integration in the trunk (I think).
> The problem we are still facing is that this predication works post instruction selection and post register allocation. This is problematic because some of the earlier optimizations such as loop unrolling should ideally be applied on if-converted code, on which it is easier to judge the opportunities for, e.g., modulo scheduling and initiation interval constraints (such as ResMII, RecMII).
> In my view, the ideal would be to have very generic, full (OpenIMPACT-like) predication support throughout LLVM, with the option of enabling/skipping early if-conversion just like one can enable or disable aggressive inlining.
In theory, MachineInstrs can be predicated before register allocation (while still in SSA form), but the machine code will be full of false dependencies, and liveness will not be predicate-aware. (Each predicated instruction would need an implicit use operand for any virtual register it defines.) You would basically lose reaching defs after predication.
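To make the false-dependency point concrete, here is a minimal toy sketch in Python. It is not LLVM code: the `Inst` record, `effective_uses`, and `raw_edges` are hypothetical names invented for illustration, and the model assumes a dependence builder that knows nothing about predicates. A predicated def is treated as also reading its destination, since the old value passes through when the predicate is false.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple, List

@dataclass(frozen=True)
class Inst:
    name: str                    # label used to report dependence edges
    dst: str                     # register defined by this instruction
    srcs: Tuple[str, ...]        # explicit source registers
    pred: Optional[str] = None   # "p" or "!p"; None means unpredicated

def effective_uses(inst: Inst) -> Tuple[str, ...]:
    """Explicit uses, plus the predicate and the implicit use of dst."""
    uses = list(inst.srcs)
    if inst.pred is not None:
        uses.append(inst.pred.lstrip("!"))  # the predicate register is read
        uses.append(inst.dst)               # old value survives if pred is false
    return tuple(uses)

def raw_edges(insts: List[Inst]) -> Set[Tuple[str, str]]:
    """Read-after-write edges from the most recent def of each register."""
    last_def = {}
    edges = set()
    for inst in insts:
        for reg in effective_uses(inst):
            if reg in last_def:
                edges.add((last_def[reg], inst.name))
        last_def[inst.dst] = inst.name
    return edges

# An if-converted diamond: both arms now write the same register v.
code = [
    Inst("cmp", "p", ("a", "b")),
    Inst("then", "v", ("a",), pred="p"),
    Inst("else", "v", ("b",), pred="!p"),
    Inst("use", "r", ("v",)),
]
edges = raw_edges(code)
```

In this model "else" ends up depending on "then" even though the two can never both execute, and "use" sees only "else" as its reaching def. A predicate-aware analysis would be needed to recover the fact that both defs reach "use" independently.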
You could implement a loop unroller for SSA MachineInstrs. It's conceptually similar to early tail duplication.
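As a rough illustration of the cloning and renaming such an unroller would do, here is a toy sketch in Python. It is not based on the MachineInstr API: the tuple encoding of instructions, the `phis` map (header phi register to its back-edge value), and the `unroll` function are all assumptions made for the example, and it ignores the exit branch and trip-count handling entirely.

```python
def unroll(body, phis, factor):
    """Clone a single-block SSA loop body `factor` times.

    body   : list of (dst, op, srcs) in program order
    phis   : maps each header phi register to its back-edge value
    returns: (new_body, new_backedge), where new_backedge maps each phi
             to the value it should receive from the unrolled body
    """
    new_body = []
    # Value that stands in for each phi inside the current copy.
    carried = {phi: phi for phi in phis}
    for i in range(factor):
        rename = dict(carried)
        for dst, op, srcs in body:
            new_dst = f"{dst}.{i}"
            # Loop-invariant registers are not in `rename` and stay as they are.
            new_srcs = tuple(rename.get(s, s) for s in srcs)
            new_body.append((new_dst, op, new_srcs))
            rename[dst] = new_dst
        # The next copy reads this copy's back-edge values instead of the phis.
        carried = {phi: rename.get(back, back) for phi, back in phis.items()}
    return new_body, carried

# i = phi(i0, i.next); s = phi(s0, s.next)
body = [
    ("i.next", "add", ("i", "one")),
    ("s.next", "add", ("s", "i")),
]
phis = {"i": "i.next", "s": "s.next"}
unrolled, backedge = unroll(body, phis, 2)
```

The first copy still reads the header phis, the second copy reads the first copy's renamed results, and the returned map says which values the phis should take on the new back edge. That remapping of virtual registers across cloned instructions is the part that resembles tail duplication.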
"Predicating" at IR level would have to take the form of an analysis that indicates which blocks *will* be predicated. You would have to leave the original CFG and phi nodes intact to preserve control dependence and dataflow. Then you just have the problem of preserving that analysis across any changes to the CFG.
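A minimal sketch of what such an analysis might compute, again as a toy in Python rather than LLVM code. The CFG encoding (a dict from block name to successor list, with every block present as a key, and the first successor taken when the condition is true) and the names `plan_predication` and `single_entry_exit` are assumptions for the example. It only recognizes simple, non-nested triangles and diamonds, and it never modifies the CFG.

```python
def single_entry_exit(block, head, succs, preds):
    """True if `block` is entered only from `head` and has one successor."""
    return preds[block] == [head] and len(succs[block]) == 1

def plan_predication(succs):
    """Return {block: (branching block, condition value)} without touching the CFG."""
    preds = {b: [] for b in succs}
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)
    plan = {}
    for head, ss in succs.items():
        if len(ss) != 2 or ss[0] == ss[1]:
            continue
        t, f = ss  # assumed order: [taken if true, taken if false]
        t_ok = single_entry_exit(t, head, succs, preds)
        f_ok = single_entry_exit(f, head, succs, preds)
        if t_ok and f_ok and succs[t] == succs[f]:
            plan[t] = (head, True)    # diamond: both arms predicated
            plan[f] = (head, False)
        elif t_ok and succs[t] == [f]:
            plan[t] = (head, True)    # triangle: only the true arm
        elif f_ok and succs[f] == [t]:
            plan[f] = (head, False)   # triangle: only the false arm
    return plan

diamond = {"entry": ["then", "else"], "then": ["join"], "else": ["join"], "join": []}
triangle = {"entry": ["then", "join"], "then": ["join"], "join": []}
diamond_plan = plan_predication(diamond)
triangle_plan = plan_predication(triangle)
```

Because the result is only a side table keyed by block, any pass that adds, removes, or retargets edges would have to either update the table or discard it and recompute, which is the preservation problem mentioned above.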