[PATCH] D35267: Pass Divergence Analysis data to selection DAG to drive divergence dependent instruction selection

Thu Dec 21 23:40:08 PST 2017

alex-t added a comment.

To start with, let's make sure that we're agreed on terms.
Divergent machine runs a set of threads (warp or wavefront) that execute same set of instructions in same order (SIMT).
Divergent operation operates on "vector" registers such that each register consists of many lanes - each thread operates on the data in corresponding lane.
>From the above immediately follows that the only source of divergence is thread ID or any data that is derived from thread ID.
Usually it is a small set of target intrinsics that may be the source of such a data.

There are 2 reasons of operation to be divergent:

1. It data-dependent on some divergent operation

  %tid = call i64 get_global_id_x()  // source of divergence
  %1 = add i64 %x, %tid                // data dependence on operand 1
  %2 = shl i64 %1, 16                     // data dependence on operand 0
  %gep = getelementptr i32, i32 addrspace(1) * %array, i64 %2   // data dependence on operand 1
  %val = load i32, %gep                // data dependence on operand 0

2. operation that is uniform itself but is control-dependent on the divergent branch:

  int tid = get_global_id(0)

  if (tid < n) {
    x = 1;               // no data-dependency on any divergent data
  } else {
    x = 2;              // no data-dependency on any divergent data
  }
  y = x + 5;     // threads taking different branch-targets have different "y" value - operation is divergent ( it is vector addition on vector registers )

Since the selection DAG only models data dependency the latter case is out of scope of this discussion.
The DAG is constructed, transformed and selected per block.

>From the above follows that operation in the selection DAG only may be divergent if there is a path in the DAG from some divergent node to the current node.

Initially DAG is constructing by the walk of the IR (SelectionDAGBuilder) and models IR exactly. Thus the divergence property is kept unchanged.

Both DAG peephole optimizations (combiner) and operations/types legalization do not create the new edges in data dependence graph.
I mean that they match the pattern following the existing edges and then change it to some another sub-graph such that all incoming edges of the old subgraph become incoming edges of the new one and same for the outgoing.
Even if several incoming/outgoing are merged together it keeps data flow pattern.

> This is only true if your original computation is correct, and if DAGCombine/Legalization doesn't create any nodes which are naturally divergent.  Neither of those are safe assumptions, I think.  DAGCombine and legalization will transform loads and stores, which could end up creating a naturally divergent node.

So, my question is: could you imagine even theoretical sensible transformation that convert the graph in such a way that uniform node will get divergent income?

> And some divergent nodes will never be passed to SelectionDAGBuilder::setValue when you build the DAG, due to the way SelectionDAGBuilder handles values with illegal types.  But I'm not sure that's a complete list of the issues with the current version, and there's no practical way to check without a verifier.

Event if it creates new DAG pattern it returns it's root that (because of CreateOperands) has correct divergence that will be passed to setValue. Or I did not understand what you meant?

https://reviews.llvm.org/D35267