[PATCH] D17518: [ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv.
Justin Lebar via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 31 14:01:08 PDT 2016
jlebar added a comment.
In http://reviews.llvm.org/D17518#387824, @rnk wrote:
> I'm not sure I understand the comment. This transform is basically turning
> ifs into predicated instructions, right? And we're basically saying that a
> predicated convergent operation (threadsync or what have you) is the same as
> having the operation in a conditional basic block?
Not quite. The simple case of converting
    if (pred) __syncthreads();
  ==>
    (predicated on pred) __syncthreads();
is trivially safe, like I think you're saying. But ifconv can also perform
transformations which duplicate the newly-predicated instruction. The question
is whether it's safe to duplicate a convergent instruction during
if-conversion.
A simple example is
      BB0
     /   \
   BB1   BB2
    |\_ _/
    | | |
    | TBB --> exit
    |
   FBB
TBB forms ifconv's "simple" shape with its predecessors. Can we move TBB's
contents into BB1 and BB2 (as predicated instructions) if TBB contains a
convergent op?
I'd argued in http://reviews.llvm.org/D17430 that this was safe, but now I think it is not. If we
think concretely in terms of CUDA, a necessary condition for correctness is
that
- if all threads in the warp executed TBB "together" (i.e., convergently) before the transformation,
- then all threads in the warp must execute *the same copy* of TBB's instructions after the transformation.
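But clearly that's not necessarily true in this example: threads that reach
TBB through BB1 would execute the copy in BB1, while threads that reach it
through BB2 would execute the copy in BB2.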
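To make that condition concrete, here is a small Python toy model (not LLVM or CUDA code). It assumes each thread is described by two made-up booleans, `takes_bb1` and `cond`, and groups thread IDs by the copy of the convergent op they would execute before and after the duplication. The function names and the keys `"TBB"`, `"copy_in_BB1"` and `"copy_in_BB2"` are hypothetical, chosen only for this sketch.

```python
# Toy model of the CFG above. Each thread is a pair (takes_bb1, cond):
#   takes_bb1: BB0 branches to BB1 (True) or to BB2 (False)
#   cond:      in BB1, branch to TBB (True) or to FBB (False); unused in BB2
# These names are assumptions made for illustration only.


def sync_groups_before(warp):
    """Group thread IDs by convergent-op copy in the original CFG.

    There is exactly one copy, located in TBB.
    """
    groups = {}
    for tid, (takes_bb1, cond) in enumerate(warp):
        reaches_tbb = (takes_bb1 and cond) or (not takes_bb1)
        if reaches_tbb:
            groups.setdefault("TBB", set()).add(tid)
    return groups


def sync_groups_after(warp):
    """Group thread IDs by convergent-op copy after TBB is duplicated.

    One predicated copy lives in BB1, another copy lives in BB2.
    """
    groups = {}
    for tid, (takes_bb1, cond) in enumerate(warp):
        if takes_bb1:
            if cond:
                groups.setdefault("copy_in_BB1", set()).add(tid)
        else:
            groups.setdefault("copy_in_BB2", set()).add(tid)
    return groups


# A four-thread "warp" in which every thread reaches TBB in the original CFG.
warp = [(True, True), (True, True), (False, False), (False, True)]
before = sync_groups_before(warp)
after = sync_groups_after(warp)

for name, tids in list(before.items()) + list(after.items()):
    print(name, sorted(tids))
```

In this model the four threads meet at one op before the transformation but are split across two separate ops afterwards, which is the situation the condition above rules out.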
In terms of the LLVM spec, we say that you're not allowed to "add" a
control-flow dependency to a convergent op. In this case, if BB1 switches on
"cond", then the original CFG dependency expression for TBB is
    (BB1 && cond) || BB2
But after predication, we have two copies of the convergent operation:
    BB1 && cond
    BB2
I think my mistake earlier was concluding from this that we "removed" CFG
dependencies, rather than adding them. But let me write this in an equivalent
way:
    BB1 && cond && !BB2
    BB2 && !BB1
Now it's suddenly clear that we added a CFG dependency to each of these new
instructions.
Put another way, there's a set of control flows which result in us reaching an
instruction. If I add a CFG dependency, I make that set *smaller*. That's
what we're not allowed to do with convergent operations, and that's what we're
doing here.
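The "smaller set" view can be checked mechanically on this example. The following Python sketch (again a toy model, not anything from LLVM) enumerates the three paths through the original CFG and compares the set of paths reaching the original op with the sets reaching each duplicated copy. The `PATHS` list and the `reaching` helper are assumptions introduced for illustration.

```python
# Paths through the original CFG, written as tuples of block names.
# This enumeration is an assumption based on the diagram in this mail.
PATHS = [
    ("BB0", "BB1", "TBB"),  # BB1 && cond
    ("BB0", "BB1", "FBB"),  # BB1 && !cond
    ("BB0", "BB2", "TBB"),  # BB2
]


def reaching(predicate):
    """Return the set of paths on which the given op would execute."""
    return {path for path in PATHS if predicate(path)}


# Original op in TBB: (BB1 && cond) || BB2
original = reaching(lambda p: "TBB" in p)

# Copy hoisted into BB1: BB1 && cond && !BB2
copy_bb1 = reaching(lambda p: "BB1" in p and "TBB" in p and "BB2" not in p)

# Copy hoisted into BB2: BB2 && !BB1
copy_bb2 = reaching(lambda p: "BB2" in p and "BB1" not in p)

print("original:", len(original), "paths")
print("copy in BB1:", len(copy_bb1), "paths")
print("copy in BB2:", len(copy_bb2), "paths")
```

Each copy is reached by a strict subset of the paths that reached the original op, even though the two copies together still cover the same paths.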
@resistor, I'm going to revert this back to what we originally had in http://reviews.llvm.org/D17430.
Reid, thank you for being confused by this. :)
http://reviews.llvm.org/D17518