[llvm-dev] RFC: make calls "convergent" by default
Sameer Sahasrabuddhe via llvm-dev
llvm-dev at lists.llvm.org
Tue Jun 1 04:58:33 PDT 2021
TL;DR
=====
We propose the following changes to LLVM IR in order to better support
operations that are sensitive to the set of threads that execute them
together:
- Redefine "convergent" in terms of thread divergence in a
multi-threaded execution.
- Fix all optimizations that examine the "convergent" attribute to also
depend on divergence analysis. This avoids any impact on CPU
compilation since control flow is always uniform on CPUs.
- Make all function calls "convergent" by default (D69498). Introduce a
new "noconvergent" attribute, and make "convergent" a nop.
- Update the "convergence tokens" proposal to take into account this new
default property (D85603).
Motivation
==========
This effort is necessary because the current "convergent" attribute is
considered under-defined and sorely needs replacement.
1. On GPU targets, the "convergent" attribute is required for
correctness. This is unlike other attributes that are only
used as optimization hints. Missing an attribute should not
result in a miscompilation.
2. The current definition of "convergent" attribute does not precisely
represent the constraints on the compiler for a GPU target. The
actual implementation in LLVM sources is far more conservative than
what the definition says.
3. Due to the same lack of precision, the attribute cannot properly
represent the side-effects of jump threading on a GPU program.
Background
==========
This RFC is a continuation of a discussion split across the following
two reviews. The two reviews compose well to cover all the shortcomings
of the convergent attribute.
D69498: IR: Invert convergent attribute handling
https://reviews.llvm.org/D69498
The above review aims to make all function calls "convergent" by
default, but it received strong opposition due to the requirement that
CPU frontends must now emit a new "noconvergent" attribute on every
function call.
D85603: IR: Add convergence control operand bundle and intrinsics
https://reviews.llvm.org/D85603
The above review defines a "convergent operation" in terms of divergent
control flow in multi-threaded executions. It introduces a "convergence
token" passed as an operand bundle argument at a call, representing the
set of threads that together execute that call. This review has
progressed to the point where there don't seem to be any major
objections to it, but there is some interest in combining it with the
original idea of making all calls convergent by default.
Terms Used
==========
The following definitions are paraphrased from D85603:
Convergent Operation
Some parallel execution environments execute threads in groups that
allow efficient communication within each group. When control flow
diverges, i.e. threads of the same group follow different paths
through the CFG, not all threads of the group may be available to
participate in this communication. A convergent operation involves
inter-thread communication or synchronization that occurs outside of
the memory model, where the set of threads which participate in
communication is implicitly affected by control flow.
Dynamic Instance
Every execution of an LLVM IR instruction occurs in a dynamic instance
of the instruction. Different executions of the same instruction by a
single thread give rise to different dynamic instances of that
instruction. Executions of different instructions always occur in
different dynamic instances. Executions of the same instruction by
different threads may occur in the same dynamic instance. When
executing a convergent operation, the set of threads that execute the
same dynamic instance is the set of threads that communicate with each
other for that operation.
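To make the two definitions above concrete, here is a toy model in plain C++. It is only a sketch and not LLVM code: groupSum, allThreads and threadsOnEvenPath are hypothetical names invented for this illustration, and the "group sum" stands in for any operation whose result depends on which threads execute the same dynamic instance.

```cpp
#include <cassert>
#include <vector>

// Toy convergent operation (hypothetical, for illustration only): every
// participating thread contributes its value and all of them observe the
// same total. "Participants" models the set of threads that execute one
// dynamic instance of the call.
int groupSum(const std::vector<int> &Values,
             const std::vector<int> &Participants) {
  int Sum = 0;
  for (int Tid : Participants)
    Sum += Values[Tid];
  return Sum;
}

// All threads of the group reach a call placed before a divergent branch.
std::vector<int> allThreads(int NumThreads) {
  std::vector<int> Tids;
  for (int Tid = 0; Tid < NumThreads; ++Tid)
    Tids.push_back(Tid);
  return Tids;
}

// Only even-numbered threads reach a call placed under a branch of the
// form `if (tid % 2 == 0)`.
std::vector<int> threadsOnEvenPath(int NumThreads) {
  std::vector<int> Tids;
  for (int Tid = 0; Tid < NumThreads; Tid += 2)
    Tids.push_back(Tid);
  return Tids;
}
```

With per-thread values {1, 2, 3, 4}, the call placed before the branch yields 10 for every thread, while the same call sunk under the branch yields 4 for the even-numbered threads. The instruction is unchanged; only the set of threads sharing its dynamic instance differs.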
Optimization Constraints due to Convergent Calls
================================================
In general, an optimization that modifies control flow in the program
must ensure that the set of threads executing each dynamic instance of a
convergent call is not affected.
By default, every call in LLVM IR is assumed to be convergent. A
frontend may relax this default in the following ways:
1. The "noconvergent" attribute may be added to indicate that a call
is not sensitive to the set of threads executing any dynamic
instance of that call.
2. A "convergencectrl" operand bundle may be passed to the call. The
semantics of such a "token" provide fine-grained control over the
transforms possible near the callsite.
The overall effect is to make the notion of convergence and divergence a
universal property of LLVM IR. This provides a "safe default" in the IR
semantics, so that frontends and optimizations cannot produce incorrect
IR on a GPU target by merely missing an attribute.
At the same time, there is no effect on CPU optimizations. An
optimization may use divergence analysis along with the above
information to determine if a transformation is possible. The only
impact on CPU compilation flows is the addition of divergence analysis
as a dependency when checking for convergent operations. This analysis
is trivial on CPUs where branches do not have divergence and hence all
control flow is uniform.
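The combined rule can be sketched as a small stand-alone C++ model. All of the types below (ToyCall, ToyBranch, ToyDivergenceAnalysis) and the helper mayMoveAcross are hypothetical stand-ins invented for this illustration, not LLVM classes, and the "convergencectrl" token is deliberately left out of the model.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not LLVM classes.
struct ToyCall {
  bool HasNoConvergent = false; // models the proposed opt-out attribute
  // Under the proposal a call is convergent unless marked otherwise.
  bool isConvergent() const { return !HasNoConvergent; }
};

struct ToyBranch {
  bool ConditionVariesAcrossThreads = false;
};

struct ToyDivergenceAnalysis {
  bool TargetHasDivergence = false; // false models a CPU target
  // On a target without divergence, every branch is reported as uniform.
  bool isDivergent(const ToyBranch &Br) const {
    return TargetHasDivergence && Br.ConditionVariesAcrossThreads;
  }
};

// A transform is blocked only for a convergent call at a divergent branch.
bool mayMoveAcross(const ToyCall &Call, const ToyBranch &Br,
                   const ToyDivergenceAnalysis &DA) {
  return !(Call.isConvergent() && DA.isDivergent(Br));
}
```

In this model a call without any attribute is never blocked on the CPU-like target, is blocked at a thread-varying branch on the GPU-like target, and becomes movable again once the opt-out flag is set.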
Implementation
==============
The above proposal will be implemented as follows:
1. Optimizations that check for convergent operations will be updated to
depend on divergence analysis. For example, the following change will
be made in llvm/lib/Transforms/Scalar/Sink.cpp:
   Before:

     bool isSafeToMove(Instruction *Inst) {
       ...
       if (auto *Call = dyn_cast<CallBase>(Inst)) {
         ...
         if (Call->isConvergent())
           return false;
         ...
       }
     }

   After:

     bool isSafeToMove(Instruction *Inst, DivergenceAnalysis &DA, ...) {
       ...
       // don't sink a convergent call across a divergent branch
       if (auto *Call = dyn_cast<CallBase>(Inst)) {
         ...
         auto Term = Inst->getParent()->getTerminator();
         if (Call->isConvergent() && DA.isDivergent(Term))
           return false;
         ...
       }
     }

2. D69498 will be updated so that the convergent property is made
default, but the new requirements on CPU frontends will be retracted.
3. D85603 will be revised to include the new default convergent
property.
Thanks,
Sameer.