[llvm-dev] RFC: make calls "convergent" by default

Tue Jun 1 23:02:13 PDT 2021

CC'ing some more people who got dropped when sending the previous mail.

Sameer.

Sameer Sahasrabuddhe via llvm-dev writes:

> TL;DR
> =====
>
> We propose the following changes to LLVM IR in order to better support
> operations that are sensitive to the set of threads that execute them
> together:
>
> - Redefine "convergent" in terms of thread divergence in a
>   multi-threaded execution.
> - Fix all optimizations that examine the "convergent" attribute to also
>   depend on divergence analysis. This avoids any impact on CPU
>   compilation since control flow is always uniform on CPUs.
> - Make all function calls "convergent" by default (D69498). Introduce a
>   new "noconvergent" attribute, and make "convergent" a nop.
> - Update the "convergence tokens" proposal to take into account this new
>   default property (D85603).
>
> Motivation
> ==========
>
> This effort is necessary because the current "convergent" attribute is
> considered under-defined and sorely needs replacement.
>
> 1. On GPU targets, the "convergent" attribute is required for
>    correctness. This is unlike other attributes, which are used only
>    as optimization hints. A missing attribute should not result in a
>    miscompilation.
>
> 2. The current definition of the "convergent" attribute does not
>    precisely represent the constraints on the compiler for a GPU
>    target. The actual implementation in LLVM sources is far more
>    conservative than what the definition says.
>
> 3. Due to the same lack of precision, the attribute cannot properly
>    represent the side-effects of jump threading on a GPU program.
>
> Background
> ==========
>
> This RFC is a continuation of a discussion split across the following
> two reviews. The two reviews compose well to cover all the shortcomings
> of the convergent attribute.
>
>   D69498: IR: Invert convergent attribute handling
>   https://reviews.llvm.org/D69498
>
> The above review aims to make all function calls "convergent" by
> default, but it received strong opposition due to the requirement that
> CPU frontends must now emit a new "noconvergent" attribute on every
> function call.
>
>   D85603: IR: Add convergence control operand bundle and intrinsics
>   https://reviews.llvm.org/D85603
>
> The above review defines a "convergent operation" in terms of divergent
> control flow in multi-threaded executions. It introduces a "convergence
> token" passed as an operand bundle argument at a call, representing the
> set of threads that together execute that call. This review has
> progressed to the point where there don't seem to be any major
> objections to it, but there is some interest in combining it with the
> original idea of making all calls convergent by default.
>
> Terms Used
> ==========
>
> The following definitions are paraphrased from D85603:
>
> Convergent Operation
>
>   Some parallel execution environments execute threads in groups that
>   allow efficient communication within each group. When control flow
>   diverges, i.e. threads of the same group follow different paths
>   through the CFG, not all threads of the group may be available to
>   participate in this communication. A convergent operation involves
>   inter-thread communication or synchronization that occurs outside of
>   the memory model, where the set of threads which participate in
>   communication is implicitly affected by control flow.
>
> Dynamic Instance
>
>   Every execution of an LLVM IR instruction occurs in a dynamic instance
>   of the instruction. Different executions of the same instruction by a
>   single thread give rise to different dynamic instances of that
>   instruction. Executions of different instructions always occur in
>   different dynamic instances. Executions of the same instruction by
>   different threads may occur in the same dynamic instance. When
>   executing a convergent operation, the set of threads that execute the
>   same dynamic instance is the set of threads that communicate with each
>   other for that operation.
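
To make the two definitions above concrete, here is a toy sketch of one
thread group executing a convergent operation. It is only an
illustration under simplifying assumptions: Thread, groupSum,
sumBeforeBranch and sumAfterSinking are hypothetical names, not LLVM
APIs, and the "operation" is just a sum over the participating threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of one thread group. Hypothetical names, not LLVM APIs.
struct Thread {
  int Value; // per-thread input to the convergent operation
  bool Cond; // per-thread branch condition; may differ across threads
};

// A convergent operation: each participating thread receives the sum of
// the inputs of all threads executing the same dynamic instance.
// Active[I] says whether thread I participates in this instance.
int groupSum(const std::vector<Thread> &Group,
             const std::vector<bool> &Active) {
  assert(Group.size() == Active.size() && "one flag per thread");
  int Sum = 0;
  for (std::size_t I = 0; I < Group.size(); ++I)
    if (Active[I])
      Sum += Group[I].Value;
  return Sum;
}

// Call placed before the branch: every thread of the group executes the
// same dynamic instance.
int sumBeforeBranch(const std::vector<Thread> &Group) {
  return groupSum(Group, std::vector<bool>(Group.size(), true));
}

// Call sunk into the "then" side of the branch: only threads whose
// condition is true reach the call and share its dynamic instance.
int sumAfterSinking(const std::vector<Thread> &Group) {
  std::vector<bool> Active;
  for (const Thread &T : Group)
    Active.push_back(T.Cond);
  return groupSum(Group, Active);
}
```

For a group with values 1, 2, 3, 4 and conditions true, false, true,
false, sumBeforeBranch returns 10 while sumAfterSinking returns 4. When
all conditions agree, the two placements give the same result, which is
why only divergent control flow matters here.
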
>
> Optimization Constraints due to Convergent Calls
> ================================================
>
> In general, an optimization that modifies control flow in the program
> must ensure that the set of threads executing each dynamic instance of a
> convergent call is not affected.
>
> By default, every call in LLVM IR is assumed to be convergent. A
> frontend may relax this default in the following ways:
>
>   1. The "noconvergent" attribute may be added to indicate that a call
>      is not sensitive to the set of threads executing any dynamic
>      instance of that call.
>
>   2. A "convergencectrl" operand bundle may be passed to the call. The
>      semantics of such a "token" provide fine-grained control over the
>      transforms possible near the callsite.
>
> The overall effect is to make the notion of convergence and divergence a
> universal property of LLVM IR. This provides a "safe default" in the IR
> semantics, so that frontends and optimizations cannot produce incorrect
> IR on a GPU target by merely missing an attribute.
>
> At the same time, there is no effect on CPU optimizations. An
> optimization may use divergence analysis along with the above
> information to determine if a transformation is possible. The only
> impact on CPU compilation flows is the addition of divergence analysis
> as a dependency when checking for convergent operations. This analysis
> is trivial on CPUs, where branches never diverge and hence all control
> flow is uniform.
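
The rule described in this section can be sketched as a small
self-contained model. This is an assumption-laden illustration rather
than the proposed implementation: TargetKind, CallInfo, BranchInfo and
mayMoveAcross are hypothetical stand-ins, and the divergence query is
reduced to a single flag.

```cpp
// Hypothetical stand-ins, not LLVM classes.
enum class TargetKind { CPU, GPU };

struct CallInfo {
  bool NoConvergent; // true if the call carries "noconvergent"
};

struct BranchInfo {
  // Whether threads of one group could take different paths here, as a
  // GPU divergence analysis might report it.
  bool Divergent;
};

// Calls are convergent by default; only "noconvergent" opts out.
constexpr bool isConvergent(const CallInfo &Call) {
  return !Call.NoConvergent;
}

// Assumption: on a CPU all control flow is uniform, so the query is
// trivially false regardless of the branch.
constexpr bool isDivergent(TargetKind Target, const BranchInfo &Branch) {
  return Target == TargetKind::GPU && Branch.Divergent;
}

// A transform may move a call across a branch unless the call is
// convergent and the branch is divergent.
constexpr bool mayMoveAcross(TargetKind Target, const CallInfo &Call,
                             const BranchInfo &Branch) {
  return !(isConvergent(Call) && isDivergent(Target, Branch));
}

// CPU: an unannotated (convergent) call is still free to move.
static_assert(mayMoveAcross(TargetKind::CPU, CallInfo{false},
                            BranchInfo{true}),
              "CPU control flow is uniform");
// GPU: the same call is pinned by a divergent branch.
static_assert(!mayMoveAcross(TargetKind::GPU, CallInfo{false},
                             BranchInfo{true}),
              "convergent call must not cross a divergent branch");
```

In this model a frontend that emits no attributes at all loses nothing
on a CPU and gets the conservative behaviour on a GPU, which is the
"safe default" argued for above.
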
>
> Implementation
> ==============
>
> The above proposal will be implemented as follows:
>
> 1. Optimizations that check for convergent operations will be updated to
>    depend on divergence analysis. For example, the following change will
>    be made in llvm/lib/Transforms/Scalar/Sink.cpp:
>
>    Before:
>
>      bool isSafeToMove(Instruction *Inst) {
>          ...
>          if (auto *Call = dyn_cast<CallBase>(Inst)) {
>              ...
>              if (Call->isConvergent())
>                  return false;
>              ...
>          }
>      }
>
>
>    After:
>
>      bool isSafeToMove(Instruction *Inst, DivergenceAnalysis &DA, ...) {
>          ...
>          // don't sink a convergent call across a divergent branch
>          if (auto *Call = dyn_cast<CallBase>(Inst)) {
>              ...
>              auto *Term = Inst->getParent()->getTerminator();
>              if (Call->isConvergent() && DA.isDivergent(Term))
>                  return false;
>              ...
>          }
>      }
>
> 2. D69498 will be updated so that the convergent property is made the
>    default, but the new requirements on CPU frontends will be retracted.
>
> 3. D85603 will be revised to include the new default convergent
>    property.
>
> Thanks,
> Sameer.