[LLVMdev] RFC: Convergent attribute

Chandler Carruth chandlerc at google.com
Wed May 13 15:50:43 PDT 2015


Just so I understand, this only would apply to intrinsic calls? Are there
any instructions that would make sense to carry a flag with the same
semantics?

Generally, I like this a lot. Using control equivalence as the fundamental
model is a very nice and simple rule to follow. As long as the wiggle room
between it and the technical limitations in SPMD models, fantastic.

-Chandler

On Wed, May 13, 2015 at 2:24 PM Owen Anderson <resistor at mac.com> wrote:

> Below is a proposal for a new "convergent" intrinsic attribute and
> MachineInstr property, needed for correctly modeling many SPMD/SIMT
> programming models in LLVM.  Comments and feedback welcome.
>
> —Owen
>
>
>
>
>
> In order to make LLVM more suitable for programming models variously
> called SPMD
> and SIMT, we would like to propose a new intrinsic and MachineInstr
> annotation
> called "convergent", which will be used to impose certain control flow
> and/or
> code motion constraints that are necessary for the correct compilation of
> some
> common constructs in these programming models.
>
> Our proposal strives to define the semantics of these annotations *without*
> introducing a definition of SPMD/SIMT programming models into LLVM IR.
> Rather,
> the properties that must be preserved are specified purely in terms of
> single
> thread semantics.  This allows pass authors to reason about the constraints
> without having to consider alternative programming models.  The downside to
> this approach is that the motivation and necessity of these constraints in
> not
> easily understood without understanding the programming model from which
> they
> derive.
>
> *** WHAT ***
>
> (Thanks to Phil Reames for input on this definition.)
>
> An operation marked convergent may be transformed or moved within the
> program
> if and only the post-transform placement of the convergent operation is
> control equivalent (A dominated B, B post-dominates A, or vice-versa) to
> its original position.
>
> This definition is overly strict with respect to some SPMD/SIMT models,
> but cannot be relaxed without introducing a specific model into LLVM IR. We
> believe it is important for LLVM itself to remain agnostic to any specific
> model.  This allows core passes to preserve correctness for stricter
> models,
> while more relaxed models can implement additional transforms that use
> weaker constraints on top of core LLVM.
>
> *** HOW ***
>
> Once the attribute has been added, we anticipate the following changes to
> optimization passes will be required:
>   - Restrict Sink and MachineSink for convergent operations
>   - Disabling PRE for convergent operations
>   - Disabling jump threading of convergent operations
>   - Auditing SimplifyCFG for additional transforms that break convergent
> guarantees
>
> *** WHY ***
>
> SPMD/SIMT programming models are a family of related programming models in
> which multiple threads execute in a per-instruction lockstep fashion.
> Predication is typically used to implement acyclic control flow that would
> otherwise diverge the PC address of the lockstep threads.
>
> In these models, each thread's register set is typically indepedent, but
> there
> exist a small number of important circumstances in which a thread may
> access
> register storage from one of its lockstep neighbors.  Examples include
> gradient
> computation for texture lookups, as well a cross-thread broadcast and
> shuffle
> operations.
>
> These operations that provide access to another thread's register storage
> pose
> a particular challenge to the compiler, particularly when combined with the
> use of predication for control flow.  Consider the following example:
>
> // texture lookup that computes gradient of r0, last use of r0
> r1 = texture2D(..., r0, ...)
> if (...) {
>   // r0 used as temporary here
>   r0 = ...
>   r2 = r0 + ...
> } else {
>   // only use of r1
>   r2 = r1 + ...
> }
>
> In this example, various optimizations might try to sink the texture2D
> operation
> into the else block, like so:
>
> if (...) {
>   r0 = ...
>   r2 = r0 + ...
> } else {
>   r1 = texture2D(..., r0, ...)
>   r2 = r1 + ...
> }
>
> At this point, it starts to become clear that a problem can occur when two
> neighbor threads want to take different paths through the if-else
> construct.
> Logically, the thread that wishes to execute the texture2D races with its
> neighbor to reads the neighbor's value of r0 before it gets overridden.
>
> In most SPMD/SIMT implementations, the fallout of this races is exposed via
> the predicated expression of acyclic control flow:
>
> pred0 <- cmp ...
> if (pred0)  r0 = ...
> if (pred0)  r2 = r0 + ...
> if (!pred0) r1 = texture2D(..., r0, ...)
> if (!pred0) r2 = r1 + ...
>
> If thread 0 takes the else path and perform the texture2D operation, but
> its neighbor thread 1 takes the then branch, then the texture2D will fail
> because thread 1 has already overwritten its value of r0 before thread 0
> has
> a chance to read it.
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150513/e2554818/attachment.html>


More information about the llvm-dev mailing list