[PATCH] D85603: IR: Add convergence control operand bundle and intrinsics

Sun Apr 11 21:38:49 PDT 2021

sameerds added a comment.

In D85603#2679362 <https://reviews.llvm.org/D85603#2679362>, @Anastasia wrote:

> Sorry for not being clear - I was talking about two separate threads here (1) generalizing convergent attribute to non-uniform CF that is addressed by this patch and (2) inverting convergent attribute that is addressed in https://reviews.llvm.org/D69498. Just to provide more details regarding (2) - right now in clang we have a logic that adds convergent to every single function because when we parse the function we don't know whether it will call any function in a call tree that would use convergent operations. Therefore we need to be conservative to prevent incorrect optimizations but this is not ideal for multiple reasons. The optimiser can undo all or some of those convergent decorations if it can prove they are not needed. And for the uniform CF convergent operations this was the only "broken" functionality to my memory.

I see now. Thanks! Besides goal (1), the other goal for this new formalism is to clarify the meaning of "convergence" in a way that allows more freedom to the optimizer. Language specs typically define convergence with operational semantics, such as:

1. SPIRV: "different invocations of an entry point execute the same dynamic instances of an instruction when they follow the same control-flow path"
2. OpenCL: "all work-items in the work-group must enter the conditional if any work-item in the work-group enters the conditional statement"

The proposed formalism lifts this into a declarative semantics, which is easier for the compiler to reason about. This allows optimizations like jump threading, where the transformed program has ambiguous operational semantics (see the example in the actual spec). The presence of convergence control tokens ensures that the "point of convergence" is well-defined even if the transformed control flow is ambiguous.

> To address this there was an attempt to invert the behavior of convergent attribute in this patch (https://reviews.llvm.org/D69498) then the frontend wouldn't need to generate the attribute everywhere and the optimizer wouldn't need to undo what frontend does. The change in this review doesn't address (2) as far as I can see - it seems it only generalized old convergent semantics to cover the cases with non-uniform CF. I am not clear yet about the details of how and what frontend should generate in IR for this new logic but it looks more complex than before. And if we have to stick to the conservative approach of assuming everything is convergent as it is now this might complicate and slow down the parsing. So I am just checking whether addressing (2) is still feasible with the new approach or it is not a direction we can/should go?

To be honest, I was not aware of this other effort, and even after you pointed it out, I wasn't paying attention to the words that I was reading. It seems like the current spec has so far focussed on demonstrating the soundness of the formalism. But I think it is possible to cover (2), which is to make the default setting conservative. This will need a bit of rewording. In particular, this definition from the spec:

  The convergence control intrinsics described in this document and convergent
  operations that have a ``convergencectrl`` operand bundle are considered
  *controlled* convergent operations.

  Other convergent operations are *uncontrolled*.

This needs to be inverted in the spirit of D69498 <https://reviews.llvm.org/D69498>. I would propose the following tweak:

1. By default, every call has an implicit `convergencectrl` bundle with a token returned by the `@llvm.experimental.convergence.entry` intrinsic from the entry block of the caller. This default is the most conservative setting within the semantics defined here.
2. A more informed frontend or a suitable transformation can replace this conservative token with one of the following:
  1. A token returned by any of the other intrinsics, which provides more specific information about convergence at this callsite.
  2. A predefined constant token (say `none`), which indicates complete freedom. This would be equivalent to the `noconvergent` attribute proposed in D69498 <https://reviews.llvm.org/D69498>.
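
As a rough illustration of the tweak proposed above, the resolution of a call site's token could be modelled as follows. This is only a toy sketch in plain C++ (C++17), not LLVM's API: `TokenKind`, `CallSite`, `effectiveToken` and `isConvergenceConstrained` are hypothetical names invented for this example, and the `none` token is itself only a suggestion at this point.

```cpp
#include <cassert>
#include <optional>

// Toy model of the proposed default. None of these types exist in LLVM.
enum class TokenKind {
  Entry, // token from @llvm.experimental.convergence.entry (proposed implicit default)
  Other, // token from any of the other convergence control intrinsics
  None   // hypothetical constant token `none`: complete freedom
};

struct CallSite {
  // Explicit convergencectrl bundle, if a frontend or transform attached one.
  std::optional<TokenKind> bundle;
};

// A call without an explicit bundle behaves as if it used the entry token.
inline TokenKind effectiveToken(const CallSite &call) {
  return call.bundle.value_or(TokenKind::Entry);
}

// In this model, only the `none` token frees the call from convergence constraints.
inline bool isConvergenceConstrained(const CallSite &call) {
  return effectiveToken(call) != TokenKind::None;
}
```

Under this model, a default-constructed `CallSite{}` resolves to `TokenKind::Entry` and stays constrained, while `CallSite{TokenKind::None}` is the only case that is unconstrained, which mirrors the "bundles relax the default" reading.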

Such a rewording would invert how we approach the spec. Instead of a representation that explicitly talks about special intrinsics that "need" convergence, the new semantics applies to all function calls. The redefined default is conservative instead of free, and the presence of the bundles relaxes the default instead of adding constraints.

Also, to answer one of your comments in the other review (D85609#inline-943432 <https://reviews.llvm.org/D85609#inline-943432>) about the relevance of `llvm.experimental.convergence.anchor`: this intrinsic cannot be inferred by the frontend. It provides a new way to express optimization opportunities like the one demonstrated in the "opportunistic convergence" example. The intrinsic says that the call that uses this token doesn't depend on any specific set of threads, but merely marks the threads that do reach it. This is most useful when multiple calls agree on the same set of threads. Identifying such sets of operations will need help from the user (or more realistically, a library writer). Something like the following might work, where the actual value of `group` doesn't really matter beyond relating the various calls to each other.

  // `group` only ties the calls together; its actual value is irrelevant.
  auto group = non_uniform_group_active_workitems();
  op1(group);
  if (C)
    op2(group);
  op3(group);
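
To make the intent of the snippet concrete, here is a self-contained mock of it in plain C++. Everything in it is an assumption made for illustration: `Group`, `Trace`, `runExample` and the bodies of `op1`/`op2`/`op3` are stand-ins, and the extra `Trace` parameter exists only so that the example can record which group each call saw. It says nothing about how such a group would actually be lowered to a convergence token.

```cpp
#include <cassert>
#include <vector>

// Hypothetical handle; only its identity matters, not its value.
struct Group { int id; };

// Stand-in for the library call: hands out a fresh group on every call.
inline Group non_uniform_group_active_workitems() {
  static int next = 0;
  return Group{next++};
}

// Records the group id seen by each operation, in call order.
using Trace = std::vector<int>;

inline void op1(Group g, Trace &t) { t.push_back(g.id); }
inline void op2(Group g, Trace &t) { t.push_back(g.id); }
inline void op3(Group g, Trace &t) { t.push_back(g.id); }

// Mirrors the snippet: all calls are related through the same `group`.
inline Trace runExample(bool C) {
  Trace trace;
  auto group = non_uniform_group_active_workitems();
  op1(group, trace);
  if (C)
    op2(group, trace);
  op3(group, trace);
  // Every operation that ran agrees on the same group.
  for (int id : trace)
    assert(id == group.id);
  return trace;
}
```

Calling `runExample(true)` records three calls and `runExample(false)` records two, and in both cases every recorded call carries the same group id, which is the only property the original snippet relies on.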

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85603/new/

https://reviews.llvm.org/D85603