[PATCH] D85603: IR: Add convergence control operand bundle and intrinsics

Thu Apr 15 10:51:56 PDT 2021

sameerds added a comment.

In D85603#2691257 <https://reviews.llvm.org/D85603#2691257>, @Anastasia wrote:

>> This needs to be inverted in the spirit of D69498 <https://reviews.llvm.org/D69498>. I would propose the following tweak:
>>
>> 1. By default, every call has an implicit `convergencectrl` bundle with a token returned by the `@llvm.experimental.convergence.entry` intrinsic from the entry block of the caller. This default is the most conservative setting within the semantics defined here.
>> 2. A more informed frontend or a suitable transformation can replace this conservative token with one of the following:
>>   1. A token returned by any of the other intrinsics, which provides more specific information about convergence at this callsite.
>>   2. A predefined constant token (say `none`), which indicates complete freedom. This would be equivalent to the `noconvergent` attribute proposed in D69498 <https://reviews.llvm.org/D69498>.
>>
>> Such a rewording would invert how we approach the spec. Instead of a representation that explicitly talks about special intrinsics that "need" convergence, the new semantics applies to all function calls. The redefined default is conservative instead of free, and the presence of the bundles relaxes the default instead of adding constraints.
>
> Sounds good. If that would be acceptable to the wider community it might help to simplify the frontend design and improve the user interface and the coherence of the interfaces within the compiler stack too.

>From what I understand, there was a fair bit of agreement in D69498 <https://reviews.llvm.org/D69498> about the need to make the default safer. The real question is //should// we fold that idea into this proposal?

There's one mistake in what I outlined above. The first point about default token is expressly forbidden by the static rule on cycles: if an intrinsic other than `.loop` inside a cycle uses a token, then the definition must also be in the same cycle. But I think this can be fixed by simply saying that the dynamic instance of a call without an explicit operand bundle is "undertermined". Any optimization must then back off, the whole point being that an optimization is safe if it preserves dynamic instances, and it is impossible to preserve an undetermined dynamic instance.

> FYI, if we forced early inlining in the LLVM stack, the frontend would not need to mark every function as convergent conservatively but in the Compute scenarios we occasionally have very large functions that when inlined result in huge binaries and longer compilation time. And we also have extern functions too that we have no information of during the compilation. So this doesn't seem like a route we can safely take at least not for all languages.

Or in other words, the "proper" definition must cover the whole of LLVM IR, and not introduce any assumptions like the absence of function calls.

> If we invert the convergent logic then we can add **nocovergent** attribute or even a pragma directive for the application developers to indicate what code doesn't contain cross-threads operations and can be optimized more aggressively.

Dynamic instances provide complete information at the callsite. But having an attribute on a function declaration (especially extern) is useful because it removes the need to analyse the function body itself.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85603/new/

https://reviews.llvm.org/D85603