[PATCH] D147116: [RFC] Introduce convergence control intrinsics

Sameer Sahasrabuddhe via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 16 05:33:18 PDT 2023


sameerds marked 3 inline comments as done.
sameerds added a comment.

In D147116#4426454 <https://reviews.llvm.org/D147116#4426454>, @efriedma wrote:

>   start:
>   @llvm.experimental.convergence.loop()
>   if (!cond) goto end
>   body;
>   goto start;
>   end:
>
> For most cases, making this work doesn't require llvm.experimental.convergence.loop to return a token; the mere existence of a convergent call that executes on every thread every iteration forces the necessary structure on the code.  The problem is that after a "break" or "return" inside, the loop doesn't execute the llvm.experimental.convergence.loop call; to solve this, you make llvm.experimental.convergence.loop return a token, and impose a bunch of rules on the placement of the intrinsic and the usage of the token, so the control flow can be reconstructed.

The problem is not that exited threads don't execute the call to `llvm.experimental.convergence.loop`.  We actually want to allow that, and then identify subsets of threads that executed the loop the same number of times and then broke out. The key part is this missing convergent op in the example:

  start:
    %inner = @llvm.experimental.convergence.loop() [token %outer]
    if (!cond) {
      convergent_op() [%inner]
      goto end
    }
    body;
    goto start;
  end:

The tokens allow us to identify the subsets of threads that will execute convergent_op() "together", on their way out of the loop along the break statement. The token `%outer` defines the set S of threads that entered the loop together, and the token `%inner` identifies the subsets of S that exited "together". The explicit use of `%inner` specifies which threads should communicate at convergent_op(). If the call had used `%outer` as an argument instead, it would have meant that the communication at convergent_op() is "outside" the loop, and all threads that entered the loop should execute it together.
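
To make the grouping concrete, here is a rough Python model of the two choices of token. It is only a sketch of the intended semantics, not anything from the patch: the function names `groups_for_inner` and `groups_for_outer` and the per-thread trip counts are made up for illustration, and "trip count" here is assumed to mean the number of times a thread executed the call to `llvm.experimental.convergence.loop` before taking the break.

```python
from collections import defaultdict

def groups_for_inner(trip_counts):
    """Model of convergent_op() [%inner].

    trip_counts maps a (hypothetical) thread id to the number of times
    that thread executed the loop intrinsic before breaking out.
    Threads with the same count execute convergent_op() together.
    """
    groups = defaultdict(list)
    for tid in sorted(trip_counts):
        groups[trip_counts[tid]].append(tid)
    return dict(groups)

def groups_for_outer(trip_counts):
    """Model of convergent_op() [%outer].

    The op is treated as being outside the loop, so every thread in S
    (all threads that entered the loop together) executes it together.
    """
    return [sorted(trip_counts)]

# Hypothetical set S of four threads that entered the loop together.
S = {0: 2, 1: 3, 2: 2, 3: 1}

print(groups_for_inner(S))  # threads 0 and 2 communicate; 1 and 3 are alone
print(groups_for_outer(S))  # all four threads communicate
```

In this model, using `%inner` splits S into the groups {3}, {0, 2} and {1}, while using `%outer` leaves S as a single group.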

The important fact is that convergent_op() is itself outside the CFG loop, although lexically it looks like it is inside the loop. This distinction is even greater when we replace the "start ... goto start" with a proper high-level //loop statement//.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147116/new/

https://reviews.llvm.org/D147116


