[PATCH] D85603: IR: Add convergence control operand bundle and intrinsics

Vinod Grover via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 5 08:35:22 PST 2020


vgrover99 added a comment.

In D85603#2370634 <https://reviews.llvm.org/D85603#2370634>, @sameerds wrote:

> This correctly covers `__syncthreads()`, and is a bit conservative about builtins that take a mask like `__syncwarp()`. In @jlebar's example, each call to `__syncwarp()` is a separate dynamic instance, although CUDA actually treats them as a single synchronization point.

Yes. In both CUDA and PTX, `__syncwarp` is an unaligned primitive: the threads named in the mask do not all have to reach it through the same call.
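
For readers following along, here is a minimal sketch of the kind of code being discussed. It is not @jlebar's actual example (that is not reproduced in this message); the kernel name, buffer, and branch condition are made up for illustration. It assumes a single warp of 32 threads and a target of sm_70 or newer, where, as I understand the PTX rules for `bar.warp.sync`, the participating threads are not required to execute the same instruction in convergence.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: two textually distinct __syncwarp() calls that are
// meant to act as one synchronization point for the whole warp.
__global__ void divergent_syncwarp(int *out) {
    __shared__ int buf[32];
    unsigned lane = threadIdx.x % 32;

    buf[lane] = static_cast<int>(lane);

    if (lane < 16) {
        __syncwarp();  // call site A, default full mask
    } else {
        __syncwarp();  // call site B, default full mask
    }

    // Each lane reads a slot written by a lane from the other branch, so
    // this is only safe if A and B synchronized with each other.
    out[lane] = buf[31 - lane];
}

int main() {
    int *d_out = nullptr;
    int h_out[32] = {0};

    cudaMalloc(&d_out, sizeof(h_out));
    divergent_syncwarp<<<1, 32>>>(d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    // Report whether every lane observed the value written by its partner.
    bool ok = true;
    for (int i = 0; i < 32; ++i) {
        ok = ok && (h_out[i] == 31 - i);
    }
    std::printf("%s\n", ok ? "match" : "mismatch");
    return 0;
}
```

Under the proposed convergence rules, call sites A and B would be modeled as separate dynamic instances, which is the conservative treatment mentioned in the quoted comment.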


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85603/new/

https://reviews.llvm.org/D85603


