[PATCH] D85603: IR: Add convergence control operand bundle and intrinsics
Vinod Grover via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 5 08:35:22 PST 2020
vgrover99 added a comment.
In D85603#2370634 <https://reviews.llvm.org/D85603#2370634>, @sameerds wrote:
>
> This correctly covers `__syncthreads()`, and is a bit conservative about builtins that take a mask like `__syncwarp()`. In @jlebar's example, each call to `__syncwarp()` is a separate dynamic instance, although CUDA actually treats them as a single synchronization point.
Yes, in both CUDA and PTX, `__syncwarp` is an unaligned primitive.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D85603/new/
https://reviews.llvm.org/D85603