[PATCH] D100124: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX redux.sync instructions

Jon Chesterfield via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Apr 12 15:03:20 PDT 2021


JonChesterfield added a comment.

Interesting. Is this a reduction across the lanes in a warp? If so, it is probably a way to handle the last step of OpenMP reductions.
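
To make the question concrete, here is a rough host-side C++ model of what is being asked about. It is not code from the patch and does not use the new builtins. It assumes that a warp has 32 lanes, that the final step of a reduction is typically a butterfly of xor-shuffles, and that a single warp-wide reduce instruction hands every participating lane the combined value. The names `kWarpSize`, `Warp`, `shuffle_reduce_add` and `redux_add_model` are hypothetical and exist only for this sketch.

```cpp
#include <array>
#include <cstdint>

// Assumption for this sketch: 32 lanes per warp.
constexpr int kWarpSize = 32;
using Warp = std::array<std::uint32_t, kWarpSize>;

// Models the shuffle-based last step: a butterfly of xor-exchanges.
// After log2(32) = 5 rounds every lane holds the sum of all 32 inputs.
Warp shuffle_reduce_add(Warp lanes) {
    for (int offset = kWarpSize / 2; offset > 0; offset /= 2) {
        Warp next = lanes;
        for (int lane = 0; lane < kWarpSize; ++lane) {
            // Each lane adds the value held by its partner lane.
            next[lane] = lanes[lane] + lanes[lane ^ offset];
        }
        lanes = next;
    }
    return lanes;
}

// Models a single warp-wide add reduction: every lane named in `mask`
// receives the sum of the values held by the lanes named in `mask`.
// Lanes outside the mask are left unchanged in this model.
Warp redux_add_model(const Warp& lanes, std::uint32_t mask) {
    std::uint32_t sum = 0;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (mask & (1u << lane)) {
            sum += lanes[lane];
        }
    }
    Warp out = lanes;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (mask & (1u << lane)) {
            out[lane] = sum;
        }
    }
    return out;
}
```

With a full mask the two functions agree on every lane, which is why one warp-wide instruction could stand in for the five shuffle rounds, if the instruction behaves as this model assumes.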


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100124/new/

https://reviews.llvm.org/D100124


