[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive
Artem Belevich via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Jul 20 15:12:09 PDT 2022
tra added a comment.
In D129536#3666860 <https://reviews.llvm.org/D129536#3666860>, @jdoerfert wrote:
> The assertion is arguably not great but doesn't really matter, does it? How would I detect if they are supported?
The latest revision of the patch is fine in this regard. My comment pointing to the compiler crash reproducer was only intended to address the "For me this passes fine" part.
The only remaining thing is the manual `__CUDA_ARCH__` redefinition, which looks suspect to me. Is there any reason not to use `-target-cpu sm_30` instead?
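For illustration, a minimal sketch of what a test relying on `-target-cpu sm_30` might look like. This is not the test from the patch: the RUN line and the preprocessor check are assumptions written in the usual lit style, and the only point is that device-side compilation for `sm_30` should already provide `__CUDA_ARCH__` without a manual definition.

```cuda
// Hypothetical lit test, not taken from D129536.
// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fcuda-is-device \
// RUN:   -target-cpu sm_30 -fsyntax-only -verify %s
// expected-no-diagnostics

// When compiling for the device with -target-cpu sm_30, the compiler is
// expected to predefine __CUDA_ARCH__ itself, so the test does not need
// to #undef and #define it by hand.
#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ < 300
#error "expected __CUDA_ARCH__ >= 300 when targeting sm_30"
#endif
```

If the `#error` fires, `-verify` reports an unexpected diagnostic and the test fails, which would show that the target flag alone is not enough for the case at hand.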
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129536/new/
https://reviews.llvm.org/D129536