[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive
Johannes Doerfert via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Jul 20 15:37:30 PDT 2022
jdoerfert added a comment.
In D129536#3666884 <https://reviews.llvm.org/D129536#3666884>, @tra wrote:
> In D129536#3666860 <https://reviews.llvm.org/D129536#3666860>, @jdoerfert wrote:
>
>> The assertion is arguably not great but doesn't really matter, does it? How would I detect if they are supported?
>
> The latest revision of the patch is fine in this regard. My comment pointing to the compiler crash reproducer was only intended to address the "For me this passes fine" part.
>
> The only remaining thing is the manual `__CUDA_ARCH__` redefinition, which looks suspect to me. Is there any reason not to use `-target-cpu sm_30` instead?
No reason. Changed.
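
For readers following along, here is a rough sketch of what that change amounts to in a lit-style test. It is not the test from this patch: the RUN line and the preprocessor check are illustrative assumptions. The point is only that `-target-cpu sm_30` lets the compiler define `__CUDA_ARCH__` itself during device compilation, so the test does not have to redefine the macro by hand.

```cuda
// Hypothetical test file (illustration only, not the file in D129536).
// -fcuda-is-device selects the device side of the compilation, and
// -target-cpu sm_30 makes the compiler predefine __CUDA_ARCH__ for sm_30.
// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fcuda-is-device \
// RUN:   -target-cpu sm_30 -fsyntax-only %s

// No "#undef __CUDA_ARCH__" / "#define __CUDA_ARCH__ 300" is needed here;
// the value comes from the selected target CPU.
#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ < 300
#error "expected __CUDA_ARCH__ >= 300 when targeting sm_30"
#endif
```

Selecting the architecture through the target CPU keeps the macro consistent with what the backend actually targets, whereas a manual redefinition can drift from it.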
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129536/new/
https://reviews.llvm.org/D129536