[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

Johannes Doerfert via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jul 20 15:37:30 PDT 2022


jdoerfert added a comment.

In D129536#3666884 <https://reviews.llvm.org/D129536#3666884>, @tra wrote:

> In D129536#3666860 <https://reviews.llvm.org/D129536#3666860>, @jdoerfert wrote:
>
>> The assertion is arguably not great but doesn't really matter, does it? How would I detect if they are supported?
>
> The latest revision of the patch is fine in this regard. My comment pointing to the compiler crash reproducer was only intended to address the "For me this passes fine" part.
>
> The only remaining thing is the manual `__CUDA_ARCH__` redefinition, which looks suspect to me. Is there any reason not to use `-target-cpu sm_30` instead?

No reason. Changed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129536/new/

https://reviews.llvm.org/D129536


