[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive
Artem Belevich via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Jul 12 12:27:03 PDT 2022
tra added a comment.
Oops. Thank you for fixing this.
================
Comment at: clang/test/CodeGenCUDA/shuffle_long_long.cu:52
+ long long ll = 17;
+ ull = __shfl(ull, 7, 32);
+ ll = __shfl(ll, 7, 32);
----------------
This crashes LLVM when we target sm_70, where these instructions no longer exist. We should probably disable those sync wrappers when we compile for GPUs where they are not available, so that we get a proper compiler error instead of a crash.
Also, we should probably make the use of non-sync instructions conditional on SYNC. https://godbolt.org/z/7n4vsb41v
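A rough sketch of the kind of guard suggested above, assuming the wrappers in question are the non-sync `__shfl` overloads exercised by the test and that sm_70 (`__CUDA_ARCH__ >= 700`) is the cutoff. All `DEMO_*` / `demo_*` names are hypothetical and are not part of clang's CUDA wrapper headers; the snippet uses only the preprocessor and `constexpr`, so it builds as plain C++ as well as in a `.cu` file.

```cpp
// Hypothetical sketch; DEMO_* / demo_* names are illustrative only and do
// not exist in clang's CUDA headers.

// Assumption taken from the review comment: the non-sync shuffle
// instructions are not available when targeting sm_70 and newer.
constexpr bool demo_legacy_shfl_available(int cuda_arch) {
  return cuda_arch < 700;
}

// __CUDA_ARCH__ is only defined during device-side compilation.
// Treat the host pass (or a plain C++ build) as architecture 0.
#if defined(__CUDA_ARCH__)
#define DEMO_CUDA_ARCH __CUDA_ARCH__
#else
#define DEMO_CUDA_ARCH 0
#endif

#if DEMO_CUDA_ARCH < 700
// The non-sync wrappers would be declared only inside this branch, so a
// call such as __shfl(ll, 7, 32) on sm_70+ fails at compile time with an
// "undeclared identifier"-style diagnostic instead of reaching the backend.
#define DEMO_HAS_LEGACY_SHFL 1
#else
#define DEMO_HAS_LEGACY_SHFL 0
#endif
```

With a guard like this, the test could also key its non-sync calls off the same condition, which would fit with making non-sync use conditional in the test.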
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129536/new/
https://reviews.llvm.org/D129536