[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jul 12 12:27:03 PDT 2022


tra added a comment.

Oops. Thank you for fixing this.



================
Comment at: clang/test/CodeGenCUDA/shuffle_long_long.cu:52
+  long long ll = 17;
+  ull = __shfl(ull, 7, 32);
+  ll = __shfl(ll, 7, 32);
----------------
This crashes LLVM when we target sm_70, where these instructions no longer exist. We should probably disable those sync wrappers when we compile for GPUs where they are not available, so we'd get a proper compiler error instead of a crash.

Also, we should probably make the use of non-sync instructions conditional on SYNC. https://godbolt.org/z/7n4vsb41v
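
A rough sketch of what gating the wrappers on the target GPU could look like, to illustrate the first suggestion above. This is not the code in clang's CUDA headers: the macro `SKETCH_HAS_NONSYNC_SHFL` and the function `sketch_shfl` are hypothetical names, the function body is a stand-in rather than a real shuffle, and the `700` threshold is only assumed from the sm_70 case mentioned in the comment. The one thing taken as given is that `__CUDA_ARCH__` is defined only during device-side compilation.

```cpp
// Hypothetical sketch; names and threshold are assumptions, not clang's headers.
// __CUDA_ARCH__ is defined only in device-side compilation (e.g. 700 for sm_70),
// so host-side compilation takes the first branch.
#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ < 700
#define SKETCH_HAS_NONSYNC_SHFL 1
#else
#define SKETCH_HAS_NONSYNC_SHFL 0
#endif

#if SKETCH_HAS_NONSYNC_SHFL
// Stand-in for a non-sync shuffle wrapper. A real wrapper would call the
// shuffle builtin; this one just returns its input so the sketch stays
// compilable as plain C++.
inline long long sketch_shfl(long long val, int src_lane, int width) {
  (void)src_lane;
  (void)width;
  return val;
}
#endif

// When SKETCH_HAS_NONSYNC_SHFL is 0, sketch_shfl is never declared, so a call
// to it is rejected by the front end as an undeclared identifier instead of
// reaching instruction selection.
```

With this shape, code that calls the wrapper while targeting an architecture above the threshold would fail at the call site with an ordinary diagnostic, which is the behavior the comment asks for.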


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129536/new/

https://reviews.llvm.org/D129536


