[Openmp-commits] [PATCH] D94731: [libomptarget][nvptx] Reduce calls to cuda header
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Jan 14 18:14:37 PST 2021
JonChesterfield added a comment.
The remaining calls could be replaced with builtins by duplicating parts
of the cuda wrapper infrastructure. That would mean choosing which builtin
to call based on architecture number (>= sm_70) instead of CUDA_VERSION.
This would yield a deviceRTL that is independent of cuda. However, mixed
cuda + openmp code would then use CUDA_VERSION to choose intrinsics in
some places and the architecture number in others, which seems likely to
cause problems.
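
To make the concern concrete, here is a minimal host-side sketch of the two
selection schemes. It is illustration only and not code from the patch: the
names (Variant, pickByCudaVersion, pickByArch, schemesAgree) are hypothetical,
the sm_70 threshold is taken from the comment above, and the CUDA_VERSION
threshold of 9000 is an assumption. The real code would make these choices in
the preprocessor; plain constexpr functions are used here so the mismatch can
be checked without a CUDA toolchain.

```cpp
// Hypothetical sketch: none of these names appear in the patch.
// Which flavour of an intrinsic gets called.
enum class Variant { Legacy, Sync };

// Thresholds for illustration only.
constexpr int kSyncCudaVersion = 9000; // assumed cutoff for CUDA_VERSION
constexpr int kSyncArch = 700;         // sm_70, as mentioned in the comment

// Scheme 1: select on the toolkit version (CUDA_VERSION style).
constexpr Variant pickByCudaVersion(int cudaVersion) {
  return cudaVersion >= kSyncCudaVersion ? Variant::Sync : Variant::Legacy;
}

// Scheme 2: select on the target architecture number (sm_XX style).
constexpr Variant pickByArch(int arch) {
  return arch >= kSyncArch ? Variant::Sync : Variant::Legacy;
}

// True when both schemes would pick the same variant for one build.
constexpr bool schemesAgree(int cudaVersion, int arch) {
  return pickByCudaVersion(cudaVersion) == pickByArch(arch);
}

// A recent toolkit targeting sm_70: both schemes pick the sync variant.
static_assert(schemesAgree(10010, 700), "expected agreement on sm_70");

// The same toolkit targeting sm_60: the version check picks Sync, the
// architecture check picks Legacy, so mixed code would disagree.
static_assert(!schemesAgree(10010, 600), "expected disagreement on sm_60");
```

Under these assumed thresholds, any build with a new toolkit and a pre-sm_70
target lands in the second case, which is the kind of mixed selection the
comment is worried about.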
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94731/new/
https://reviews.llvm.org/D94731