[Openmp-commits] [PATCH] D94731: [libomptarget][nvptx] Reduce calls to cuda header

Thu Jan 14 18:14:37 PST 2021

JonChesterfield added a comment.

The remaining calls could be replaced with builtins by duplicating parts
of the CUDA wrapper infrastructure. That would mean choosing which
builtin to call based on the architecture number (>= sm_70) instead of
CUDA_VERSION.

That would yield a deviceRTL that is independent of CUDA. However,
mixed CUDA + OpenMP code would then use CUDA_VERSION to choose
intrinsics in some places and the architecture number in others,
which seems likely to cause problems.
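
To make the concern concrete, here is a rough host-side sketch of the two
selection rules. It is not the deviceRTL source: every name is hypothetical,
and the CUDA_VERSION threshold of 9000 is an assumption for illustration.
Only the >= sm_70 cut-off comes from the comment above.

```cpp
#include <cassert>

// Hypothetical model: which flavour of an intrinsic gets selected.
enum class Variant { Legacy, Sync };

// Rule keyed on the toolkit version, as when going through the cuda headers.
// The 9000 threshold is an assumption made for this sketch.
constexpr Variant pickByCudaVersion(int cudaVersion) {
  return cudaVersion >= 9000 ? Variant::Sync : Variant::Legacy;
}

// Rule keyed on the architecture number (>= sm_70), as proposed for
// calling builtins directly.
constexpr Variant pickByArch(int smArch) {
  return smArch >= 70 ? Variant::Sync : Variant::Legacy;
}

// True when both rules select the same variant for one compilation.
constexpr bool rulesAgree(int cudaVersion, int smArch) {
  return pickByCudaVersion(cudaVersion) == pickByArch(smArch);
}

// Runtime check of two sample configurations (10010 stands for CUDA 10.1).
inline void checkExamples() {
  assert(rulesAgree(10010, 70));   // recent toolkit, sm_70: same choice
  assert(!rulesAgree(10010, 60));  // recent toolkit, sm_60: choices differ
}
```

Under these assumptions, a recent toolkit targeting sm_60 would select the
sync variant in code keyed on CUDA_VERSION and the legacy variant in code
keyed on architecture, which is the kind of mismatch described above.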

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94731/new/

https://reviews.llvm.org/D94731