[PATCH] D112053: [cuda] Add address space predicate functions.

Michael Liao via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 19 13:23:31 PDT 2021


hliao added inline comments.


================
Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:691-694
+BUILTIN(__nvvm_isspacep_const, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_global, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_local, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_shared, "bvC*", "nc")
----------------
hliao wrote:
> tra wrote:
> > CUDA appears to be using `__nv_isGlobal_impl` for the AS predicates. Perhaps we want to add those, too, forwarding them to the `__nvvm_...` implementations above. I've already added a few other AS-related `__nv_*` builtins in `lib/Headers/__clang_cuda_intrinsics.h`.
> `__nv_isGlobal_impl` is exposed as an official interface. In fact, in CUDA SDK 10.0 or earlier, `__isGlobal` is directly implemented as inline asm. If possible, we should avoid defining unofficial or undocumented interfaces. `__nv_isGlobal_impl` was introduced from CUDA SDK 10.1 but there is no documentation on it.
> 
>   // This function returns 1 if generic address "ptr" is in global memory space.
>   // It returns 0 if "ptr" is in shared, local or constant memory space.
>   __SM_20_INTRINSICS_DECL__ unsigned int __isGlobal(const void *ptr)
>   {
>     unsigned int ret;
>     asm volatile ("{ \n\t"
>                   "    .reg .pred p; \n\t"
>                   "    isspacep.global p, %1; \n\t"
>                   "    selp.u32 %0, 1, 0, p;  \n\t"
>   #if (defined(_MSC_VER) && defined(_WIN64)) || defined(__LP64__) || defined(__CUDACC_RTC__)
>                   "} \n\t" : "=r"(ret) : "l"(ptr));
>   #else
>                   "} \n\t" : "=r"(ret) : "r"(ptr));
>   #endif
> 
>     return ret;
>   }
> 
Typo: `__nv_isGlobal_impl` is *not* exposed as an official interface.
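
To make the alternative concrete, here is a rough sketch of how a header could define a documented-style predicate directly on top of the `__nvvm_isspacep_global` builtin from this patch, without going through `__nv_isGlobal_impl`. The name `sketch_isGlobal` is hypothetical and is not part of the patch or of the CUDA SDK. The non-CUDA branch is only a stub so the snippet also builds as plain C++; it is not meant to model device behaviour.

```cpp
// Hypothetical wrapper for illustration only; not part of D112053.
#if defined(__clang__) && defined(__CUDA__)
// CUDA compilation with clang: forward to the builtin added by this patch.
// The builtin yields a bool; the documented CUDA predicate returns
// unsigned int (1 or 0), so convert explicitly.
static __device__ inline unsigned int sketch_isGlobal(const void *ptr) {
  return __nvvm_isspacep_global(ptr) ? 1u : 0u;
}
#else
// Plain C++ build: there is no generic address space to query, so this
// placeholder always returns 0. It exists only to keep the sketch
// compilable outside of CUDA.
static inline unsigned int sketch_isGlobal(const void *ptr) {
  (void)ptr;
  return 0u;
}
#endif
```

The same pattern would presumably apply to the `const`, `local` and `shared` builtins listed above.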





Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112053/new/

https://reviews.llvm.org/D112053


