[PATCH] D112053: [cuda] Add address space predicate funuctions.

Tue Oct 19 13:35:22 PDT 2021

tra added inline comments.

================
Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:691-694
+BUILTIN(__nvvm_isspacep_const, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_global, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_local, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_shared, "bvC*", "nc")
----------------
hliao wrote:
> hliao wrote:
> > tra wrote:
> > > CUDA appears to be using `__nv_isGlobal_impl` for the AS predicates. Perhaps we want to add those, too, forwarding them to the `__nvvm_...` implementations above. I've already added a few other AS-related `__nv_*` builtins in `lib/Headers/__clang_cuda_intrinsics.h`.
> > `__nv_isGlobal_impl` is exposed as an official interface. In fact, in CUDA SDK 10.0 or earlier, `__isGlobal` is directly implemented as inline asm. If possible, we should avoid defining unofficial or undocumented interfaces. `__nv_isGlobal_impl` was introduced from CUDA SDK 10.1 but there is no documentation on it.
> > 
> >   // This function returns 1 if generic address "ptr" is in global memory space.
> >   // It returns 0 if "ptr" is in shared, local or constant memory space.
> >   __SM_20_INTRINSICS_DECL__ unsigned int __isGlobal(const void *ptr)
> >   {
> >     unsigned int ret;
> >     asm volatile ("{ \n\t"
> >                   "    .reg .pred p; \n\t"
> >                   "    isspacep.global p, %1; \n\t"
> >                   "    selp.u32 %0, 1, 0, p;  \n\t"
> >   #if (defined(_MSC_VER) && defined(_WIN64)) || defined(__LP64__) || defined(__CUDACC_RTC__)
> >                   "} \n\t" : "=r"(ret) : "l"(ptr));
> >   #else
> >                   "} \n\t" : "=r"(ret) : "r"(ptr));
> >   #endif
> > 
> >     return ret;
> >   }
> > 
> typo, `__nv_isGlobal_impl` is *not* exposed as an official interface.
> 
> > `__nv_isGlobal_impl` is exposed as an official interface. In fact, in CUDA SDK 10.0 or earlier, `__isGlobal` is directly implemented as inline asm. If possible, we should avoid defining unofficial or undocumented interfaces. `__nv_isGlobal_impl` was introduced from CUDA SDK 10.1 but there is no documentation on it.
> > 
> >   // This function returns 1 if generic address "ptr" is in global memory space.
> >   // It returns 0 if "ptr" is in shared, local or constant memory space.
> >   __SM_20_INTRINSICS_DECL__ unsigned int __isGlobal(const void *ptr)
> >   {
> >     unsigned int ret;
> >     asm volatile ("{ \n\t"
> >                   "    .reg .pred p; \n\t"
> >                   "    isspacep.global p, %1; \n\t"
> >                   "    selp.u32 %0, 1, 0, p;  \n\t"
> >   #if (defined(_MSC_VER) && defined(_WIN64)) || defined(__LP64__) || defined(__CUDACC_RTC__)
> >                   "} \n\t" : "=r"(ret) : "l"(ptr));
> >   #else
> >                   "} \n\t" : "=r"(ret) : "r"(ptr));
> >   #endif
> > 
> >     return ret;
> >   }
> > 
> 
> 
> `__nv_isGlobal_impl` is *not* exposed as an official interface.
> If possible, we should avoid defining unofficial or undocumented interfaces. __nv_isGlobal_impl was introduced from CUDA SDK 10.1 but there is no documentation on it.

In general, I agree. I just wish NVIDIA would stop using undocumented APIs in the public headers they ship. By necessity, clang either has to rely on preprocessor hacks to edit out uncompileable code, or guess what the undocumented APIs do and implement them.

In this case I do not think `__nv_is*_impl`  are used anywhere other than in the functions you have renamed, so we're fine without them. They would be needed if the patch didn't have to rename `__isGlobal` and friends.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112053/new/

https://reviews.llvm.org/D112053