[PATCH] [CUDA] Support for built-in cuda variables (threadIdx and its friends).

Richard Smith richard at metafoo.co.uk
Mon Apr 20 17:52:11 PDT 2015


LGTM


================
Comment at: lib/Headers/cuda/cuda_builtin_vars.h:89
@@ +88,3 @@
+// architectures have a WARP_SZ value of 32'.
+__CUDA_BUILTIN_VAR int warpSize = 32;
+
----------------
tra wrote:
> rsmith wrote:
> > tra wrote:
> > > rsmith wrote:
> > > > This is a (strong) definition due to the `extern` in the `__CUDA_BUILTIN_VAR` macro, and will cause a link error if this file is `#include`d into multiple objects in the same binary. You could solve this by removing the `extern` (that'd give the variable internal linkage) or by using `enum { warpSize = 32 };` or similar. I'm not sure exactly what the CUDA spec requires here.
> > > Those have to be extern because no instances of those special variables can ever exist, yet they are referred to as if they do. 'extern' matches this functionality best. I've added a weak attribute to make linking work. That is, if/when we'll get to do any linking on the GPU code. Currently we produce PTX assembly and we don't ever link one PTX file with another.
> > > 
> > > I've reworked warpSize into a class with the same restrictions on construction and access as the other built-in variables.
> > > 
> > > As for CUDA spec, it does not say much -- 'They [built-in variables] are only valid within functions that are executed on the device. [warpSize] is of type int and contains the warp size in threads.'
> > > 
> > > 
> > I was only talking about `warpSize`, not the others. The others seem OK as (non-weak) `extern` declarations, on the assumption that programs aren't actually allowed to use the address of these.
> > 
> > Your new approach doesn't satisfy the requirement that "`warpSize` is of type `int`": this is detectable through template argument deduction, for instance:
> > 
> >   template<typename T> T max(T, T)
> >   max(warpSize, 10); // error, T is ambiguous, could be int or __cuda_builtin_warpSize_t
> > 
> > Using
> > 
> >   __attribute__((device)) const int warpSize = 32;
> > 
> > would probably work (that is, drop the `extern` from what you had before).
> Another downside of warpSize-in-a-class was that a reference to an instance of that class was required for the type conversion operator as it can't be a static member function. Scratch that.
> 
> Dropping extern is not sufficient. 'const int' suffers from the same problem as 'extern' -- we end up with warpSize visible externally which causes linking error.
> 
> I could use 'static const int' -- it keeps the symbol local and does not create linking issues. Unfortunately it allows one to take address of the variable. It's not perfect, but we can live with that for now.
> 
> 
> 
> 
In C++, `const` on a global variable definition (that is not explicitly marked `extern`) implies internal linkage just like `static` does -- the `static` is harmless but redundant.

http://reviews.llvm.org/D9064

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the cfe-commits mailing list