[cfe-dev] [CUDA] Why "Disallow 'extern __shared__' variables"?
Hal Finkel via cfe-dev
cfe-dev at lists.llvm.org
Fri Oct 28 10:13:08 PDT 2016
[+Samuel]
----- Original Message -----
> From: "Arpith C Jacob via cfe-dev" <cfe-dev at lists.llvm.org>
> To: "Justin Lebar" <jlebar at google.com>
> Cc: "cfe-dev" <cfe-dev at lists.llvm.org>
> Sent: Friday, October 28, 2016 11:58:17 AM
> Subject: Re: [cfe-dev] [CUDA] Why "Disallow 'extern __shared__' variables"?
For some additional context, see also:
https://llvm.org/bugs/show_bug.cgi?id=30812#c2
https://llvm.org/bugs/show_bug.cgi?id=26343
>
> Hi Justin,
>
> Thanks for your response.
>
> I am using a mix of our OpenMP nvptx toolchain for OpenMP-gpu
> programs and Clang-Cuda for the OpenMP runtime that we've written in
> Cuda. This may be the source of some of your surprises.
>
> I translate the Cuda code to LLVM IR and pull it into the user's GPU
> program (with -mlink-cuda-bitcode, similar to how you pull in
> libdevice.compute.bc). We then use our toolchain to build
> relocatable objects with ptxas. I'll be happy to talk more about our
> use case and how we can make the improvements you suggest.
>
> > Given that "extern __shared__" means "get me a pointer to the
> > dynamically-allocated shared memory for this kernel," using a
> > non-array / non-pointer type would be...odd?
> >
>
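For readers unfamiliar with the idiom Justin describes, here is a minimal sketch of the dynamically allocated form: an unsized extern __shared__ array whose size is supplied at kernel launch. The kernel name reverse_block and the host helper reverse_on_device are hypothetical, made up for illustration, and the sketch assumes n fits in a single block.

```cuda
#include <cuda_runtime.h>

// Dynamically sized shared memory: declared as an unsized extern array.
// The size comes from the third kernel launch parameter, not from the
// declaration itself.
extern __shared__ float scratch[];

// Hypothetical kernel: each block reverses its own blockDim.x elements.
__global__ void reverse_block(float *data) {
  unsigned i = threadIdx.x;
  unsigned base = blockIdx.x * blockDim.x;
  scratch[i] = data[base + i];
  __syncthreads();  // all loads into shared memory must finish first
  data[base + i] = scratch[blockDim.x - 1 - i];
}

// Hypothetical host helper: reverses n floats in place using one block.
// Assumes n is small enough for one block (and its shared memory).
cudaError_t reverse_on_device(float *host, int n) {
  float *dev = nullptr;
  size_t bytes = n * sizeof(float);
  cudaError_t err = cudaMalloc(&dev, bytes);
  if (err != cudaSuccess) return err;
  err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
  if (err == cudaSuccess) {
    // Third launch parameter: bytes of dynamic shared memory per block.
    reverse_block<<<1, n, bytes>>>(dev);
    err = cudaGetLastError();
  }
  if (err == cudaSuccess)
    err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
  cudaFree(dev);
  return err;
}
```

Nothing in this form needs a definition elsewhere, which is why it works in whole-program mode. The separate-compilation case Arpith raises below is different: there the extern declaration refers to a variable defined in another translation unit.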
> I believe the difference is whether the cuda code is being compiled
> in whole-program or separate compilation modes. The following
> section covers the case I described for separate compilation mode,
> which is what I'm doing:
> https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#device-memory-qualifiers
>
> "When compiling in the separate compilation mode (see the nvcc user
> manual for a description of this mode), __device__, __shared__, and
> __constant__ variables can be defined as external using the extern
> keyword. nvlink will generate an error when it cannot find a
> definition for an external variable (unless it is a dynamically
> allocated __shared__ variable)."
>
> Can we add a flag in Clang-Cuda to indicate separate compilation
> mode?
I'd definitely like to see this happen. I have users for whom this capability is important.
-Hal
>
> Could you point me to patches/code that I can look at to understand
> the implications of separate compilation? What LLVM optimizations
> benefit from whole-program compilation mode? What, if anything,
> breaks if I use it for separate compilation?
>
> Thanks in advance for your time.
>
> Arpith
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory