[cfe-dev] [CUDA] Why "Disallow 'extern __shared__' variables"?

Hal Finkel via cfe-dev cfe-dev at lists.llvm.org
Fri Oct 28 10:13:08 PDT 2016


[+Samuel]

----- Original Message -----
> From: "Arpith C Jacob via cfe-dev" <cfe-dev at lists.llvm.org>
> To: "Justin Lebar" <jlebar at google.com>
> Cc: "cfe-dev" <cfe-dev at lists.llvm.org>
> Sent: Friday, October 28, 2016 11:58:17 AM
> Subject: Re: [cfe-dev] [CUDA] Why "Disallow 'extern __shared__' variables"?

For some additional context, see also:

  https://llvm.org/bugs/show_bug.cgi?id=30812#c2
  https://llvm.org/bugs/show_bug.cgi?id=26343

> 
> Hi Justin,
> 
> Thanks for your response.
> 
> I am using a mix of our OpenMP nvptx toolchain for OpenMP-gpu
> programs and Clang-Cuda for the OpenMP runtime that we've written in
> Cuda. This may be the source of some of your surprises.
> 
> I translate the Cuda code to LLVM IR and pull it into the user's GPU
> program (with -mlink-cuda-bitcode, similar to how you pull in
> libdevice.compute.bc). We then use our toolchain to build
> relocatable objects with ptxas. I'll be happy to talk more about our
> use case and how we can make the improvements you suggest.
> 
> > Given that "extern __shared__" means "get me a pointer to the
> > dynamically-allocated shared memory for this kernel," using a
> > non-array / non-pointer type would be...odd?
> > 
> 
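
For readers following along, here is a minimal sketch of the array form
of "extern __shared__" described in the quote above. The names
reverse_kernel, run_reverse and tile are hypothetical and do not come
from this thread. The sketch assumes a single block of at most 1024
threads and omits error checking for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: reverses n ints in place using dynamic shared memory.
__global__ void reverse_kernel(int *data, int n) {
    // Unsized extern array: refers to the dynamically allocated shared
    // memory whose size is given by the third launch parameter.
    extern __shared__ int tile[];
    int i = threadIdx.x;
    if (i < n) tile[i] = data[i];
    __syncthreads();  // all threads reach this barrier, none skip it
    if (i < n) data[i] = tile[n - 1 - i];
}

// Hypothetical host helper. Assumes in.size() <= 1024 (one block).
std::vector<int> run_reverse(const std::vector<int> &in) {
    int n = static_cast<int>(in.size());
    std::vector<int> out(n);
    if (n == 0) return out;  // a launch with zero threads is invalid
    size_t bytes = n * sizeof(int);
    int *d = nullptr;
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, in.data(), bytes, cudaMemcpyHostToDevice);
    // Third launch parameter: bytes of dynamic shared memory per block.
    reverse_kernel<<<1, n, bytes>>>(d, n);
    cudaMemcpy(out.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    return out;
}
```

Calling run_reverse on a short vector should return its elements in
reverse order. The shared buffer is never defined anywhere; its size
comes only from the launch configuration, which is why the array form
needs no definition at link time.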
> I believe the difference is whether the cuda code is being compiled
> in whole-program or separate compilation modes. The following
> section covers the case I described for separate compilation mode,
> which is what I'm doing:
> https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#device-memory-qualifiers
> 
> "When compiling in the separate compilation mode (see the nvcc user
> manual for a description of this mode), __device__, __shared__, and
> __constant__ variables can be defined as external using the extern
> keyword. nvlink will generate an error when it cannot find a
> definition for an external variable (unless it is a dynamically
> allocated __shared__ variable)."
> 
> Can we add a flag in Clang-Cuda to indicate separate compilation
> mode?

I'd definitely like to see this happen. I have users for whom this capability is important.

 -Hal

> 
> Could you point me to patches/code that I can look at to understand
> the implications of separate compilation? What LLVM optimizations
> benefit from whole-program compilation mode? What, if anything,
> breaks if I use it for separate compilation?
> 
> Thanks in advance for your time.
> 
> Arpith

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


