[cfe-dev] [LLVMdev] C++AMP -> OpenCL (NVPTX) prototype

Wed May 8 02:49:04 PDT 2013

Hi!

I'm sure there is great public interest in Clang being able to compile
C++AMP code. As for the restrictions being spelled as attributes rather than
keywords, I too would prefer the code to look more native, but we'll just
have to go with this for the time being. I have no clue how the compiler
works under the hood, but once these restrictions are implemented, it cannot
be much work to change how they are spelled in the source code.

There are a few things I don't understand and it would really rock if
someone could explain.

Clang compiles into LLVM IR, from which LLVM then generates platform-specific
(binary) code. In the present state of this feature, the IR is fed to LLVM
to create PTX, which can then be fed to the NV drivers pretty much directly
for execution. My question is, how can this be extended for optimal and
portable compilation? I take it that RadeonSI driver developers are putting
great effort <http://www.phoronix.com/scan.php?page=news_item&px=MTM2NTU>
into making an LLVM back-end. But how does this fit into the bigger picture?

Clang as a compiler should aim at translating C++AMP-decorated code to
LLVM IR in such a way that the decorations are represented in the IR. Then it
should be LLVM's job to turn this IR into OpenCL SPIR, which plays a role
similar to DX bytecode in the Microsoft implementation. At some point at run
time (I can't tell where the best place would be), this SPIR must be compiled
by the chosen OpenCL platform into either PTX (by the NV driver, not LLVM)
or ISA (by the AMD driver).

This is the neatest toolchain design that I can think of, but I do not
understand how Gallium comes into play if they are working on an LLVM
backend. Or is that backend only meant for optimizing their own shaders?

To make some corrections: address spaces do exist in C++AMP. All variables
declared in amp-restricted functions are __private as far as OpenCL is
concerned, all variables declared as "tile_static" are in the __local
address space, and memory inside a concurrency::array<T,N> is stored in
__global. The reason these might not be available for pointers is that the
AMP spec forbids storing pointers to such types. OpenCL has a related
restriction: a pointer to one address space cannot be assigned or cast to a
pointer to another address space, so address spaces cannot mingle. AMP
restricts storing such pointers altogether.
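
To make the mapping above concrete, here is a small sketch in plain C++. It
is not C++AMP and not Clang code; the enum AmpStorage and the function
opencl_address_space are names I made up for illustration, and all it does
is tabulate the three cases listed above:

```cpp
#include <string>

// Hypothetical classification of where a C++AMP object lives.
enum class AmpStorage {
    LocalInRestrictAmp,  // ordinary variable inside an amp-restricted function
    TileStatic,          // variable declared "tile_static"
    ArrayContents        // elements held by a concurrency::array<T,N>
};

// Returns the OpenCL address space qualifier that the text above pairs
// with each category. This is an illustration, not a compiler interface.
std::string opencl_address_space(AmpStorage s) {
    switch (s) {
        case AmpStorage::LocalInRestrictAmp: return "__private";
        case AmpStorage::TileStatic:         return "__local";
        case AmpStorage::ArrayContents:      return "__global";
    }
    return "";  // not reached for valid enumerators
}
```

A front-end would presumably attach this kind of information to the IR
rather than to strings, but the three-way split is the same.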

I'm very much interested in this project, as I see that either AMP or
something very similar will be the future of GPU parallelism after OpenCL,
which will continue to evolve and will most likely serve as a back-end to
AMP. Could we get a status update on what progress has been made (or is
planned)?
