[cfe-dev] Proposal: CUDA support; outline, initial patches

Peter Collingbourne peter at pcc.me.uk
Sat Nov 20 21:58:20 PST 2010


Hi,

This is intended to outline how we could add CUDA support to Clang,
feedback appreciated.  I've also attached patches with initial
progress.

Architecture
------------

The nvcc driver is somewhat idiosyncratic.  It first performs
a source-to-source transformation on the CUDA code which splits
the code into declarations targeted for the host and the device (a
set of type qualifiers controls this process) to produce a pair of
preprocessed source files.  The host file is compiled by the system
compiler (e.g. gcc) and the device source file is compiled to PTX
using a custom Open64-based compiler (nvopencc) and then compiled to
the nvidia target using ptxas.  At a later stage the resultant object
files are combined to produce a single executable.

With a Clang-based driver, we can do better, by parsing the source
file once to produce a single AST, and performing two separate
CodeGen passes to produce 2 modules: one targeted to the host and
the other to the device.  CodeGen would need to be taught to skip
certain declarations depending on their type qualifiers.

>From there, the process would be target specific, but for nvidia
targets, we can use our PTX backend -> ptxas to produce a cubin file
(actually an ELF format file).  Then parse the cubin and link it
into the host module in the same way as nvcc (which has yet to be
determined).

Clang Requirements
------------------

(references are to CUDA Programming Guide, version 3.0)

- lex/parse/sema/ast/codegen support for kernel call expression (B.13)
- lex/parse/sema/ast/codegen support for type qualifiers: (B.1 B.2)
    __device__ __global__ __host__ __constant__ __shared__
  - multi-pass/filtering codegen
- lex/parse/sema/ast/codegen support for launch bounds (B.14)
- support for built-in variable semantics (B.4.6)
- nvcc workalike driver
- others?

LLVM Requirements
-----------------

- (nvidia only) PTX backend
- (nvidia only) capability to parse cubin (object file library?)
- others?

Kernel calls
------------

nvcc translates kernel calls into equivalent calls to the CUDA
runtime API during the initial source-to-source transformation.
I propose that we do the same thing by default (at the CodeGen level)
while making this overridable for a specific target.

Progress so far
---------------

Kernel call AST node has been added and lex/parse/sema have been
modified to produce the node.  Patches attached are for review only
and not to be committed yet.  For one thing, they lack test cases.
For another, I'm highly bothered by the hardcoded type names in sema's
execution configuration handling code (patch 8), which stem from how
execution configurations are specified in the CUDA Programming Guide
(B.13).  Any suggestions for a better way to handle this would be
very welcome.

Next steps
----------

I intend to continue building out the Clang requirements.  Next steps
are codegen support for kernel call and parse/sema support for type
qualifiers.

Thanks,
-- 
Peter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-CUDA-language-option.patch
Type: text/x-diff
Size: 1164 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Lexer-add-CUDA-kernel-call-tokens.patch
Type: text/x-diff
Size: 2115 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0001.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Frontend-add-cuda-flag.patch
Type: text/x-diff
Size: 1615 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0002.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Parse-basic-support-for-parsing-CUDA-kernel-calls.patch
Type: text/x-diff
Size: 2783 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0003.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-AST-support-for-extra-subexpressions-on-CallExpr-sub.patch
Type: text/x-diff
Size: 3851 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0004.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0006-AST-add-CUDAKernelCallExpr.patch
Type: text/x-diff
Size: 9455 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0005.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0007-Sema-implement-Sema-PrepareArgument.patch
Type: text/x-diff
Size: 3467 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0006.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0008-Sema-support-for-building-CUDAKernelCallExpr-from-Ac.patch
Type: text/x-diff
Size: 6914 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0007.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0009-Parse-pass-execution-configuration-to-Sema.patch
Type: text/x-diff
Size: 1038 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0008.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0010-Sema-diagnostics-for-too-few-many-exec-config-args-t.patch
Type: text/x-diff
Size: 2664 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0009.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0011-Lex-support-for-CUDA-attributes.patch
Type: text/x-diff
Size: 2411 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101121/c4362b5c/attachment-0010.patch>


More information about the cfe-dev mailing list