[PATCH] D94337: Add cuda header type for cuh files

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jan 11 14:52:25 PST 2021


tra added a comment.

In D94337#2491269 <https://reviews.llvm.org/D94337#2491269>, @rgreenblatt wrote:

> My primary goal for this change was to allow for language servers and other tooling to properly handle cuda header files. From my understanding the way that language servers handle c++ header files is by compiling them with -xc++-header and -fsyntax-only. This is certainly true for ccls and it seems to be true for clangd.
> So this can be accomplished without actually being able to produce preprocessed output for cuda headers - it only requires handling the "-fsyntax-only" use case.

That would still require properly defined CUDA macros. CUDA in clang relies on various CUDA attributes, currently wrapped in `__{host|device|global|etc.}__` macros, in order to compile the code. Without them, `-fsyntax-only` will not give you correct results on most CUDA code.
Most likely you'll see tons of errors when the compiler sees `__device__` and has no idea what to do with it. Hence my suggestion that clang needs at least a minimal subset of CUDA headers providing the macros sufficient to convey the critical semantics of CUDA code.
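To illustrate (a hypothetical header, names made up), even a simple CUDA header stops parsing at the first declaration when treated as plain C++, and templated code only makes error recovery harder:

```
// my_kernels.cuh -- hypothetical example
#pragma once

// With -xc++-header -fsyntax-only, `__device__` and `__global__` are unknown
// identifiers, so every declaration below produces errors.
template <typename T>
__device__ T clamp01(T x) {
  return x < T(0) ? T(0) : (x > T(1) ? T(1) : x);
}

template <typename T>
__global__ void clamp01_kernel(T *data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;  // builtin vars are undeclared too
  if (i < n) data[i] = clamp01(data[i]);
}
```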

> A secondary goal was to make it so that header tab completion recognizes .cuh files.
> This change doesn't depend on the other changes - it only requires a minor edit to clang/lib/Sema/SemaCodeComplete.cpp.

Maybe. It depends on how well clang can recover from the errors induced by the unexpanded CUDA macros. The results could range from OK on simple code to rather bad if we fail to instantiate templated code.

> From my limited testing, the changes made so far are sufficient to allow for language servers to handle cuda headers.

Trivial CUDA headers -- maybe. I have doubts that it would work on something more interesting. E.g. try using it on https://github.com/NVIDIA/cutlass.

> clangd seems to just work (but I haven't tested it much), if --cuda-gpu-arch is supplied (and --cuda-path if needed) via compile_flags.txt.

Yes, if you provide the flags and use it on a complete .cu file, then the clang driver takes care of the CUDA magic and pre-includes the right headers so the file parses correctly.
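For reference, a hypothetical compile_flags.txt along those lines (the GPU arch and CUDA path are placeholders):

```
--cuda-gpu-arch=sm_70
--cuda-path=/usr/local/cuda
```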

If the flags are supplied, then the same process will work for header files, too. You should be able to process them with `-x cuda`, which will apply the same magic pre-include.
However, applying the same magic to everything with a CUDA header extension as the input is not the right thing to do, IMO, as that would not be what the end user expects.
In other words, the magic would be OK for tooling, but not for general use by default.
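For instance, a tool that already knows a file is a CUDA header could run something like this (the arch, path, and file name are placeholders):

```
clang++ -x cuda -fsyntax-only --cuda-gpu-arch=sm_70 --cuda-path=/usr/local/cuda my_kernels.cuh
```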

>> Also, AFAICT, there's no such thing as a canonical CUDA header file extension. Some projects use `.cuh`, some use `.cu.h`, some use just `.h` and other C++ extensions, so this change would only help with a subset of the headers.
>
> The way things are set up in Types.def, defining a new file type requires an extension. I think cuh is the best choice for this extension.

I'm OK with this part, if it's necessary in the end.

> Not having a canonical CUDA header file extension is unfortunate, but this could be addressed at the tooling level if desired. For example, one could imagine designating a subset of headers to be built with -xcuda-header via a regex or whitelist.
> As far as I know, ccls and clangd don't currently have a nice way of doing additional header specific compile commands, but I can't imagine this would be particularly difficult to implement. Another option would be to add -xcuda-header as a compiler flag for all headers as c++ headers should generally be valid cuda.

Addressing it at the tooling level looks like the right place to do it. Applying the CUDA magic to all C++ sources by default is probably not a good idea: all the extra declarations become observable, and C++ headers that check for CUDA-specific things may get confused. I can't think of a good way to tell whether a particular source uses CUDA extensions, other than by trying to compile it. But then again, what if it compiles in both CUDA and C++ mode? We would not know which mode the user intended without them explicitly telling us. We may need some sort of knob for what to assume if we can't figure it out any other way.
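As a sketch of what the tooling-level approach could look like -- assuming a clangd recent enough to support per-path config fragments, and with the path pattern and GPU arch as placeholders:

```
# .clangd -- hypothetical project config
If:
  PathMatch: .*\.cuh
CompileFlags:
  Add: [-xcuda, --cuda-gpu-arch=sm_70]
```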

>> In general, `--cuda-path` should probably be treated as a required argument. I do not see a practical way to make `properly handle cuh files without additional arguments` work reliably. For that we'd need Clang to carry all required CUDA headers and that's not going to happen any time soon. That said, it may be worth considering implementing a `stand-alone` CUDA compilation mode which would provide the bare minimum of CUDA headers without having to include CUDA SDK headers. **That** would be useful for various tooling scenarios where we may need to deal with CUDA sources/headers but where we do not have CUDA SDK available and/or do not have access to correct `--cuda-path`.
>
> For the language server use case this isn't necessarily much of a problem.
> ccls allows for configuring cuda specific arguments.
> If a compile_commands.json is used, figuring out the correct compiler flags for any type of header file is in general non-trivial.

Agreed. There may not be a single definitive answer in some cases, i.e. the same header could be used from both C++ and CUDA compilations. Ultimately we may need to make tooling aware that the same file may have multiple compiled forms. E.g. CUDA sources have multiple compilation instances -- one for the host and one per GPU we compile for. The AST for each instance is not necessarily identical. Usually it's close, so for now we're getting by with only the host-side compilation, but that may give us an incomplete picture. This is somewhat orthogonal to figuring out how to handle CUDA sources in principle, so for now let's stick with a 1:1 source-to-AST model. As for the CUDA vs. C++ choice, CUDA may indeed be the better default, as the CUDA-mode AST would usually be a superset of a C++ compilation of the same file.
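For example (a made-up header), the per-instance ASTs can legitimately differ whenever the code branches on CUDA-specific macros:

```
// scale.h -- hypothetical header used from both C++ and CUDA code
#pragma once

#ifdef __CUDACC__
#define MYLIB_HD __host__ __device__   // hypothetical project macro
#else
#define MYLIB_HD
#endif

MYLIB_HD inline float scale(float x) {
#ifdef __CUDA_ARCH__
  return __fmul_rn(x, 2.0f);  // device-only intrinsic: present only in the GPU-side AST
#else
  return x * 2.0f;            // body seen by the host-side and plain C++ instances
#endif
}
```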

Hmm. Looks like we'll need some sort of heuristic (or user knob) to tell us how to parse a header file in a project with mixed C++/CUDA code.

> I think ccls tries to find the closest match in filename using some sort of metric. 
> One could imagine trying to match cuda headers with cuda source file to get the correct values for --cuda-path and --cuda-gpu-arch.

This would work in some cases, but not in others. E.g. a header used from both C++ and CUDA sources.

All that said, I'm all for improving CUDA handling within the tooling. This patch will play its role, but I still believe it's a bit premature. It will make some use cases work, but I think this is a good opportunity to make tooling handle CUDA sources in a more consistent manner.

Here's my straw-man proposal:

- Create a new `lib/Headers/__clang_cuda_standalone_defs.h` and populate it with the bare minimum of macros and declarations: `__host__`, `__device__`, other attributes, builtin variables, `malloc/free/new/delete`, a handful of basic types like `dim3`/`uint3`, math functions, etc. Basically as much as we can without having to use the actual CUDA headers (a rough sketch follows this list).
- Teach tooling to pre-include that stand-alone header whenever it wants to parse a CUDA source but has no explicit flags pointing the compiler to the CUDA SDK. Pre-include the standard runtime wrapper header otherwise. This may be done in the driver itself as a fallback mechanism in case the CUDA SDK is not found. Not sure yet.
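Very roughly, the stand-alone header could start out along these lines (an illustrative sketch only, far from complete):

```
// __clang_cuda_standalone_defs.h -- illustrative sketch, not a real file.
// Bare minimum so that -fsyntax-only can parse CUDA sources without a CUDA SDK.
#pragma once

#define __host__      __attribute__((host))
#define __device__    __attribute__((device))
#define __global__    __attribute__((global))
#define __shared__    __attribute__((shared))
#define __constant__  __attribute__((constant))
#define __managed__   __attribute__((managed))
#define __launch_bounds__(...) __attribute__((launch_bounds(__VA_ARGS__)))

// A few basic types and builtin variables, heavily simplified. The real
// builtin variables are special, but plain declarations are enough for
// syntax-only parsing.
struct uint3 { unsigned x, y, z; };
struct dim3  { unsigned x = 1, y = 1, z = 1; };
extern const uint3 threadIdx, blockIdx;
extern const dim3 blockDim, gridDim;
```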

We may not need the special CUDA header type after that. A tooling heuristic should be able to detect whether it is dealing with CUDA sources better than clang can -- the compilation database plus user input should help. If/when tooling knows it is dealing with CUDA, it can tell clang so with `-x cuda`.

Does it make sense?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94337/new/

https://reviews.llvm.org/D94337


