[PATCH] D141717: [Clang] Only emit textual LLVM-IR in device only mode

Fri Jan 13 15:49:41 PST 2023

jhuber6 added a comment.

In D141717#4052986 <https://reviews.llvm.org/D141717#4052986>, @tra wrote:

> In D141717#4052824 <https://reviews.llvm.org/D141717#4052824>, @jhuber6 wrote:
>
>> For `-E` we don't embed anything,
>
> That was just an exaggerated example of top-level options affecting sub-compilation output where you can't magically tweak it to produce the kind of output your sub-compilation needs.
>
> The fundamental problem I have is that the compiler should not magically fix what the user has specified. "-S -emit-llvm" has a very specific meaning, and it's not "produce binary IR". However, when we're dealing with 'interesting' compilation pipelines like CUDA, things get complicated, as in your example. Presumably the user wants the compiler to produce textual IR for the "top-level" compilation. In the case of CUDA that would be host-side IR (with an embedded GPU binary, which should still be a *binary*). The fact that in your case that binary happens to be binary LLVM IR is an implementation detail. Without the new driver it should be the real GPU fatbin. Hence my argument that what we want is to avoid passing "-S -emit-llvm" (and potentially other output-altering options) to the GPU sub-compilation, which would work regardless of whether the GPU binary is a fatbin or LLVM bitcode.
> For a compilation with --cuda-device-only, the GPU compilation is the top-level compilation, so it will get all the right options to make it produce textual IR and stop after that.
>
> Does it make sense?

So are you suggesting that we complete the whole pipeline for the device? That is, `-S -emit-llvm` gives textual host IR, but the device compilation goes all the way to an object?
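
To make the question concrete, here is a minimal sketch of the behavior as I read the proposal. It is a toy Python model, not driver code: `OUTPUT_ALTERING` and `split_args` are hypothetical names that do not exist in Clang, and the set of "output-altering" options is assumed to be just `-S` and `-emit-llvm` for illustration.

```python
# Hypothetical model of the proposal; none of these names exist in Clang.
# Assumption: only these two options count as "output-altering".
OUTPUT_ALTERING = {"-S", "-emit-llvm"}


def split_args(args):
    """Return (host_args, device_args) for one CUDA compilation.

    host_args is None when there is no host compilation at all.
    """
    if "--cuda-device-only" in args:
        # The GPU compilation is the top-level compilation here, so it
        # keeps every option and stops after emitting textual IR.
        return None, list(args)
    # Otherwise the host compilation is top-level and keeps the options,
    # while the device sub-compilation drops them and runs to a binary.
    device_args = [a for a in args if a not in OUTPUT_ALTERING]
    return list(args), device_args


host, device = split_args(["-S", "-emit-llvm", "foo.cu"])
print("host:  ", host)
print("device:", device)
```

Under this reading, a plain `-S -emit-llvm` invocation yields textual IR only on the host side, and textual device IR is available only with `--cuda-device-only`.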

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141717/new/

https://reviews.llvm.org/D141717