[PATCH] D101630: [HIP] Fix device-only compilation

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon May 3 10:22:49 PDT 2021


tra added a subscriber: jdoerfert.
tra added a comment.

In D101630#2730273 <https://reviews.llvm.org/D101630#2730273>, @yaxunl wrote:

> How about an option -fhip-bundle-device-output. If it is on, device output is bundled no matter how many GPU arch there are. By default it is on.

+1 to the option, but I can't say I'm particularly happy about the default. I'd still prefer the default to be no bundling, plus an error in cases where we would nominally produce multiple outputs.
We could use a third opinion here.

@jdoerfert : Do you have any thoughts on what would be a sensible default when a user uses `-S -o foo.s` for compilations that may produce multiple results? I think OpenMP may have to deal with similar issues.

On one hand, it would be convenient for ccache to just work with CUDA/HIP compilation out of the box. The compiler would always produce one output file, regardless of what it does under the hood, and ccache may not care what's in it.

On the other hand, this behavior breaks user expectations: `clang -S` is supposed to produce assembly, not an opaque binary bundle.
Using `-S` with multiple sub-compilations is also likely an error on the user's end and should be explicitly diagnosed, which is how it currently works.
Requiring `-fno-hip-bundle-device-output` to restore the expected behavior puts the burden on the wrong side, IMO. I believe it should be ccache that passes `-fhip-bundle-device-output` to deal with CUDA/HIP compilations.
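
To make the two candidate defaults concrete, here is a minimal sketch of the policy being argued over. It is not the driver's code: `plan_device_outputs` and its parameters are hypothetical names invented for illustration, and it only models which files a device-only `-S -o foo.s` run would write.

```python
from typing import List


def plan_device_outputs(archs: List[str], output: str, bundle: bool) -> List[str]:
    """Hypothetical model of a device-only `-S -o <output>` compilation.

    `bundle` stands in for the proposed -fhip-bundle-device-output option;
    the function name and signature are assumptions, not clang APIs.
    """
    if not archs:
        raise ValueError("at least one GPU arch is required")
    if bundle:
        # Bundling: exactly one file, no matter how many GPU archs there are.
        return [output]
    if len(archs) > 1:
        # No bundling with several sub-compilations: a single -o cannot
        # name all the results, so diagnose instead of guessing.
        raise ValueError("cannot write multiple device outputs to a single -o file")
    # No bundling, one arch: the output is the plain assembly file.
    return [output]
```

With bundling off by default, a single-arch `-S` yields plain assembly and a multi-arch `-S` is diagnosed; a tool such as ccache would opt in by passing `bundle=True` in this model.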


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630


