[PATCH] D101630: [HIP] Fix device-only compilation

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri May 7 09:42:43 PDT 2021


yaxunl added a comment.

In D101630#2733761 <https://reviews.llvm.org/D101630#2733761>, @tra wrote:

> In D101630#2730273 <https://reviews.llvm.org/D101630#2730273>, @yaxunl wrote:
>
>> How about an option -fhip-bundle-device-output. If it is on, device output is bundled no matter how many GPU arch there are. By default it is on.
>
> +1 to the option, but I can't say I'm particularly happy about the default. I'd still prefer the default to be a no-bundling + an error in cases when we'd nominally produce multiple outputs.
> We could use a third opinion here.
>
> @jdoerfert : Do you have any thoughts on what would be a sensible default when a user uses `-S -o foo.s` for compilations that may produce multiple results? I think OpenMP may have to deal with similar issues.
>
> On one hand it would be convenient for ccache to just work with the CUDA/HIP compilation out of the box. Compiler always produces one output file, regardless of what it does under the hood and ccache may not care what's in it.
>
> On the other, this behavior breaks user expectations. I.e. `clang -S` is supposed to produce the assembly, not an opaque binary bundle blob.
> Using `-S` with multiple sub-compilations is also likely an error on the user's end and should be explicitly diagnosed, and that's how it currently works.
> Using `-fno-hip-bundle-device-output` to restore the expected behavior puts the burden on the wrong side, IMO.  I believe, it should be ccache which should be using `-fhip-bundle-device-output` to deal with the CUDA/HIP compilations.

I chose to emit the bundled output by default because it is the convention for a compiler to have one output. Compilation is like a pipeline: if we break it into stages, users expect to use the output of one stage as the input of the next. This is possible only if each stage has one output. It already holds for host compilations and for combined device/host compilations, so I would find it surprising if it did not hold for device compilation.

Also, when users do not want the output to be bundled, it is usually for debugging or other special purposes, and they then need to know the naming convention of the multiple outputs. I think it is justifiable to require an option to enable this.
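
To make the two behaviors concrete, here is a minimal Python sketch of the policy being discussed. It is a toy model, not the driver code in this patch: `device_only_outputs` and `MultipleOutputsError` are hypothetical names, and the error in the unbundled multi-arch case is assumed from the diagnostic behavior described in the quoted comment.

```python
from typing import List


class MultipleOutputsError(Exception):
    """Raised when one -o path cannot hold several unbundled device outputs."""


def device_only_outputs(archs: List[str], output: str, bundle: bool = True) -> List[str]:
    """Toy model of the proposed -fhip-bundle-device-output policy.

    Hypothetical helper for illustration only; it returns the list of
    files a device-only compilation would write for the given -o path.
    """
    if not archs:
        raise ValueError("at least one GPU arch is required")
    if bundle:
        # Proposed default: one bundled file, regardless of the arch count.
        return [output]
    if len(archs) == 1:
        # A single arch yields a single unbundled output.
        return [output]
    # Unbundled with several archs: a single -o cannot name them all.
    raise MultipleOutputsError(
        f"{len(archs)} device outputs cannot share one output file {output!r}"
    )
```

With bundling on, two archs and `-o foo.s` yield the single file `foo.s`, which a later stage can consume. With bundling off, the same request is rejected, so the caller has to deal with multiple outputs explicitly.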


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630


