[PATCH] D60620: [HIP] Support -offloading-target-id

Fri Apr 12 10:22:07 PDT 2019

tra added a subscriber: echristo.
tra added a comment.

It looks like you are solving two problems here.
a) you want to create multiple device passes for the same GPU, but with different options.
b) you may want to pass different compiler options to different device compilations.
The patch effectively hard-codes a {gpu, options} tuple into each --offloading-target-id variant.
Is that correct?

This looks essentially the same as your previous patch D59863 <https://reviews.llvm.org/D59863>.

We have a limited way to deal with (b), but there's currently no way to deal with (a).

For (a), I think, the real problem is that until now we've assumed that there's only one device-side compilation per target GPU arch. If we need multiple device-side compilations, we need a way to name them. Using `--offloading-target-id` as a superset of `--cuda-gpu-arch` is OK with me. However, I'm on the fence about the option serving double duty by also setting magic compiler flags. On one hand, that's what the driver does, so it may be OK. On the other hand, it's unnecessarily strict. I.e., if we provide the ability to create multiple compilation passes for the same GPU arch, why limit that to changing only those hard-coded options? A general approach would provide a way to create more than one device-side compilation and to pass arbitrary compiler options only to *that* compilation. This will also help solve a number of issues we have right now, where some host-side compilation options break device-side compilation and we have to work around that by filtering some of them out in the driver.

A while back @echristo and I discussed how it could be handled in a more generic way.
IIRC we ended up with a strawman proposal that looked roughly like this:

Currently we have rudimentary -Xarch_smXX options implemented for various toolchains in the driver.
E.g. for HIP: https://github.com/llvm-mirror/clang/blob/master/lib/Driver/ToolChains/HIP.cpp#L341
We want to generalize it and make it less awkward to use. One way to do it would be to introduce a `-Xarch TARGET` flag where the option(s) following the flag would apply only to the compilation for that particular target. `TARGET` could have special values like `HOST`, `DEVICE`, and `ALL`, which would widen the option scope to host/device/all compilations. The `-Xarch` flag could be either sticky (all following options are affected by it, until the next -Xarch option) or only affect one option (the way -X options work now). Make the option parser aware of the current compilation target, and it should be fairly straightforward to control compilation options for a particular target.
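
To make the sticky vs. one-option distinction concrete, here is a rough sketch in Python rather than driver code. Everything in it is an assumption for illustration: the `-Xarch TARGET` flag is only proposed above, `scope_options` is a hypothetical helper, options seen before any `-Xarch` are assumed to fall into the `ALL` scope, and each option is treated as a single token (so the non-sticky mode would need extra care for options that take a separate value).

```python
from collections import defaultdict


def scope_options(argv, sticky=True):
    """Group options by the scope chosen with a hypothetical `-Xarch TARGET` flag.

    sticky=True:  every option after `-Xarch T` belongs to T until the next -Xarch.
    sticky=False: only the single token right after `-Xarch T` belongs to T.
    Options outside any -Xarch scope are assumed to apply to "ALL".
    """
    scoped = defaultdict(list)
    current = "ALL"      # scope that the next option will be filed under
    one_shot = False     # True when the scope must reset after one option
    it = iter(argv)
    for arg in it:
        if arg == "-Xarch":
            target = next(it, None)
            if target is None:
                raise ValueError("-Xarch requires a TARGET argument")
            current = target
            one_shot = not sticky
            continue
        scoped[current].append(arg)
        if one_shot:
            # Non-sticky mode: fall back to the default scope after one option.
            current = "ALL"
            one_shot = False
    return dict(scoped)


if __name__ == "__main__":
    args = ["-O2", "-Xarch", "gfx906", "-mxnack", "-g", "-Xarch", "HOST", "-fopenmp"]
    print(scope_options(args, sticky=True))
    print(scope_options(args, sticky=False))
```

With the sticky reading, `-g` lands in the `gfx906` scope; with the one-option reading it falls back to `ALL`, which is the main user-visible difference between the two designs.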

We could add `--offloading-target-id=X` to create and name a device-side compilation and then use that name in `-Xarch X` or `-Xtarget X` to pass appropriate options.
`--cuda-gpu-arch=GPU` would be treated as `--offloading-target-id=GPU -mcpu GPU`. If we had something like that, then your goal could be achieved with something like this:

  ... --offloading-target-id=foo -Xtarget foo -mcpu gfx906 ....
  ... --offloading-target-id=bar -Xtarget bar -mcpu gfx906 -mxnack -msram-ecc
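
A minimal sketch of how the two command lines above might be turned into named device-side compilations, again in Python and not in terms of the real driver. The assumptions: `-Xtarget` is sticky, `--cuda-gpu-arch=GPU` expands to a compilation named `GPU` with `-mcpu GPU`, options outside any `-Xtarget` scope are shared, and `plan_device_compilations` is a hypothetical name.

```python
def plan_device_compilations(argv):
    """Return (compilations, shared) for a proposed, not yet existing, CLI.

    compilations: dict mapping a compilation name to its private options.
    shared:       options seen outside any `-Xtarget NAME` scope.
    """
    compilations = {}
    shared = []
    current = None  # name of the compilation currently receiving options
    it = iter(argv)
    for arg in it:
        if arg.startswith("--offloading-target-id="):
            # Create and name a device-side compilation.
            name = arg.split("=", 1)[1]
            compilations.setdefault(name, [])
        elif arg.startswith("--cuda-gpu-arch="):
            # Assumed shorthand: --offloading-target-id=GPU plus -mcpu GPU.
            gpu = arg.split("=", 1)[1]
            compilations.setdefault(gpu, []).extend(["-mcpu", gpu])
        elif arg == "-Xtarget":
            name = next(it, None)
            if name not in compilations:
                raise ValueError(f"-Xtarget refers to unknown target: {name!r}")
            current = name
        elif current is not None:
            compilations[current].append(arg)
        else:
            shared.append(arg)
    return compilations, shared


if __name__ == "__main__":
    cmd = [
        "-O2",
        "--offloading-target-id=foo", "-Xtarget", "foo", "-mcpu", "gfx906",
        "--offloading-target-id=bar", "-Xtarget", "bar",
        "-mcpu", "gfx906", "-mxnack", "-msram-ecc",
    ]
    comps, shared = plan_device_compilations(cmd)
    for name, opts in comps.items():
        print(name, opts)
    print("shared", shared)
```

Both `foo` and `bar` target gfx906 here but end up as separate compilations with their own option lists, which is the naming problem from (a).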

We could also provide target aliases for the 'standard' offloading targets, so users do not have to type *all* options specific to the target, but would still have a way to override them.

This would also give us a flexible way to avoid passing some host-only options to device-side compilation without having to hard code every special case.

That may be a somewhat larger chunk of work.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60620/new/

https://reviews.llvm.org/D60620