[PATCH] D110622: [HIPSPV][3/4] Enable SPIR-V emission for HIP

Mon Oct 4 01:24:03 PDT 2021

linjamaki added a comment.

In D110622#3030792 <https://reviews.llvm.org/D110622#3030792>, @tra wrote:

>> A Cuda GPU architecture ‘generic’ is added. The name is picked from the LLVM SPIR-V Backend. In the HIPSPV code path the architecture name is inserted to the bundle entry ID as target ID. Target ID is expected to be always present so a component in the target triple is not mistaken as target ID.
>
> How generic is 'generic'? If I understand the statement above correctly, it should probably reflect that it's specific to spir-v.
> If it's the only possible spir-v variant, then calling it`spir-v` might be more meaningful.
> If we expect to see other spir-v variants in the future it would allow us to clearly differentiate between them later. 
> E.g. `--offload=spirv-a,spirv-b`. It would be rather odd if we had to use `--offload=generic, spirv-b`.

In this patch the ‘generic’ is meant to be a processor model defined in the SPIR-V backend. Now to come to think of it a bit more, I think it should not be specific to the SPIR-V target but the target at hand if its backend defines one. What I’m seeing is that each entry in the CudaArch has a processor by the same name in the NVPTX and AMGPU backends.

If I need to set different processor other from the SPIR-V backend than what is set as the default in HIP compilation, I thought from the [1] it could be carried out with something like:

  --offload=spirv64 -Xoffload=spirv64 -march=other-spirv-cpu

[1]: https://lists.llvm.org/pipermail/cfe-dev/2020-December/067362.html

In D110622#3031010 <https://reviews.llvm.org/D110622#3031010>, @tra wrote:

>> --offload’ option, which is envisioned in [1], is added for specifying offload targets. This option is used to override default device target (amdgcn-amd-amdhsa) for HIP compilation for emitting device code as SPIR-V binary. The option is handled in getHIPOffloadTargetTriple().
>
> Can you elaborate on what exactly this option does and how it's intended to interact with the existing `--offload-arch`?

I think that the --offload-arch interaction question is for @yaxunl. What is being contributed here is a partial implementation for the unified offloading options. The --offload option in this patch is used to supply the offload device target triple (in HIP compilation mode) for retargeting the device code emission to SPIR-V instead of emitting HSA.

> In general a list of values, combined with the `getLastArg` will potentially be an issue if/when more than one list value will be supported.
> In a large build it's fairly common for the build infrastructure to set the default options and allowing users to extend/override them with *additional* options. `getLastArg` works great for scalar options, not so much for the lists.
> If an option is a list, modifying it requires prior knowledge of preceding values and that may not always be easy.
> E.g. a build configuration may be set to  target gfx900 and gfx908. If I want to *add* an option to target gfx1030, I would need to dig out the options for the currently-enabled architectures and specify all of them again. It's doable once, manually, but it does not scale if this option is expected to be regularly tweaked by the end user, as is the case with `--offload-arch`. If `--offload` is expected to have similar use patterns, you may need to consider allowing it to be adjusted per-list-element.

The use of getLastArg() is an oversight. I’ll fix it with getAllArgValues().

================
Comment at: clang/include/clang/Basic/Cuda.h:106
 static inline bool IsAMDGpuArch(CudaArch A) {
   return A >= CudaArch::GFX600 && A < CudaArch::LAST;
 }
----------------
tra wrote:
> Does this need to be adjusted to exclude SPIR-V? If so, you may want to add another enum range for SPIR-V.
> 
Didn't notice this. I'll fix this.

================
Comment at: clang/include/clang/Driver/Options.td:1136
+def offload_EQ : CommaJoined<["--"], "offload=">, Flags<[NoXarchOption]>,
+  HelpText<"Specify comma-separated list of offloading targets.">;
+
----------------
tra wrote:
> `comma-separated list of offloading targets.` is, unfortunately, somewhat ambiguous.
> Does it mean "how the offload will be done". I.e. HSA, OpenMP, SPIRV, CUDA? 
> Or does it mean specific hardware we need to generate the code for? 
> The code suggests it's a variant of the former, but the option description does not. 
> 
> E.g. `offload_arch_EQ ` also uses the term "offloading target" but with a different meaning.
> 
I’m not sure how to rephrase the option description to be more clear. In the [1] the `--offload` option is envisioned to be quite flexible/expressive - it can take in target triples, offload kinds, processors, aliases for processor sets, etc.

FYI, I have imagined that the `--offload` option would take in explicit offload kind and target triple combinations as the basis. For example, something like this:

```
--offload=hip-amdgcn-amd-amdhsa,openmp-x86_64-pc-linux-gnu
```

And top of that, there would be predefined strings/shortcuts/aliases that expand to the basic form. For example, 
`--offload=sm_70,openmp-host` could expand to something like:

```
--offload=cuda-nvptx64-nvidia-cuda,openmp-x86_64-pc-linux-gnu -Xoffload=cuda-nvptx64-nvidia-cuda -march=sm_70 ...

```
Then there is a feature as discussed in [1] that the offload kind can be dropped if it can be inferred by other means (e.g. from `-x hip` option). 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110622/new/

https://reviews.llvm.org/D110622