[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

Thu Jul 6 16:17:59 PDT 2017

hfinkel added inline comments.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:474
+    for (StringRef Opt : OptList) {
+      AddMArchOption(DAL, Opts, Opt);
+    }
----------------
gtbercea wrote:
> hfinkel wrote:
> > Shouldn't you be adding all of the options, not just the -march= ones?
> I thought that that would be the case but there are a few issues:
> 
> 1. PTXAS and NVLINK each use a different flag for specifying the arch, and, in turn, each flag is different from -march.
> 
> 2. -Xopenmp-target passes a flag to the entire toolchain not to individual components of the toolchain so a translation of flags is required in some cases to adapt the flag to the actual tool. -march is one example, I'm not sure if there are others.
> 
> 3. At this point in the code, in order to add a flag and its value to the DAL list, I need to be able to specify the option type (i.e. options::OPT_march_EQ). I therefore need to manually recognize the flag in the string representing the value of -Xopenmp-target or -Xopenmp-target=triple.
> 
> 4. This patch handles the passing of the arch and can be extended to pass other flags (as is stands, no other flags are passed through to the CUDA toolchain). This can be extended on a flag by flag basis for flags that need translating to a particular tool's flag. If the flag doesn't need translating then the flag and it's value can be appended to the command line as they are.
> 1. PTXAS and NVLINK each use a different flag for specifying the arch, and, in turn, each flag is different from -march.

I don't understand why this is relevant. Don't NVPTX::Assembler::ConstructJob and NVPTX::Linker::ConstructJob handle that in either case?

This seems to be the same comment to point 2 as well.

> 3. At this point in the code, in order to add a flag and its value to the DAL list, I need to be able to specify the option type (i.e. options::OPT_march_EQ). I therefore need to manually recognize the flag in the string representing the value of -Xopenmp-target or -Xopenmp-target=triple.

I don't understand why this is true. Doesn't the code just below this, which handles -Xarch, do the general thing (it calls Opts.ParseOneArg and then adds it to the list of derived arguments)? Can't we handle this like -Xarch?

> This patch handles the passing of the arch and can be extended to pass other flags (as is stands, no other flags are passed through to the CUDA toolchain). This can be extended on a flag by flag basis for flags that need translating to a particular tool's flag. If the flag doesn't need translating then the flag and it's value can be appended to the command line as they are.

I don't understand this either. If we need to extend this on a flag-by-flag basis, then it seems fundamentally broken. How could we append a flag to the command line without it also affecting the host compile?

================
Comment at: test/Driver/openmp-offload.c:607
+
+// CHK-FOPENMP-EQ-TARGET: clang{{.*}} argument unused during compilation: '-Xopenmp-target=powerpc64le-ibm-linux-gnu -march=pwr8'
+
----------------
gtbercea wrote:
> hfinkel wrote:
> > I don't see why you'd check that the arguments are unused. They should be used. One exception might be that you might want to force -Xopenmp-target=foo to be unused if foo is not a currently-targeted offloading triple. There could be a separate test case for that.
> > 
> > Otherwise, I think you should be able to check the relevant backend commands, no? (something like where CHK-COMMANDS is used above in this file).
> Only the CUDA toolchain currently contains code which considers the value of the -Xopenmp-target flag. The CUDA toolchain is not capable of offloading until the next patch lands so any test for how the flag propagates to the CUDA toolchain will have to wait.
> 
> Passing a flag to some other toolchain again doesn't work because the other toolchains have not been instructed to look at this flag so they won't contain the passed flag in their respective command lines.
> 
> For a lack of a better test, what I wanted to show is that the usage of this flag doesn't throw an error such as unknown flag and is correctly recognized: "-Xopenmp-target=powerpc64le-ibm-linux-gnu -march=pwr8". 
> 
> 
> 
> Passing a flag to some other toolchain again doesn't work because the other toolchains have not been instructed to look at this flag so they won't contain the passed flag in their respective command lines.

I think, however, that we need to refactor this so that it works for all toolchains. If you convince me otherwise, then this will be fine as well :-)

https://reviews.llvm.org/D34784