[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

Mon Jul 10 12:11:19 PDT 2017

hfinkel added inline comments.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:474
+    for (StringRef Opt : OptList) {
+      AddMArchOption(DAL, Opts, Opt);
+    }
----------------
gtbercea wrote:
> hfinkel wrote:
> > gtbercea wrote:
> > > hfinkel wrote:
> > > > Shouldn't you be adding all of the options, not just the -march= ones?
> > > I thought that that would be the case but there are a few issues:
> > > 
> > > 1. PTXAS and NVLINK each use a different flag for specifying the arch, and, in turn, each flag is different from -march.
> > > 
> > > 2. -Xopenmp-target passes a flag to the entire toolchain not to individual components of the toolchain so a translation of flags is required in some cases to adapt the flag to the actual tool. -march is one example, I'm not sure if there are others.
> > > 
> > > 3. At this point in the code, in order to add a flag and its value to the DAL list, I need to be able to specify the option type (i.e. options::OPT_march_EQ). I therefore need to manually recognize the flag in the string representing the value of -Xopenmp-target or -Xopenmp-target=triple.
> > > 
> > > 4. This patch handles the passing of the arch and can be extended to pass other flags (as is stands, no other flags are passed through to the CUDA toolchain). This can be extended on a flag by flag basis for flags that need translating to a particular tool's flag. If the flag doesn't need translating then the flag and it's value can be appended to the command line as they are.
> > > 1. PTXAS and NVLINK each use a different flag for specifying the arch, and, in turn, each flag is different from -march.
> > 
> > I don't understand why this is relevant. Don't NVPTX::Assembler::ConstructJob and NVPTX::Linker::ConstructJob handle that in either case?
> > 
> > This seems to be the same comment to point 2 as well.
> > 
> > > 3. At this point in the code, in order to add a flag and its value to the DAL list, I need to be able to specify the option type (i.e. options::OPT_march_EQ). I therefore need to manually recognize the flag in the string representing the value of -Xopenmp-target or -Xopenmp-target=triple.
> > 
> > I don't understand why this is true. Doesn't the code just below this, which handles -Xarch, do the general thing (it calls Opts.ParseOneArg and then adds it to the list of derived arguments)? Can't we handle this like -Xarch?
> > 
> > > This patch handles the passing of the arch and can be extended to pass other flags (as is stands, no other flags are passed through to the CUDA toolchain). This can be extended on a flag by flag basis for flags that need translating to a particular tool's flag. If the flag doesn't need translating then the flag and it's value can be appended to the command line as they are.
> > 
> > I don't understand this either. If we need to extend this on a flag-by-flag basis, then it seems fundamentally broken. How could we append a flag to the command line without it also affecting the host compile?
> @hfinkel 
> 
> The problem is that when using -Xopenmp-target=<triple> -opt=val the value of this flag is a list of two strings:
> 
> ['<triple>', '-opt=val']
> 
> What needs to happen next is to parse the string containing "-opt=val". The reason I have to do this is because if I use -march, I can't pass -march as is to PTXAS and NVLINK which have different flags for expressing the arch. I need to translate the -march=sm_60 flag. I will have to do this for all flags which require translation. There is no way I can just append this string to the PTXAS and NVLINK commands because the flags for the 2 tools are different. A flag which works for one of them, will not work for the other. 
> 
> So I need to actually parse that value to check whether it is a "-march" and create an Arg object with the OPT_march_EQ identifier and the sm_60 value. When invoking the commands for PTXAS and NVLINK, the dervied arg list will be travered and every -march=sm_60 option will be transformed into "--gpu-name" "sm_60" for PTXAS and into "-arch" "sm_60" for NVLINK. 
> 
> In the case of -Xarch, you will see that after we have traversed the entire arg list we still have to special case -march and add it is manually added to the DAL.
> 
> Let me know your thoughts on this.
> 
> Thanks,
> 
> --Doru
> What needs to happen next is to parse the string containing "-opt=val". 

Yes, that's what ParseOneArg will do.

> The reason I have to do this is because if I use -march, I can't pass -march as is to PTXAS and NVLINK which have different flags for expressing the arch. I need to translate the -march=sm_60 flag. I will have to do this for all flags which require translation. There is no way I can just append this string to the PTXAS and NVLINK commands because the flags for the 2 tools are different. A flag which works for one of them, will not work for the other.

> So I need to actually parse that value to check whether it is a "-march" and create an Arg object with the OPT_march_EQ identifier and the sm_60 value. When invoking the commands for PTXAS and NVLINK, the dervied arg list will be travered and every -march=sm_60 option will be transformed into "--gpu-name" "sm_60" for PTXAS and into "-arch" "sm_60" for NVLINK.

We still seem to be talking past each other.  Maybe I'm misreading the code, but it looks like TranslateArgs is called (by Compilation::getArgsForToolChain) and the *translated arguments* are what are processed by NVPTX::Assembler::ConstructJob (for ptxas) and void NVPTX::Linker::ConstructJob to construct the command lines for the relevant tools. So, while I understand that those tools take specific arguments, their respective ConstructJob routines are still responsible for doing the tool-specific translation (as they do currently). Thus, I believe you can just add all arguments here and they'll be interpreted by each tool's ConstructJob function later as necessary.

> In the case of -Xarch, you will see that after we have traversed the entire arg list we still have to special case -march and add it is manually added to the DAL.

Yes, but not way you seem to imply. The Xarch march handling special case is only doing this:

  if (!BoundArch.empty()) {
    DAL->eraseArg(options::OPT_march_EQ);
    DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ), BoundArch);
  }

and that's just overriding the current derived -march if we have BoundArch set (and this special case would apply in addition to the proposed logic for -Xopenmp-target as well).

https://reviews.llvm.org/D34784