[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

Fri Jun 30 07:41:19 PDT 2017

gtbercea added a comment.

In https://reviews.llvm.org/D34784#795988, @hfinkel wrote:

> In https://reviews.llvm.org/D34784#795980, @gtbercea wrote:
>
> > In https://reviews.llvm.org/D34784#795934, @hfinkel wrote:
> >
> > > In https://reviews.llvm.org/D34784#795871, @gtbercea wrote:
> > >
> > > > In https://reviews.llvm.org/D34784#795367, @hfinkel wrote:
> > > >
> > > > > In https://reviews.llvm.org/D34784#795353, @gtbercea wrote:
> > > > >
> > > > > > In https://reviews.llvm.org/D34784#795287, @hfinkel wrote:
> > > > > >
> > > > > > > What happens if you have multiple targets? Maybe this should be -fopenmp-targets-arch=foo,bar,whatever?
> > > > > > >
> > > > > > > Once this all lands, please make sure that you add additional test cases here. Make sure that the arch is passed through to the ptx and cuda tools as it should be. Make sure that the defaults work. Make sure that something reasonable happens if the user specifies the option more than once (if they're all the same).
> > > > > >
> > > > > >
> > > > > > Hi Hal,
> > > > > >
> > > > > > At the moment only one arch is supported and it would apply to all the target triples under -fopenmp-targets.
> > > > > >
> > > > > > I was planning to address the multiple archs problem in a future patch.
> > > > > >
> > > > > > I am assuming that in the case of multiple archs, each arch in -fopenmp-targets-arch=A1,A2,A3 will bind to a corresponding triple in -fopenmp-targets=T1,T2,T3 like so: T1 with A1, T2 with A2 etc. Is this a practical interpretation of what should happen?
> > > > >
> > > > >
> > > > > Yea, that's what I was thinking. I'm a bit concerned that none of this generalizes well. To take a step back, under what circumstances do we support multiple targets right now?
> > > >
> > > >
> > > > We allow -fopenmp-targets to get a list of triples. I am not aware of any limitations in terms of how many of these triples you can have. Even in the test file of this patch we have the following: "-targets=openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu,host-powerpc64le--linux"
> > > >
> > > > > 
> > > > > 
> > > > >> Regarding tests: more tests can be added as a separate patch once offloading is enabled by the patch following this one (i.e. https://reviews.llvm.org/D29654). There actually is a test in https://reviews.llvm.org/D29654 where I check that the arch is passed to ptxas and nvlink correctly using this flag. I will add some more test cases to cover the other situations you mentioned.
> > > > > 
> > > > > Sounds good.
> > > > > 
> > > > >> Thanks,
> > > > >> 
> > > > >> --Doru
> > > >
> > > > In our previous solution there might be a problem.  The same triple might be used multiple times just so that you can have several archs in the other flag (T1 and T2 being the same). There are some alternatives which I have discussed with @ABataev.
> > > >
> > > > One solution could be to associate an arch with each triple to avoid positional matching of triples in one flag with archs in another flag:
> > > >
> > > >   -fopenmp-targets=T1:A1,T2,T3:A2
> > > >
> > > >
> > > > ":A1" is optional, also, in the future, we can pass other things to the toolchain such as "-L/a/b/c/d":
> > > >
> > > >   -fopenmp-targets=T1:A1: -L/a/b/c/d,T2,T3:A2
> > > >
> > >
> > >
> > > Okay, good, this is exactly where I was going when I said I was worried about generalization. -march seems like one of many flags I might want to pass to the target compilation. Moreover, it doesn't seem special in what regard.
> > >
> > > We have -Xclang and -mllvm, etc. to pass flags through to other stages of compilation. Could we do something similar here? Maybe something like: ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr7``. That's unfortunately long, but if there's only one target, we could omit the triple?
> >
> >
> > The triple could be omitted, absolutely.
> >
> > If you have the following:
> >
> > -fopenmp-targets=openmp-powerpc64le-ibm-linux-gnu ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr7`` ``-Xopenmp-target:openmp-powerpc64le-ibm-linux-gnu -march=pwr8``
> >
> > This would end up having a toolchain called for each one of the -Xopenmp-target sets of flags even though a single triple was specified under the -fopenmp-targets. Would this be ok?
>
>
> Why? That does not sound desirable. And could you even use these multiple outputs? I think you'd want to pass all of the arguments for each target triple to the one toolchain invocation for that target triple. Is that possible?

I agree, I don't think that is something we want (i.e. having one triple lead to two toolchains), with the current flags you can't do that today. I wanted to check with you though that's why i mentioned it.

I think appending all options for a particular triple together is more desirable.

> 
> 
>> 
>> 
>>> 
>>> 
>>>> An actual example:
>>>> 
>>>>   -fopenmp-targets=nvptx64-nvidia-cuda:sm_35,openmp-powerpc64le-ibm-linux-gnu

Repository:
  rL LLVM

https://reviews.llvm.org/D34784