[PATCH] D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading
Gheorghe-Teodor Bercea via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Jun 29 14:36:02 PDT 2017
gtbercea added a comment.
In https://reviews.llvm.org/D34784#795367, @hfinkel wrote:
> In https://reviews.llvm.org/D34784#795353, @gtbercea wrote:
> > In https://reviews.llvm.org/D34784#795287, @hfinkel wrote:
> > > What happens if you have multiple targets? Maybe this should be -fopenmp-targets-arch=foo,bar,whatever?
> > >
> > > Once this all lands, please make sure that you add additional test cases here. Make sure that the arch is passed through to the ptx and cuda tools as it should be. Make sure that the defaults work. Make sure that something reasonable happens if the user specifies the option more than once (if they're all the same).
> > Hi Hal,
> > At the moment only one arch is supported and it would apply to all the target triples under -fopenmp-targets.
> > I was planning to address the multiple archs problem in a future patch.
> > I am assuming that in the case of multiple archs, each arch in -fopenmp-targets-arch=A1,A2,A3 will bind to a corresponding triple in -fopenmp-targets=T1,T2,T3 like so: T1 with A1, T2 with A2 etc. Is this a practical interpretation of what should happen?
> Yea, that's what I was thinking. I'm a bit concerned that none of this generalizes well. To take a step back, under what circumstances do we support multiple targets right now?
-fopenmp-targets takes a list of triples. I am not aware of any limit on how many triples you can pass. Even the test file in this patch has the following: "-targets=openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu,host-powerpc64le--linux"
> > Regarding tests: more tests can be added as a separate patch once offloading is enabled by the patch following this one (i.e. https://reviews.llvm.org/D29654). There actually is a test in https://reviews.llvm.org/D29654 where I check that the arch is passed to ptxas and nvlink correctly using this flag. I will add some more test cases to cover the other situations you mentioned.
> Sounds good.
There might be a problem with the positional solution above: to target several archs for one triple, the same triple would have to be repeated in -fopenmp-targets just so that each arch in the other flag has something to bind to (for example, T1 and T2 being the same triple). I have discussed some alternatives with @ABataev.
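To make the positional binding concrete, here is a rough Python sketch (not code from the patch; the helper name bind_archs_positionally is made up for illustration, and T1/A1 are the same placeholders used above). It pairs the i-th triple with the i-th arch and shows that getting two archs for one triple means listing that triple twice:

```python
def bind_archs_positionally(targets, archs):
    """Pair the i-th triple with the i-th arch.

    Hypothetical helper: `targets` stands for the value of -fopenmp-targets and
    `archs` for the value of -fopenmp-targets-arch, both comma-separated.
    """
    triples = targets.split(",")
    arch_list = archs.split(",")
    if len(triples) != len(arch_list):
        # Assumption: a count mismatch is treated as an error in this sketch.
        raise ValueError("expected one arch per triple")
    return list(zip(triples, arch_list))


# Two archs for the same triple force the triple to be repeated.
pairs = bind_archs_positionally("T1,T1,T3", "A1,A2,A3")
print(pairs)
```

Whether a length mismatch should be an error or fall back to a default arch is an open choice; the sketch simply rejects it.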
One solution could be to associate an arch with each triple to avoid positional matching of triples in one flag with archs in another flag:
":A1" is optional. Also, in the future, we could pass other things to the toolchain, such as "-L/a/b/c/d":
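A rough Python sketch of how a per-triple list with an optional ":A1" suffix could be split. This is not code from the patch: the helper name parse_targets_with_archs and the default_arch fallback are assumptions for illustration, and it only handles the arch suffix, not any extra toolchain arguments.

```python
def parse_targets_with_archs(value, default_arch=None):
    """Split a comma-separated list of 'triple' or 'triple:arch' entries.

    Hypothetical helper. Returns (triple, arch) pairs; entries without an
    arch suffix get `default_arch`.
    """
    result = []
    for entry in value.split(","):
        # partition() splits on the first ':'; sep is '' when there is none.
        triple, sep, arch = entry.partition(":")
        if not triple:
            raise ValueError("empty triple in %r" % value)
        result.append((triple, arch if sep and arch else default_arch))
    return result


# The same triple can appear with different archs without a second flag.
print(parse_targets_with_archs("T1:A1,T2,T1:A3"))
```

Because the arch travels with its triple, no positional matching against a second flag is needed.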
An actual example: