[cfe-dev] [RFC] Unified offloading option for CUDA/HIP/OpenMP
Artem Belevich via cfe-dev
cfe-dev at lists.llvm.org
Mon Dec 14 14:04:43 PST 2020
On Mon, Dec 14, 2020 at 12:20 PM Doerfert, Johannes <jdoerfert at anl.gov> wrote:
> Hi Sam, thanks for driving this, I really like the idea!
> Here are some thoughts:
> Make the "kind" optional if it can be deduced, as Alexey noted. So
> should work fine if we have an -x c and -fopenmp set. Error out if
> it is ambiguous.
> Allow multiple -offload occurrences.
> Keep the support of the old ways for now as well.
> Allow passing the kind + triple + arch as the first part of the new -offload
> flag and any options as a second part, so:
> works but also does
> -offload="amd-gfx906 -fvectorize" -x hip
> as well as
> -offload="amd -march=gfx906 -fno-vectorize" -fopenmp
> This will make it way easier to use.
Naming things consistently is hard. We need to consider that we'll need to
pass an arbitrarily complex set of options for each offload instance,
whatever it may be. There may be a lot of options per offload instance,
differing only minimally between instances. Having to repeat all of them
will be tedious at best. We may also need to pass options further down the
compilation stack. E.g. we may want different ptxas options for each CUDA
Perhaps we could enhance the option parser to create a notion of argument
scope? The "arguments in a string" approach sort of does this already, in a
limited way, but it would still need CLI parser changes to handle the
If we implement one scope level, making it hierarchical should not be that
Having such a CLI model would allow us to do things like this:
--offload=hip-gfx9* --something-common-to-all-gfx9xx targets>
--offload-end 2 // pops two levels of CLI scopes. 1 level if no argument is
The identifier could be a regex/glob match on an arbitrary string. It does
not need to carry any specific parameters itself. It just needs to be
meaningful enough that the parts of the code that care about a particular
scope can provide their own 'scope string' to match against.
That is, for the example above, the HIP toolchain would set its CLI scope
string to hip-gfx999. There would be an implicit top-level '--offload=.*'
scope which always matches, and the parser would then reparse the options,
taking into account only the matching scopes. This could allow us to specify
both OMP and CUDA/HIP options for the same compilation -- we could conceivably
benefit from OMP offload to multiple threads in the host-side compilation.
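
To make the scoping idea concrete, here is a rough sketch of how such a
filter could behave. It is only an illustration under assumptions, not
anything clang implements: the function name select_options, the placeholder
flags, and the exact push/pop rules are all hypothetical. It assumes that
--offload=<glob> opens a scope, that --offload-end [N] closes N scopes (one
if N is omitted), and that an option is visible to a component only when
every enclosing glob matches that component's scope string.

```python
import fnmatch


def select_options(argv, scope_string):
    """Return the options visible to a component with the given scope string.

    Hypothetical semantics (an assumption, not clang behaviour):
      --offload=<glob>    opens a new scope
      --offload-end [N]   closes N scopes (1 if N is omitted)
    An option is kept only if every enclosing glob matches scope_string.
    Options outside any scope sit in the implicit top-level scope, which
    always matches.
    """
    prefix = "--offload="
    stack = []      # globs of the scopes that are currently open
    selected = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith(prefix):
            stack.append(arg[len(prefix):])
        elif arg == "--offload-end":
            count = 1
            # An optional numeric argument says how many scopes to close.
            if i + 1 < len(argv) and argv[i + 1].isdigit():
                count = int(argv[i + 1])
                i += 1
            if count > len(stack):
                raise ValueError("--offload-end closes more scopes than are open")
            del stack[len(stack) - count:]
        elif all(fnmatch.fnmatchcase(scope_string, glob) for glob in stack):
            selected.append(arg)
        i += 1
    return selected


# Placeholder flag names, purely for illustration.
ARGV = [
    "-O2",
    "--offload=hip-gfx9*", "--common-to-gfx9",
    "--offload=hip-gfx906", "--only-for-gfx906",
    "--offload-end", "2",
    "-g",
]
```

With this sketch, a component that asks with the scope string "hip-gfx906"
would see all four options, "hip-gfx900" would see everything except
--only-for-gfx906, and an unrelated scope string would see only -O2 and -g.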
It all may be an utter overkill, too. WDYT?
> I hope some of these make some sense :)
> ~ Johannes
> On 12/12/20 8:11 AM, Liu, Yaxun (Sam) wrote:
> > [AMD Public Use]
> > Currently CUDA/HIP and OpenMP have different offloading options, e.g.
> > clang++ -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx900 test.cpp
> > clang++ -offload-arch=gfx906 test.hip
> > Our users have asked for a concise way to specify offloading options
> > for OpenMP. Ideally, one option would convey the offloading kind, the
> > offloading triple, and the offloading device arch.
> > On the other hand, the current offloading option for CUDA/HIP has some
> > limitations:
> > 1. It does not specify the offloading kind and instead relies on the file
> > type to infer it. If the input file is not CUDA/HIP source code (e.g.
> > bundled LLVM bitcode), there needs to be a way to specify the offloading kind.
> > 2. It does not specify the offloading target triple and instead relies on
> > the device arch to infer it. As HIP is ported to different targets, there
> > needs to be a way to specify the offloading target triple.
> > In summary, a unified offloading option is preferred, which conveys the
> > offloading kind, offloading target triple and offloading device arch.
> > I would like to propose to either have a new option or extend the
> > existing -offload-arch option for that, in the format kind-triple-arch, e.g.
> > -offload=omp-amd-gfx900
> > -offload=hip-amd-gfx906
> > Here kind and triple can be abbreviated for conciseness, e.g. omp
> > expands to openmp, amd expands to amdgcn-amd-amdhsa. Arch can be omitted,
> > in which case clang will use the default arch for the triple.
> > Your feedback is welcome.
> > Thanks.
> > Sam
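
For reference, the kind-triple-arch format proposed in the quoted message
could be decoded roughly as follows. This is a sketch under assumptions, not
the proposed implementation: the function name parse_offload and the alias
tables are hypothetical, and only the two abbreviations mentioned in the
proposal (omp and amd) are included. A missing arch is returned as None,
standing in for "use the default arch for the triple".

```python
# Abbreviations taken from the proposal text; the tables themselves are
# an illustrative assumption.
KIND_ALIASES = {"omp": "openmp"}
TRIPLE_ALIASES = {"amd": "amdgcn-amd-amdhsa"}


def parse_offload(value):
    """Split 'kind-triple[-arch]' into (kind, triple, arch or None)."""
    kind, sep, rest = value.partition("-")
    if not sep or not rest:
        raise ValueError(f"expected kind-triple[-arch], got {value!r}")
    kind = KIND_ALIASES.get(kind, kind)

    # A full triple contains hyphens itself, so match it (or its
    # abbreviation) as a known prefix instead of splitting on '-'.
    spellings = list(TRIPLE_ALIASES.items())
    spellings += [(triple, triple) for triple in TRIPLE_ALIASES.values()]
    for spelling, triple in spellings:
        if rest == spelling:
            return kind, triple, None           # arch omitted
        if rest.startswith(spelling + "-"):
            return kind, triple, rest[len(spelling) + 1:]
    raise ValueError(f"unknown offloading triple in {value!r}")
```

Under these assumptions, parse_offload("omp-amd-gfx900") yields
("openmp", "amdgcn-amd-amdhsa", "gfx900"), and parse_offload("hip-amd")
yields ("hip", "amdgcn-amd-amdhsa", None).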