[Openmp-dev] Multi arch deviceRTL status

Wed Dec 11 17:33:22 PST 2019

On 12/12, Jon Chesterfield wrote:
> The amdgcn deviceRTL needs a shim around atomics and a copy of libcall.cu
> to be broadly functional. That seems minor.

Minor, agreed. The "copy" part is fine only if copy means you split it
into common and target code, and call the target code through
target_impl.h from the common code ;)
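
A minimal sketch of that split, written as a single host-side C++ file so
it can be compiled anywhere. The names `target_impl::atomic_add` and
`next_ticket` are hypothetical, and `std::atomic` only stands in for
whatever intrinsic a real nvptx or amdgcn `target_impl.h` would wrap; this
is not the actual deviceRTL interface.

```cpp
#include <atomic>
#include <cstdint>

// --- hypothetical target_impl.h: one copy per target -------------------
// Assumption: each target provides the same function names with its own
// implementation. Here a host std::atomic stands in for a device atomic.
namespace target_impl {
// Atomically adds val to addr and returns the previous value.
inline uint32_t atomic_add(std::atomic<uint32_t> &addr, uint32_t val) {
  return addr.fetch_add(val, std::memory_order_seq_cst);
}
} // namespace target_impl

// --- common code: shared by all targets --------------------------------
// Only ever reaches target-specific behaviour through target_impl.
inline uint32_t next_ticket(std::atomic<uint32_t> &counter) {
  return target_impl::atomic_add(counter, 1u);
}
```

With this layout, bringing up a second target would mean supplying another
`target_impl` implementation rather than duplicating the common file.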

> There's some refactoring work going on in the aomp branch to reduce the
> libraries it depends on. The nvptx/cuda openmp needs an entire second
> toolchain installed. I don't want that to be true for amdgcn as well.

Sounds good, though that is a secondary goal (IMHO). If we get support
up and running, people will install another toolchain ;)

> The hsa plugin is about 1200 lines total, already working, with a few
> outstanding todos and stylistic improvements available. Ron is looking at
> the todos at present. I'd be equally happy to iterate on that in tree -
> it's not really code that can be used for other architectures so making it
> beautiful isn't strictly necessary. It may also get reimplemented in terms
> of a different underlying API at some point next year.

That sounds fair. Reusing plugin code was never a top priority.

> Aside from that... it's down to the clang/llvm support, and how much
> customisation it takes to target nvptx & amdgcn from the same code path.
> Hopefully the differences largely lie in the runtime. I need to pull down a
> copy of your patches and see what needs to be tweaked to get a second gpu
> target working.

I will try to put a rebase of TRegion IRBuilder stuff on phab tomorrow.
We can start with unit tests to trigger it. The runtime patches should
apply but the interface is not "up to date".

> Getting support in prior to the clang fork would make me happy. Up for
> working pretty long days to hit that. After the Christmas party tomorrow at
> least :)

Same here, though no Christmas party ;)

> One hazard - the runtime makes use of function pointers, which the llvm
> amdgcn back end (i.e. llc) doesn't support. We inline very aggressively so
> that mostly works out anyway, but there are a couple of places that route
> the function pointer through memory (reduction iirc), and the aomp work
> around for that is not pretty. I'm looking for better options.

The better option is TRegion reductions, which never made it to Phab.
Function pointers will stay for some things, but we won't need to store
them, which makes constant propagation and inlining easy. Let's talk
about that later though.
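
To illustrate the distinction between a function pointer that is stored
and one that is only passed along, here is a small host-side C++ sketch.
It is not the TRegion design; `ReductionState`, `reduce_via_memory`, and
`reduce_direct` are hypothetical names chosen for the example.

```cpp
#include <cstddef>

using ReduceFn = int (*)(int, int);

// Pattern described as a hazard: the pointer is written to memory and
// loaded back later, so the compiler must prove what was stored before
// it can turn the indirect call into a direct one.
struct ReductionState {
  ReduceFn fn;
};

inline int reduce_via_memory(const ReductionState &s, int init,
                             const int *data, std::size_t n) {
  int acc = init;
  for (std::size_t i = 0; i < n; ++i)
    acc = s.fn(acc, data[i]); // indirect call through a loaded pointer
  return acc;
}

// Alternative: the pointer is only ever a plain argument. Once this
// function is inlined into a caller that passes a known function, the
// callee is a constant and the call can become direct.
inline int reduce_direct(ReduceFn fn, int init, const int *data,
                         std::size_t n) {
  int acc = init;
  for (std::size_t i = 0; i < n; ++i)
    acc = fn(acc, data[i]);
  return acc;
}

inline int add(int a, int b) { return a + b; }

// Caller with a statically known combiner.
inline int sum(const int *data, std::size_t n) {
  return reduce_direct(add, 0, data, n);
}
```

Both variants compute the same result; the difference is only in how much
work the optimizer has to do to remove the indirect call.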