[llvm-dev] [RFC] Heterogeneous LLVM-IR Modules

Tue Jul 28 12:25:43 PDT 2020

On Tue, Jul 28, 2020 at 12:07 PM Johannes Doerfert <
johannesdoerfert at gmail.com> wrote:

> [I removed all but the data layout question, that is an important topic]
> On 7/28/20 1:03 PM, Mehdi AMINI wrote:
>  >> TL;DR
>  >> -----
>  >>
>  >> Let's allow merging two LLVM-IR modules for different targets (with
>  >> compatible data layouts) into a single LLVM-IR module to facilitate
>  >> host-device code optimizations.
>  >>
>  >
>  > I think the main question I have is with respect to this limitation on
>  > the datalayout: isn't it too limiting in practice?
>  > I understand that this is much easier to implement in LLVM today, but it
>  > may get us into a fairly limited place in terms of what can be supported
>  > in the future.
>  > Have you looked into what it would take to have heterogeneous modules
>  > that have their own DL?
>
>
> Let me share some thoughts on the data layouts situation, not all of which
> are fully matured, but I guess we have to start somewhere:
>
> If we look at the host-device interface there has to be some agreement
> on parts of the datalayout, namely on what the data that the host sends
> over and expects back looks like. If I'm not mistaken, GPUs will match
> the host in things like padding, endianness, etc. because you cannot
> translate things "on the fly". That said, there might be additional
> "address spaces" on either side that the other one is not matching/aware
> of. Long story short, I think host & device need to, and in practice do,
> agree on the data layout of the address space they use to communicate.
>
> The above is for me a strong hint that we could use address spaces to
> identify/distinguish differences when we link the modules. However,
> there might be the case that this is not sufficient, e.g., if the
> default alloca address space differs. In that case I don't see a reason
> to not pull the same "trick" as with the triple. We can specify
> additional data layouts, one per device, and if you retrieve the data
> layout, or triple, you need to pass a global symbol as an "anchor". For
> all intraprocedural passes this should be sufficient as they are only
> interested in the DL and triple of the function they look at. For IPOs
> we have to distinguish the ones that know about the host-device calls
> and the ones that don't. We might have to teach all of them about these
> calls but as long as they are callbacks through a driver routine I don't
> even think we need to.
>
> I'm curious if you or others see an immediate problem with both a device
> specific DL and triple (optionally) associated with every global symbol.
>

Having a triple/DL per global symbol would likely solve everything; I
didn't gather from your original email that this was being considered.
If I understand correctly what you're describing, the DL on the Module
would be a "default" and we'd need to make the DL/triple APIs on the Module
"private" to force queries to go through an API on GlobalValue to get the
DL/triple?
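
A minimal sketch of that query path, written as a standalone toy model
rather than actual LLVM code. Every name in it (TargetInfo, the optional
per-global override, the friend relationship) is an assumption made up for
illustration and does not correspond to an existing LLVM API; the data
layout strings are shortened placeholders, not the full strings for these
targets. The module keeps a default triple/DL that is not reachable from
outside, and each global either carries its own pair or falls back to the
module default:

```cpp
// Toy model only: none of these classes are the real llvm::Module or
// llvm::GlobalValue; all names here are hypothetical.
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// A triple together with its data layout string.
struct TargetInfo {
  std::string Triple;
  std::string DataLayout;
};

class Module;

// A global symbol acts as the "anchor" for every triple/DL query.
class GlobalValue {
public:
  GlobalValue(const Module &Parent, std::string Name,
              std::optional<TargetInfo> Override = std::nullopt)
      : Parent(Parent), Name(std::move(Name)), Override(std::move(Override)) {}

  const std::string &getName() const { return Name; }
  const std::string &getTriple() const;
  const std::string &getDataLayout() const;

private:
  const TargetInfo &getTargetInfo() const;

  const Module &Parent;
  std::string Name;
  std::optional<TargetInfo> Override; // set only for device-specific globals
};

class Module {
public:
  explicit Module(TargetInfo Default) : Default(std::move(Default)) {}

private:
  // The default is deliberately private so that callers cannot ask the
  // module directly and must go through a GlobalValue.
  friend class GlobalValue;
  TargetInfo Default;
};

// Use the per-global override when present, else the module default.
inline const TargetInfo &GlobalValue::getTargetInfo() const {
  return Override ? *Override : Parent.Default;
}

inline const std::string &GlobalValue::getTriple() const {
  return getTargetInfo().Triple;
}

inline const std::string &GlobalValue::getDataLayout() const {
  return getTargetInfo().DataLayout;
}
```

With this shape, an intraprocedural pass would only ever call
getDataLayout() on the function it is looking at, so it would not need to
know whether the answer came from the module default or from a device
override.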

>
>
> ~ Johannes
>
>