[llvm-dev] [RFC] Heterogeneous LLVM-IR Modules

Tue Jul 28 12:24:06 PDT 2020

On Tue, 28 Jul 2020 at 20:07, Johannes Doerfert via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Long story short, I think host & device need to, and in practice do,
> agree on the data layout of the address space they use to communicate.

You can design APIs that call functions on external hardware with a
completely different data layout; you just need to properly pack and
unpack the arguments and results. IIUC, that's what you call
"agree on the DL"?

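As a rough illustration of the pack/unpack step, here is a minimal
Python sketch. The two layouts (a little-endian host that pads for
alignment, a big-endian device with no padding) and the argument
record (an 8-bit flag plus a 32-bit count) are made up for the
example and don't describe any real target.

```python
import struct

# Hypothetical layouts for the same logical record {u8 flag; u32 count}.
HOST_FMT = "<BxxxI"   # host: little-endian, 3 pad bytes so count is 4-byte aligned
DEVICE_FMT = ">BI"    # device: big-endian, no padding


def host_to_device(host_bytes: bytes) -> bytes:
    """Unpack with the host layout, repack with the device layout."""
    flag, count = struct.unpack(HOST_FMT, host_bytes)
    return struct.pack(DEVICE_FMT, flag, count)


def device_to_host(device_bytes: bytes) -> bytes:
    """Inverse direction, e.g. for results coming back from the device."""
    flag, count = struct.unpack(DEVICE_FMT, device_bytes)
    return struct.pack(HOST_FMT, flag, count)


host_args = struct.pack(HOST_FMT, 1, 258)
device_args = host_to_device(host_args)
```

The two sides never share a layout here; they only agree on what the
record means at the API boundary.
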
In an LLVM module, with the single-DL requirement, this wouldn't work.
But if we had multiple named DLs, and attributes tagging functions and
globals with one of those DLs, then you could have multiple DLs in the
same module. As long as control flow never crosses from one DL to
another (except through specific API calls), it should be "fine".
However, this is hardly well defined and home to countless corner cases.
Using address spaces would work for addresses, but other type sizes and
alignments would have to be defined anyway, and then we're back to the
multiple-DL tags scenario.

Given that we're not allowing them to inline or interact, I wonder if
a "simpler" approach would be to allow more than one module per
"compile unit"? Those are some very strong quotes, mind you, but it
would "solve" the DL problem entirely. Since both modules are in
memory, perhaps even passing through different pipelines (CPU, GPU,
FPGA), we can do constant propagation, kernel specialisation and
strong DCE by identifying the contact points, but still treating them
as separate modules. In essence, it would be the same as having them
in the same module, but without having to juggle function attributes
and data layout compatibility issues.

The big question is, obviously, how many things would break if we had
two or more modules live at the same time. Global contexts would have
to be rewritten, but if each module goes through its own optimisation
pipeline, then the hardest part would be building the bridge between
them (call graph and other analyses) and keeping that up to date as all
modules walk through their pipelines, so that passes like constant
propagation can "see" through the module barrier.
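
A toy Python sketch of such a bridge, under heavy assumptions: each
module is reduced to a list of outgoing calls and a table of kernels,
and `None` stands for "not a known constant". The names `Module`,
`build_bridge`, `constant_params` and `saxpy` are invented for the
example and are not part of LLVM.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    name: str
    # Calls leaving this module: (callee name, argument tuple); None = unknown value.
    external_calls: list = field(default_factory=list)
    # Kernels defined in this module: kernel name -> parameter names.
    kernels: dict = field(default_factory=dict)


def build_bridge(modules):
    """Map each kernel to the argument tuples seen at every cross-module call site."""
    bridge = {k: [] for m in modules for k in m.kernels}
    for m in modules:
        for callee, args in m.external_calls:
            if callee in bridge:
                bridge[callee].append(args)
    return bridge


def constant_params(bridge, modules):
    """Return {kernel: {param: value}} for params that are one constant at all sites."""
    result = {}
    for m in modules:
        for kernel, params in m.kernels.items():
            known = {}
            for i, p in enumerate(params):
                values = {args[i] for args in bridge[kernel]}
                if len(values) == 1 and None not in values:
                    known[p] = values.pop()
            result[kernel] = known
    return result


host = Module("host", external_calls=[("saxpy", (1024, None)), ("saxpy", (1024, 2.0))])
device = Module("device", kernels={"saxpy": ("n", "alpha")})

before = constant_params(build_bridge([host, device]), [host, device])

# A host-side pass rewrites one call site, so the bridge has to be rebuilt.
host.external_calls[1] = ("saxpy", (512, 2.0))
after = constant_params(build_bridge([host, device]), [host, device])
```

In the first state `n` is the same constant at every contact point, so
the device pipeline could specialise the kernel on it. After the host
call site changes, that is no longer true, which is the "keep it up to
date" problem in miniature.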

cheers,
--renato