[LLVMdev] [PROPOSAL] LLVM multi-module support
Duncan Sands
baldrick at free.fr
Thu Jul 26 04:23:24 PDT 2012
Hi Dmitry,
> In our project we combine regular binary code and LLVM IR code for kernels,
> embedded as a special data symbol of the ELF object. The LLVM IR for a kernel
> that exists at compile time is preliminary and may be optimized further at
> runtime (pointer analysis, Polly, etc.). During application startup, the
> runtime system builds an index of all kernel sources embedded in the
> executable. Host and kernel code interact by means of a special "launch" call,
> which does not simply optimize, compile, and execute the kernel, but first
> estimates whether that is worthwhile or whether it is better to fall back to
> the host code equivalent.
in your case it doesn't sound like any modifications to what a module can hold
are needed; it's more a question of building things on top of the existing
infrastructure.
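
For what it's worth, the runtime-side flow you describe can already be put
together from the existing C++ API. A minimal sketch only, assuming the kernel
IR is linked in as a blob delimited by hypothetical linker symbols
__kernel_ir_start/__kernel_ir_end (your actual lookup scheme will differ):

------------------------------------------------------------------------
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"
#include <memory>

// Hypothetical linker-provided bounds of the embedded kernel IR blob.
extern const char __kernel_ir_start[];
extern const char __kernel_ir_end[];

// Parse the embedded IR back into a Module; the runtime can then optimize
// and JIT it, or decide to fall back to the host code path.
std::unique_ptr<llvm::Module> loadEmbeddedKernel(llvm::LLVMContext &Ctx) {
  llvm::StringRef Blob(__kernel_ir_start,
                       __kernel_ir_end - __kernel_ir_start);
  llvm::SMDiagnostic Err;
  // parseIR accepts both bitcode and textual IR.
  auto Kernel =
      llvm::parseIR(llvm::MemoryBufferRef(Blob, "embedded_kernel"), Err, Ctx);
  if (!Kernel)
    Err.print("launch", llvm::errs());
  return Kernel;
}
------------------------------------------------------------------------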
> The proposal made by Tobias is very elegant, but it seems to address the case
> where the host and sub-architecture code exist at the same time. May I kindly
> point out that, in our experience, the really efficient, deeply specialized
> sub-architecture code may simply not exist at compile time, while the generic
> baseline host code always can.
I can't help feeling that Tobias is reinventing "tar", only upside down, and
rather than stuffing an archive inside modules he should be stuffing modules
inside an archive. But most likely I just completely failed to understand
where he's going.
Ciao, Duncan.
>
> Best,
> - Dima.
>
> 2012/7/26 Duncan Sands <baldrick at free.fr>
>
> Hi Tobias, I didn't really get it. Is the idea that the same bitcode is
> going to be codegen'd for different architectures, or is each sub-module
> going to contain different bitcode? In the latter case you may as well
> just use multiple modules, perhaps in conjunction with a scheme to store
> more than one module in the same file on disk as a convenience.
>
> Ciao, Duncan.
>
> > a couple of weeks ago I discussed with Peter how to improve LLVM's
> > support for heterogeneous computing. One weakness we (and others) have
> > seen is the absence of multi-module support in LLVM. Peter came up with
> > a nice idea for how to improve this. I would like to put this idea up for
> > discussion.
> >
> > ## The problem ##
> >
> > LLVM-IR modules can currently only contain code for a single target
> > architecture. However, there are multiple use cases where one
> > translation unit could contain code for several architectures.
> >
> > 1) CUDA
> >
> > CUDA source files can contain both host and device code. The absence of
> > multi-module support complicates adding CUDA support to clang, as clang
> > would need to perform multi-module compilation on top of a single-module
> > based compiler framework.
> >
> > 2) C++ AMP
> >
> > C++ AMP [1] contains - similarly to CUDA - both host code and device
> > code in the same source file. Even though C++ AMP is a Microsoft
> > extension, the use case itself is relevant to clang. It would be great
> > if LLVM provided infrastructure such that front-ends could easily
> > target accelerators. This would probably yield a lot of interesting
> > experiments.
> >
> > 3) Optimizers
> >
> > To fully automatically offload computations to an accelerator, an
> > optimization pass needs to extract the computation kernels and schedule
> > them as separate kernels on the device. Such kernels are normally
> > LLVM-IR modules for different architectures. At the moment, passes have
> > no way to create and store new LLVM-IR modules. There is also no way
> > to reference kernel LLVM-IR modules from a host module (which is
> > necessary to pass them to the accelerator run-time).
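> >
> > For illustration only (a rough sketch with made-up names, not part of
> > the proposal): today a pass that wants to hand a kernel to the run-time
> > essentially has to serialize the new module to a string and stash it in
> > the host module as an opaque global, which the run-time must later
> > re-parse.
> >
> > ------------------------------------------------------------------------
> > #include "llvm/IR/Constants.h"
> > #include "llvm/IR/GlobalVariable.h"
> > #include "llvm/IR/Module.h"
> > #include "llvm/Support/raw_ostream.h"
> > #include <string>
> >
> > // Embed the textual IR of KernelM into HostM as a private constant so
> > // that a host-side run-time call can look it up by name at execution
> > // time.
> > static llvm::GlobalVariable *embedKernel(llvm::Module &HostM,
> >                                          const llvm::Module &KernelM) {
> >   std::string IR;
> >   llvm::raw_string_ostream OS(IR);
> >   KernelM.print(OS, /*AAW=*/nullptr);
> >   OS.flush();
> >
> >   llvm::Constant *Init =
> >       llvm::ConstantDataArray::getString(HostM.getContext(), IR);
> >   return new llvm::GlobalVariable(HostM, Init->getType(),
> >                                   /*isConstant=*/true,
> >                                   llvm::GlobalValue::PrivateLinkage,
> >                                   Init, ".gpu_kernel_ir");
> > }
> > ------------------------------------------------------------------------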
> >
> > ## Goals ##
> >
> > a) No major changes to existing tools and LLVM based applications
> >
> > b) Human readable and writable LLVM-IR
> >
> > c) FileCheck testability
> >
> > d) Do not force a specific execution model
> >
> > e) Unlimited number of embedded modules
> >
> > ## Detailed Goals ##
> >
> > a)
> > o No changes should be required if a tool does not use multi-module
> > support. Each LLVM-IR file that is valid today should remain valid.
> >
> > o Major tools should support basic heterogeneous modules without large
> > changes. Some of the commands that should work after minor
> > adaptations:
> >
> > clang -S -emit-llvm -o out.ll
> > opt -O3 out.ll -o out.opt.ll
> > llc out.opt.ll
> > lli out.opt.ll
> > bugpoint -O3 out.opt.ll
> >
> > b) All (sub)modules should be directly human readable/writable.
> > There should be no need to extract single modules before modifying
> > them.
> >
> > c) The LLVM-IR generated from a heterogeneous multi-module should
> > easily be 'FileCheck'able. The same should be true if a multi-module
> > is the result of an optimization.
> >
> > d) In CUDA/OpenCL/C++ AMP, kernels are scheduled from within the host
> > code. This means arbitrary host code can decide under which
> > conditions kernels are scheduled for execution. It is therefore
> > necessary to reference individual sub-modules from within the host
> > module.
> >
> > e) CUDA/OpenCL allow compiling and scheduling an arbitrary number of
> > kernels. We do not want to put an artificial limit on the number of
> > modules in which they are represented. This means a single embedded
> > submodule is not enough.
> >
> > ## Non Goals ##
> >
> > o Modeling sub-architectures on a per-function basis
> >
> > Functions could be specialized for a certain sub-architecture. This is
> > helpful for having certain functions optimized, e.g. with AVX2 enabled,
> > while the program as a whole is compiled for a more generic
> > architecture. We do not address per-function annotations in this
> > proposal.
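> >
> > (For reference only, and outside the scope of this proposal: a rough
> > source-level illustration of the kind of per-function specialization
> > meant here, using the 'target' function attribute as supported by
> > GCC/Clang.)
> >
> > ------------------------------------------------------------------------
> > // Sketch: the surrounding TU is built for generic x86-64, but this one
> > // function is allowed to use AVX2 instructions.
> > __attribute__((target("avx2")))
> > void add8(const float *A, const float *B, float *C) {
> >   for (int i = 0; i < 8; ++i) // candidate for AVX2 auto-vectorization
> >     C[i] = A[i] + B[i];
> > }
> > ------------------------------------------------------------------------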
> >
> > ## Proposed solution ##
> >
> > To bring multi-module support to LLVM, we propose to add a new type
> > called 'llvmir' to LLVM-IR. It can be used to embed LLVM-IR submodules
> > as global variables.
> >
> > ------------------------------------------------------------------------
> > target datalayout = ...
> > target triple = "x86_64-unknown-linux-gnu"
> >
> > @llvm_kernel = private unnamed_addr constant llvmir {
> >   target triple = "nvptx64-unknown-unknown"
> >   define internal ptx_kernel void @gpu_kernel(i8* %Array) {
> >     ...
> >   }
> > }
> > ------------------------------------------------------------------------
> >
> > By default the global will be compiled to an LLVM-IR string stored in
> > the object file. We could also think about translating it to PTX or
> > AMD's HSA-IL, so that e.g. the PTX can be passed to a run-time library.
> >
> > From my point of view, Peter's idea allows us to add multi-module
> > support in a way that reaches the goals described above. However, to
> > properly design and implement it, early feedback would be valuable.
> >
> > Cheers
> > Tobi
> >
> > [1] http://msdn.microsoft.com/en-us/library/hh265137%28v=vs.110%29
> > [2] http://www.amd.com/us/press-releases/Pages/amd-arm-computing-innovation-2012june12.aspx