[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Tue May 8 11:50:17 PDT 2012

On 05/08/2012 07:08 PM, Justin Holewinski wrote:
> On Tue, May 8, 2012 at 9:29 AM, <dag at cray.com> wrote:
>
>     Tobias Grosser <tobias at grosser.es> writes:
>
>      > So why the intrinsic? I want to create the PTX string from an LLVM-IR
>      > optimizer pass, that should be loaded into clang, dragonegg, opt, ..
>
>     You want to codegen in the optimizer?  I'm confused.
>
>
>      > An LLVM-IR optimizer pass does not have access to the file system and
>      > it can not link to the LLVM back ends to directly create PTX. Creating
>      > PTX in an optimizer pass would be an ugly hack.
>
>     So you _don't_ want to codegen in the optimizer.  Now I'm really
>     confused.
>
>
> The device code IR would be generated in the optimization pass, and
> codegen'd when the host module is codegen'd.
>
> The word "codegen" is overloaded here, as we're talking about IR codegen
> during optimization, and device codegen during host codegen.  Confusing,
> no? :)

Correct.

>      > The cleaner solution is to store an LLVM-IR string in the host module
>      > and to mark it with the llvm.codegen() intrinsic. When the module is
>      > processed by the backend, the string is automatically translated to
>      > PTX. This requires no additional file writing, introduces no layering
>      > violations and seems to be very simple.
>
>     Why do you need to store IR in a string?  It's already in the IR file or
>     you can put it into another file.  All you need is an _external_ tool to
>     drive llc to process and codegen these multiple files (to multiple
>     targets) and then another tool to suck up the accelerator code into a
>     string in the host assembly file.  Then you assemble into an object.
>
>     No IR changes and you end up with one object file.  No changes to build
>     systems at all, it's all handled by a driver.
>
>     llvm.codegen is completely unnecessary.
>
>
> I believe the point Tobias is trying to make is that he wants to retain
> the ability to pipe modules between tools and not worry about the
> modules ever hitting disk, e.g.
>
> opt -load GPUOptimizer.so -gpu-opt | llc -march=x86
>
> where the module coming in to opt is just unoptimized host code, and the
> module coming out of opt has embedded GPU IR.

True.

> The llvm.codegen() does solve this problem, but at the cost of too much
> ambiguity.

Can this be solved by documentation changes?

Tobi