[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Justin Holewinski justin.holewinski at gmail.com
Tue May 8 08:08:20 PDT 2012


On Tue, May 8, 2012 at 2:20 AM, Tobias Grosser <tobias at grosser.es> wrote:

> On 05/08/2012 12:14 AM, dag at cray.com wrote:
>
>> Tobias Grosser <tobias at grosser.es> writes:
>>
>>>> I forgot to address this one.  With current OpenCL and CUDA
>>>> specifications, there's no need to do multiple .o files.  In my mind,
>>>> llc should output one .o (one .s, etc.).  Anything else wreaks havoc on
>>>> build systems.
>>>>
>>>
>>> Yes, that's what I am advocating for. There is no need for all this
>>> complexity. Both standards store the embedded code as a string in the
>>> host module. That is exactly what the llvm.codegen intrinsic
>>> models. It requires zero further changes to the code generation
>>> backend.
>>>
>>
>> But why do you need an intrinsic to do that?  Just generate the code to
>> a file and suck it into a string, maybe with an external "linker" tool.
>>
>> If you just want something to work, that should be sufficient.  If you
>> want some long-term design/implementation I don't think llvm.codegen is
>> it.
>>
>
> OK. I think we are on the same track. Yes, there is no need for a lot of
> infrastructure. Storing PTX as a string in the host module is the only
> thing needed.
>
> So why the intrinsic? I want to create the PTX string from an LLVM-IR
> optimizer pass that should be loadable into clang, dragonegg, opt, ...
> An LLVM-IR optimizer pass does not have access to the file system, and it
> cannot link to the LLVM back ends to directly create PTX. Creating PTX in
> an optimizer pass would be an ugly hack. The cleaner solution is to store
> an LLVM-IR string in the host module and to mark it with the llvm.codegen()
> intrinsic. When the module is processed by the backend, the string is
> automatically translated to PTX. This requires no additional file writing,
> introduces no layering violations and seems to be very simple.
>
> I don't see a better way to translate LLVM-IR to PTX. Do you still believe
> introducing file writing to an optimizer module is a good and portable
> solution?
>

Until any new infrastructure is implemented, I don't see it being any worse
of a solution.  Don't get me wrong, I think the llvm.codegen() intrinsic is
a fast way to get things up and running for the GSoC project; but I also
agree with Dan and Evan that it's not appropriate for LLVM mainline. There
are just too many subtle details and this really only handles the case of
host code needing the device code as text assembly.

To support opt-level transforms, you could just embed the generated IR as
text in the module, then invoke a separate tool to extract it into a
separate module.  The more I think about this, the more I become convinced
that we could benefit from a module "container," similar to a Mac
fat/universal binary.  Something like this probably wouldn't be too hard to
implement; the main problem I see is what llc outputs, or maybe a single
llc invocation would only process one module in the container.
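
To make the container idea concrete, here is a rough sketch in Python. It is
purely illustrative and not an existing LLVM format or tool: the magic
number, the length-prefixed layout, the function names, and the target
triples in the usage note are all assumptions made up for this example. It
packs several IR modules, keyed by target triple, into one blob, and lets a
tool pull out the single module it wants to process.

```python
import struct

# Hypothetical magic number for this sketch; not a real LLVM file format.
MAGIC = b"MODC"


def pack_container(modules):
    """Pack a {target_triple: ir_text} dict into a single byte string.

    Layout (all assumed for illustration): magic, module count, then for
    each module two little-endian u32 lengths followed by the triple and
    the IR text, both UTF-8 encoded.
    """
    out = [MAGIC, struct.pack("<I", len(modules))]
    for triple, ir_text in modules.items():
        triple_bytes = triple.encode("utf-8")
        ir_bytes = ir_text.encode("utf-8")
        out.append(struct.pack("<II", len(triple_bytes), len(ir_bytes)))
        out.append(triple_bytes)
        out.append(ir_bytes)
    return b"".join(out)


def unpack_container(blob):
    """Return the {target_triple: ir_text} dict stored in a container blob."""
    if blob[:4] != MAGIC:
        raise ValueError("not a module container")
    (count,) = struct.unpack_from("<I", blob, 4)
    offset = 8
    modules = {}
    for _ in range(count):
        triple_len, ir_len = struct.unpack_from("<II", blob, offset)
        offset += 8
        triple = blob[offset:offset + triple_len].decode("utf-8")
        offset += triple_len
        modules[triple] = blob[offset:offset + ir_len].decode("utf-8")
        offset += ir_len
    return modules


def select_module(blob, triple):
    """Extract the one module a single tool invocation would process."""
    return unpack_container(blob)[triple]
```

For example, a host module and a device module could be packed together
with pack_container({"x86_64-unknown-linux-gnu": host_ir,
"nvptx64-nvidia-cuda": device_ir}), and a per-target tool run would then
call select_module(blob, "nvptx64-nvidia-cuda") to get just the device IR.
Whether one invocation should handle one module or all of them is exactly
the open question above; this sketch only shows the one-module case.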




>
> Cheers
> Tobi
>



-- 

Thanks,

Justin Holewinski