[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Tobias Grosser
tobias at grosser.es
Sun Apr 29 06:37:06 PDT 2012
On 04/29/2012 01:21 AM, Justin Holewinski wrote:
>
>
> On Sat, Apr 28, 2012 at 8:27 AM, Tobias Grosser <tobias at grosser.es> wrote:
> regalloc= is different. It is global and consequently influences
> both host and device code generation. However, to me it is rather a
> debugging option. It is never set by clang and targets provide a
> reasonable default based on the optimization level. I believe we can
> assume that for our use case it is not set. In case it is really
> necessary to explicitly set the register allocator, the right
> solution would be to make regalloc a target option.
>
>
> The regalloc= option was just an example of the types of flags that can
> be passed to llc, which are handled as global options instead of target
> options.
Yes, thanks for pointing out this problem. For now I think we can
ignore such global options: they are mostly debugging options, and they
can be moved into the target options if needed.
> The implicit assumption seems to be that the host code wants the device
> code as assembly text. What happens when you need to link the device
> binary and upload it separately? Think automatic SPU codegen on Cell.
> Is it up to the host program to invoke the other target's linker?
OK, I get what you mean. The intrinsic is currently targeted at the
OpenCL/CUDA model, which is the most widely used one. Something like Cell
sounds interesting, but probably needs further thought. Even with
OpenCL/CUDA, the intrinsic currently works only for PTX code generation,
but I hope we can gain support for other GPU devices later on.
> I agree that future work can be useful here. However, before
> spending a large amount of time to engineer a complex solution, I
> propose to start with the proposed light-weight approach. It is
> sufficient for our needs and will allow us to get the experience and
> infrastructure that can help us to choose and implement a more
> complex one later on.
>
>
> I agree that this approach is the best way to get short-term results,
> especially for the GSoC project.
OK, let's go ahead.
Yabin, can you update the patch with the following changes:
- Remove the Arch flag
- Document that we require a triple
- Add two new arguments that take a feature string and an mcpu
value (either can be set to "", which means we use the default)
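To make the intended argument handling concrete, here is a rough sketch of the rule "triple is required, empty mcpu/feature string means default". It is not taken from the patch: the names CodegenArgs, ResolvedArgs and resolveCodegenArgs are hypothetical, and the real intrinsic may pass and check its arguments quite differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical bundle of the string arguments discussed above.
// None of these names come from the actual patch.
struct CodegenArgs {
    std::string triple;    // required target triple
    std::string mcpu;      // "" means: use the target's default CPU
    std::string features;  // "" means: use the target's default features
};

// Result of applying the "empty string means default" rule.
struct ResolvedArgs {
    std::string triple;
    std::string mcpu;
    std::string features;
    bool useDefaultCPU;
    bool useDefaultFeatures;
};

// Returns false if the required triple is missing, true otherwise.
bool resolveCodegenArgs(const CodegenArgs &args, ResolvedArgs &out) {
    if (args.triple.empty())
        return false;  // the triple is mandatory, there is no fallback
    out.triple = args.triple;
    out.mcpu = args.mcpu;
    out.features = args.features;
    out.useDefaultCPU = args.mcpu.empty();
    out.useDefaultFeatures = args.features.empty();
    return true;
}

// Small usage example: only the triple is given, so both defaults apply.
inline void demoResolve(const std::string &triple) {
    CodegenArgs args{triple, "", ""};
    ResolvedArgs resolved;
    if (resolveCodegenArgs(args, resolved)) {
        assert(resolved.useDefaultCPU);
        assert(resolved.useDefaultFeatures);
    }
}
```

The point of keeping the defaults as explicit booleans in this sketch is that the caller can tell "not set" apart from a deliberately chosen value before handing anything to the target.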
Cheers
Tobi