<div class="gmail_quote">On Tue, May 1, 2012 at 8:22 AM,  <span dir="ltr">&lt;<a href="mailto:dag@cray.com" target="_blank">dag@cray.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im">Justin Holewinski &lt;<a href="mailto:justin.holewinski@gmail.com">justin.holewinski@gmail.com</a>&gt; writes:<br>

<br>

>     I don't think the code base changes are all that bad.  We have a number<br>

>     of them to support generating code one function at a time rather than a<br>

>     whole module together.  They've been sitting around waiting for us to<br>

>     send them upstream.  It would be an easy matter to simply annotate each<br>

>     function with its target.  We don't currently do that because we never<br>

>     write out such IR files but it seems like a simple problem to solve to<br>

>     me.<br>

><br>

> If such changes are almost ready to be up-streamed, then great!<br>

<br>

</div>Just to clarify, the current changes simply allow a function to be<br>

completely processed (including asm generation) before the next function<br>

is sent to codegen.<br>

<div class="im"><br>

> It just seems like a fairly non-trivial task to actually implement<br>

> function-level target selection, especially when you consider function<br>

> call semantics, taking the address of a function, etc.<br>

<br>

</div>For something like PTX, runtime calls take care of the call semantics so<br>

it is either up to the user or the frontend to set up the runtime calls<br>

correctly.  We don't need to completely solve this problem.  Yet.  :)<br></blockquote><div><br></div><div>But there has to be some interface that allows an LLVM IR function from one architecture to get at the code or name of a function from another architecture.  This could be handled in the front-end, but it seems like we could design some abstraction.</div>
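<div><br></div><div>To make the question concrete, here is a rough sketch of the bookkeeping such an abstraction would need, written in plain Python rather than against any LLVM API.  Everything in it is hypothetical: the Function record, the per-function target field, and the "x86_64" / "ptx" labels are placeholders for illustration only.  It groups functions by their target annotation and lists every call that crosses a target boundary, which are exactly the references some interface would have to resolve.</div>

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Function:
    # Hypothetical stand-in for an IR function; not an LLVM class.
    name: str
    target: str  # assumed per-function target annotation
    callees: list = field(default_factory=list)


def partition_by_target(functions):
    """Group function names by their target annotation."""
    groups = defaultdict(list)
    for fn in functions:
        groups[fn.target].append(fn.name)
    return dict(groups)


def cross_target_refs(functions):
    """Return (caller, callee, callee_target) for calls that cross targets."""
    target_of = {fn.name: fn.target for fn in functions}
    refs = []
    for fn in functions:
        for callee in fn.callees:
            if target_of[callee] != fn.target:
                refs.append((fn.name, callee, target_of[callee]))
    return refs


# Illustrative module: two host functions and one device kernel.
functions = [
    Function("main", "x86_64", callees=["launch", "kernel"]),
    Function("launch", "x86_64"),
    Function("kernel", "ptx"),
]

partitions = partition_by_target(functions)
refs = cross_target_refs(functions)
```

<div>Each entry in refs is a place where the host side needs a name or handle for device code instead of an ordinary call, whether the front-end, the runtime, or a new IR construct ends up providing it.</div>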

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im"><br>

> If you have a global variable, what target "sees" it?  Does it need to<br>

> be annotated along with the function?<br>

<br>

</div>For a tool like llc, wouldn't it be simply a matter of changing<br>

TheTarget and reconstituting the various passes?  The changes we have<br>

waiting to upstream already allow us to reconstitute passes.  I<br>

sometimes use this to turn on/off debugging on a function-level basis.<br>

<br>

The way we've constructed our backend interface should just allow us to<br>

switch the target and reinitialize everything.  I'm sure I'm glossing<br>

over tons of details but I don't see a fundamental architectural problem<br>

in LLVM that would prevent this.<br></blockquote><div><br></div><div>Sorry, I meant global variables in the LLVM IR.  Are they valid for only one architecture in the IR module?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="im"><br>

> Can functions from two different targets share this pointer?<br>

<br>

</div>Again, in the case of PTX it's the runtime's responsibility to ensure<br>

this.  I agree passing pointers around complicates things in the general<br>

case but I also think it's a solvable problem.<br>

<div class="im"><br>

> For Yabin's use-case, the X86 portions need to be compiled to<br>

> assembly, or even an object file, while the PTX portions need to be<br>

> lowered to an assembly string and embedded in the X86 source (or<br>

> written to disk somewhere).<br>

<br>

</div>I think it's just a matter of switching to a different AsmWriter.  The<br>

PTX runtime can load objects from files.  The code doesn't have to be a<br>

string in the x86 object file.<br>

<div class="im"><br>

> If you're targeting Cell, in contrast, you'd want to compile both down<br>

> to object files.<br>

<br>

</div>I think we probably want to do that for PTX as well.<br></blockquote><div><br></div><div>Maybe, maybe not.  It may make sense to rely on run-time JIT'ing of the PTX.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="im"><br>

> For me, the bigger question is: do we extend the IR to support<br>

> multiple targets, or do we keep the one-target-per-module philosophy<br>

> and derive some other way of representing how the modules fit<br>

> together?  I can see pros and cons for both approaches.<br>

<br>

</div>Me too.<br>

<div class="im"><br>

> What if instead of per-function annotations, we implement something<br>

> like module file sections?  You could organize a module file into<br>

> logical sections based on target architecture.  I'm just throwing that<br>

> out there.<br>

<br>

</div>Do we allow more than one Module per file?  If not, that seems like an<br>

arbitrary limitation.  If we allowed that we could have each module<br>

specify a different target.<br></blockquote><div><br></div><div>That could work.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

                                 -Dave<br>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><br><div>Thanks,</div><div><br></div><div>Justin Holewinski</div><br>