<div class="gmail_quote">On Tue, May 1, 2012 at 8:22 AM, <span dir="ltr">&lt;<a href="mailto:dag@cray.com" target="_blank">dag@cray.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">Justin Holewinski &lt;<a href="mailto:justin.holewinski@gmail.com">justin.holewinski@gmail.com</a>&gt; writes:<br>
<br>
> I don't think the code base changes are all that bad. We have a number<br>
> of them to support generating code one function at a time rather than a<br>
> whole module together. They've been sitting around waiting for us to<br>
> send them upstream. It would be an easy matter to simply annotate each<br>
> function with its target. We don't currently do that because we never<br>
> write out such IR files but it seems like a simple problem to solve to<br>
> me.<br>
><br>
> If such changes are almost ready to be up-streamed, then great!<br>
<br>
</div>Just to clarify, the current changes simply allow a function to be<br>
completely processed (including asm generation) before the next function<br>
is sent to codegen.<br>
<div class="im"><br>
> It just seems like a fairly non-trivial task to actually implement<br>
> function-level target selection, especially when you consider function<br>
> call semantics, taking the address of a function, etc.<br>
<br>
</div>For something like PTX, runtime calls take care of the call semantics so<br>
it is either up to the user or the frontend to set up the runtime calls<br>
correctly. We don't need to completely solve this problem. Yet. :)<br></blockquote><div><br></div><div>But there has to be some interface that allows an LLVM IR function from one architecture to get at the code or name of a function from another architecture. This could be handled in the front-end, but it seems like we could design some abstraction.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im"><br>
> If you have a global variable, what target "sees" it? Does it need to<br>
> be annotated along with the function?<br>
<br>
</div>For a tool like llc, wouldn't it be simply a matter of changing<br>
TheTarget and reconstituting the various passes? The changes we have<br>
waiting to upstream already allow us to reconstitute passes. I<br>
sometimes use this to turn on/off debugging on a function-level basis.<br>
<br>
The way we've constructed our backend interface should just allow us to<br>
switch the target and reinitialize everything. I'm sure I'm glossing<br>
over tons of details but I don't see a fundamental architectural problem<br>
in LLVM that would prevent this.<br></blockquote><div><br></div><div>Sorry, I meant global variables in the LLVM IR. Are they valid for only one architecture in the IR module?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im"><br>
> Can functions from two different targets share this pointer?<br>
<br>
</div>Again, in the case of PTX it's the runtime's responsibility to ensure<br>
this. I agree passing pointers around complicates things in the general<br>
case but I also think it's a solvable problem.<br>
<div class="im"><br>
> For Yabin's use-case, the X86 portions need to be compiled to<br>
> assembly, or even an object file, while the PTX portions need to be<br>
> lowered to an assembly string and embedded in the X86 source (or<br>
> written to disk somewhere).<br>
<br>
</div>I think it's just a matter of switching to a different AsmWriter. The<br>
PTX runtime can load objects from files. The code doesn't have to be a<br>
string in the x86 object file.<br>
<div class="im"><br>
> If you're targeting Cell, in contrast, you'd want to compile both down<br>
> to object files.<br>
<br>
</div>I think we probably want to do that for PTX as well.<br></blockquote><div><br></div><div>Maybe, maybe not. It may make sense to rely on run-time JIT'ing of the PTX.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im"><br>
> For me, the bigger question is: do we extend the IR to support<br>
> multiple targets, or do we keep the one-target-per-module philosophy<br>
> and derive some other way of representing how the modules fit<br>
> together? I can see pros and cons for both approaches.<br>
<br>
</div>Me too.<br>
<div class="im"><br>
> What if instead of per-function annotations, we implement something<br>
> like module file sections? You could organize a module file into<br>
> logical sections based on target architecture. I'm just throwing that<br>
> out there.<br>
<br>
</div>Do we allow more than one Module per file? If not, that seems like an<br>
arbitrary limitation. If we allowed that we could have each module<br>
specify a different target.<br></blockquote><div><br></div><div>That could work.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
-Dave<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><br><div>Thanks,</div><div><br></div><div>Justin Holewinski</div><br>