<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">There are a lot of questions in your post, so I'll focus on the technical questions about specific IR passes in this first reply...</div><div class="gmail_quote">


<br></div><div class="gmail_quote">On 4 March 2014 15:17, Chandler Carruth <span dir="ltr">&lt;<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">


<div>On Tue, Mar 4, 2014 at 1:04 PM, Mark Seaborn <span dir="ltr">&lt;<a href="mailto:mseaborn@chromium.org" target="_blank">mseaborn@chromium.org</a>&gt;</span> wrote:<br>


</div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">Some background:  There are two related use cases for these IR simplification passes:<br>

</div></blockquote></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div><br></div>

<div> 1) Simplifying the task of writing a new LLVM backend.  This is Emscripten's use case.  The IR simplification passes reduce the number of cases a backend has to handle, so they would be useful for anyone else creating a new backend.</div>


</div></blockquote><div><br></div></div><div>If these simplify writing a backend, why wouldn't the patches include commensurate simplifications to LLVM's backends? That would both give them an in-tree customer, and more immediate value to the community and project as a whole.</div>


</div></div></div></blockquote><div><br></div><div>That's a good question.  I'll have to have a look around in the LLVM backend code and see what parts could be replaced by one of PNaCl's simplification passes.</div>

<div><br></div>

<div>One answer is that, in some cases, such as calling conventions and global constructor arrays, LLVM's backend is constrained to follow the ABIs for particular OSes and architectures.  Compatibility makes complexity harder to remove.  I'll elaborate more below.  This only applies to a few of PNaCl's IR passes though.</div>


<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">


<div>

<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">

<div> 2) Using a subset of LLVM IR as a stable distribution format for portable executables.  This is PNaCl's use case.  PNaCl's IR subset omits various complex IR features, which we lower using the IR simplification passes [2].  Renderscript is an example of another project that uses IR as a stable distribution format, though I think currently Renderscript is not subsetting IR much.<br>

</div>


</div></blockquote><div><br></div></div><div>Given that the bitcode is stable, I don't understand why this is important.</div></div></div></div></blockquote><div><br></div><div>Is the bitcode format stable now?  I heard talk that LLVM is trying to do this now, but I don't remember seeing an llvmdev thread stating that for sure.  Was there a thread about it that I missed?  I just remember hearing complaints last year that the format was still getting changed. :-)</div>

<div><br></div>

<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">


<div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div> * Calling conventions lowering:  ExpandVarArgs and ExpandByVal lower varargs and by-value argument passing respectively.  They would be useful for any backend that doesn't want to implement varargs or by-value calling conventions.<br>

</div>


</div></blockquote><div><br></div></div><div>Why wouldn't these be applicable to existing backends? What is hard about the existing representations?</div></div></div></div></blockquote><div><br></div><div>For the calling conventions lowering passes, you wouldn't want to use them in backends that have to match some existing architecture-specific ABI for calling conventions.  For example, if you use ExpandVarArgs on x86, your .o file won't be able to successfully call the printf() function provided by libc.so, because the varargs calling conventions won't match.</div>


<div><br></div><div>But for many targets that is not an issue, either because:</div><div><br></div><div> * there is no existing architecture-specific ABI that LLVM must match, or</div><div> * you're using static linking, or can make similar "closed world" assumptions, so that a module can use any calling conventions as long as they're used consistently within the module.</div>


<div><br></div><div>Both of these are true for PNaCl and Emscripten.</div><div><br></div><div>My suspicion is that one or both of these conditions will be true for other novel backends, such as for specialised architectures like GPUs.</div>

<div>

<br></div><div>Aside from PNaCl and Emscripten, I am less familiar with other novel backends.  So one of the things I had hoped to learn from this discussion was whether other backends would find these passes useful.  So far we've had some people say that yes, they would.</div>
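<div><br></div><div>To make the ABI point about varargs more concrete, here is a minimal Python sketch of one plausible way a varargs call can be lowered to a fixed-arity call.  It assumes the variable arguments are packed into a caller-owned buffer that is passed as one extra, ordinary argument.  The function names are hypothetical, and this is a model of the idea rather than the actual output of ExpandVarArgs.</div>
<pre>
```python
# Hypothetical model of lowering a varargs call into a fixed-arity call.
# The variable arguments are packed into a caller-owned buffer, and the
# callee receives that buffer as one extra, ordinary argument.

def sum_ints_lowered(count, arg_buffer):
    """Callee after lowering: reads its 'varargs' out of the buffer."""
    total = 0
    cursor = 0  # plays the role of a va_list: an index into the buffer
    for _ in range(count):
        total += arg_buffer[cursor]  # plays the role of va_arg
        cursor += 1
    return total


def call_lowered(callee, fixed_arg, variable_args):
    """Caller after lowering: packs the variable arguments, then calls."""
    arg_buffer = list(variable_args)  # stands in for a stack allocation
    return callee(fixed_arg, arg_buffer)


result = call_lowered(sum_ints_lowered, 3, [10, 20, 30])
```
</pre>
<div>A callee lowered this way expects a buffer, whereas a printf() built for the platform's native ABI expects its arguments wherever that ABI puts them, which is why the two can't be mixed across a module boundary.</div>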


<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">


<div class="gmail_quote"><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">


<div dir="ltr">

<div> * Instruction-level lowering:<br></div><div>    * ExpandStructRegs splits up struct values into scalars, removing the "insertvalue" and "extractvalue" instructions.</div></div></blockquote>


<div><br></div></div><div>There are already passes that do this outside of function arguments and return values. Why is a new one needed?</div></div></div></div></blockquote><div><br></div><div>Are you referring to the work that SelectionDAGBuilder.cpp does to convert insertvalue/extractvalue to a SelectionDAG?  I don't think there's an IR-to-IR pass in LLVM for doing this, is there?</div>

<div><br></div><div>The reason PNaCl needs an IR-to-IR pass is that PNaCl's stable IR omits insertvalue/extractvalue, in order to keep the format simple and reduce the set of constructs that a PNaCl translator implementation needs to handle.  The reason Emscripten's fastcomp uses ExpandStructRegs is to keep Emscripten's backend simple, in the context that it doesn't use lib/CodeGen.</div>

<div><br></div><div>And the reason we have to handle insertvalue/extractvalue at all is largely that Clang outputs them for uses of C++ method pointers.  Otherwise, structs-as-registers aren't really used.  At least, that was the case in LLVM 3.3 -- maybe some more uses have appeared since then.</div>
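<div><br></div><div>As a rough illustration of what splitting struct values into scalars means, here is a small Python sketch over a made-up tuple-based instruction list.  The instruction encoding and the function name are assumptions for this example only, it handles straight-line code only (no phi nodes, loads, stores, or struct-returning calls), and it is not the ExpandStructRegs implementation.</div>
<pre>
```python
# Toy scalar replacement of struct values in straight-line code.
# Instructions are tuples in a made-up encoding:
#   ("insertvalue", dest, src_struct, scalar, index)
#   ("extractvalue", dest, src_struct, index)
# Any other tuple is passed through with its operands renamed.

def expand_struct_regs(instructions):
    fields = {"undef": {}}  # struct name -> {field index: scalar name}
    alias = {}              # extractvalue result -> underlying scalar name
    output = []
    for inst in instructions:
        op = inst[0]
        if op == "insertvalue":
            _, dest, src, scalar, index = inst
            new_fields = dict(fields[src])           # copy: SSA values are immutable
            new_fields[index] = alias.get(scalar, scalar)
            fields[dest] = new_fields
        elif op == "extractvalue":
            _, dest, src, index = inst
            alias[dest] = fields[src][index]         # forward the scalar directly
        else:
            output.append(tuple(alias.get(x, x) for x in inst))
    return output


# A method-pointer-like pair {ptr, adj} built up and then taken apart again.
program = [
    ("insertvalue", "%s1", "undef", "%ptr", 0),
    ("insertvalue", "%s2", "%s1", "%adj", 1),
    ("extractvalue", "%p", "%s2", 0),
    ("extractvalue", "%a", "%s2", 1),
    ("call", "%r", "@f", "%p", "%a"),
]
lowered = expand_struct_regs(program)
```
</pre>
<div>After the rewrite, only the call remains, and it refers to the original scalars, so a consumer of this output never sees insertvalue or extractvalue.</div>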

<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>How do you handle the overflow-detecting operations?</div></div></div></div></blockquote><div><br></div><div>PNaCl has the ExpandArithWithOverflow pass, which lowers uses of llvm.*.with.overflow.*.</div>
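<div><br></div><div>For readers unfamiliar with these intrinsics, the following Python sketch shows the kind of plain arithmetic an overflow-detecting unsigned add can be reduced to: a wrapping add followed by one unsigned comparison.  The function name and the bit-width parameter are hypothetical, and this illustrates the semantics only; it is not a description of how ExpandArithWithOverflow is implemented.</div>
<pre>
```python
# Model of an unsigned add-with-overflow expressed as a wrapping add
# plus one unsigned comparison, for operands already in range.

def uadd_with_overflow(a, b, bits=32):
    modulus = 2 ** bits
    result = (a + b) % modulus  # ordinary wrapping add
    overflowed = a > result     # the sum wrapped iff it ended up below a
    return result, overflowed


small = uadd_with_overflow(1, 2)
wrapped = uadd_with_overflow(2 ** 32 - 1, 1)
```
</pre>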


<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">


<div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div dir="ltr"><div>    * PromoteIntegers legalizes integer types (e.g. i30 is converted to i32).</div></div></blockquote><div><br></div></div><div>Does it split up too-wide integers?</div></div></div></div></blockquote>


<div><br></div><div>PNaCl's version currently doesn't.  Emscripten's fastcomp has a version which splits up 64-bit integer operations into 32-bit operations, which they need because JavaScript doesn't support 64-bit integer arithmetic.</div>


<div><br></div><div>PNaCl's version didn't need to do that because we were happy to support 64-bit arithmetic in PNaCl's stable ABI.  However, we did find that unusual C bitfields caused Clang to generate integer types wider than 64 bits (which we don't support in PNaCl's stable ABI), so we started implementing a pass to split those up.  We should probably sync up with Emscripten and reuse their code for that.</div>
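<div><br></div><div>To show what splitting a wide integer operation involves, here is a minimal Python sketch of a 64-bit add carried out as two 32-bit adds with an explicit carry.  The helper names are hypothetical, and this is only a model of the technique, not code from Emscripten's or PNaCl's passes.</div>
<pre>
```python
# Model of a 64-bit add performed with 32-bit halves and a carry.
WORD = 2 ** 32


def split64(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x % WORD, (x // WORD) % WORD


def join64(lo, hi):
    """Recombine (low, high) 32-bit halves into one 64-bit value."""
    return hi * WORD + lo


def add64_via_32(a_lo, a_hi, b_lo, b_hi):
    lo = (a_lo + b_lo) % WORD          # 32-bit wrapping add of the low halves
    carry = 1 if a_lo > lo else 0      # the low add wrapped iff lo ended up below a_lo
    hi = (a_hi + b_hi + carry) % WORD  # 32-bit wrapping add of the high halves
    return lo, hi


total = join64(*add64_via_32(*split64(0x1FFFFFFFF), *split64(1)))
```
</pre>
<div>Other operations (multiplication, shifts, comparisons) need their own expansions, which is part of why sharing one implementation between projects looks attractive.</div>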

<div><br></div>

<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">


<div>Do we really want another integer legalization framework in LLVM?</div></div></div></div></blockquote><div><br></div><div>At the risk of not answering your question directly, LLVM already has two instruction selectors, SelectionDAG and FastISel.  So another question might be, when is it OK to have multiple implementations that perform similar tasks using different approaches, and when is it not OK?  What are the trade-offs involved here?</div>

<div><br></div>

<div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">


<div>I am actually interested in doing (partial) legalization in the IR during lowering (codegenprep time) in order to simplify the backend, but I don't think we should develop such a framework independently of the legalization currently used in the backends.</div>


<div>

<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">

<div><br></div><div> * Module-level lowering:  This implements, at the IR level, functionality that is traditionally provided by "ld".  e.g. ExpandCtors lowers llvm.global_ctors to the __init_array_start and __init_array_end symbols that are used by C libraries at startup.</div>


</div></blockquote><div><br></div></div><div>This doesn't make any sense to me. The IR representation is strictly simpler. It is trivially lowered in a backend. I don't understand what this would benefit.</div></div>


</div></div></blockquote><div><br></div><div>To elaborate:  In PNaCl, pexes are statically linked modules in which running global constructors is handled by user code inside the pexe.  The special llvm.global_ctors array isn't part of PNaCl's stable subset of IR, because there's no need for it to be.  Running constructors is done in normal IR by the pexe's entry point, without constructors needing to be handled specially by PNaCl's IR format.</div>


<div><br></div><div>LLVM's global_ctors construct is incomplete:  it provides a mechanism, at the IR level, to declare functions to be run at startup, but it assumes that running these functions will be done by a runtime library.  At the IR level, LLVM doesn't provide a way to implement a runtime library that can read that constructor list.  ld linker scripts provide a way to do that -- e.g. on Linux, see /usr/lib/ldscripts/elf_i386.x, which defines __init_array_{start,end} -- but that's not at the IR level.</div>


<div><br></div><div>ExpandCtors just provides a mechanism for a runtime library to list the constructor functions, purely at the IR level, without constructors having to be a special feature in the PNaCl ABI or in the Emscripten backend.</div>
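<div><br></div><div>The division of labour can be sketched in Python as follows.  The model treats llvm.global_ctors as a list of (priority, function) pairs and turns it into a flat array that ordinary startup code iterates over, standing in for the range between __init_array_start and __init_array_end.  The function names are hypothetical, and the ascending-priority ordering follows what the LLVM language reference says for llvm.global_ctors; that the pass orders entries exactly this way is an assumption of this sketch.</div>
<pre>
```python
# Model of lowering a constructor list to a flat array that user-level
# startup code can walk, with no special support from the runtime.

def expand_ctors(global_ctors):
    """global_ctors: list of (priority, function) pairs.

    Returns the functions as a flat list, lowest priority number first.
    sorted() is stable, so equal priorities keep their original order.
    """
    ordered = sorted(global_ctors, key=lambda entry: entry[0])
    return [function for _, function in ordered]


def run_init_array(init_array):
    """What the entry point does at startup: call each constructor in turn."""
    for constructor in init_array:
        constructor()


log = []
ctors = [
    (65535, lambda: log.append("default")),
    (101, lambda: log.append("early")),
]
run_init_array(expand_ctors(ctors))
```
</pre>
<div>The point is that run_init_array is ordinary code: once the list is a plain array, nothing downstream needs to know that constructors are special.</div>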


<div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">

<div class="gmail_quote">

<div>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div>There seems to be plenty of precedent for IR-to-IR lowering passes -- LLVM already contains passes such as LowerInvoke, LowerSwitch and LowerAtomic.<br>

</div>


</div></blockquote><div><br></div></div><div>Note that these are quite different -- they lower from a front-end convenient form toward the canonical IR form.</div></div></div></div></blockquote><div><br></div><div>Those three passes don't lower towards canonical IR form -- unless we are taking "canonical IR form" to mean quite different things?<br>


</div><div><br></div><div>LowerInvoke and LowerAtomic both strip out information irreversibly.</div><div><br></div><div>LowerAtomic "lowers atomic intrinsics to non-atomic form for use in a known non-preemptible environment".  LowerInvoke strips out exception handling by converting invokes to calls, so that landingpads, resumes, etc. become dead and can be removed by a later pass.<br>


</div><div><br></div><div>(As an aside, LowerInvoke has an option for using SJLJ exception handling, but that option appears to be unused, having been superseded by lib/CodeGen/SjLjEHPrepare.cpp.)</div><div><br></div><div>LowerSwitch "rewrites switch instructions with a sequence of branches, which allows targets to get away with not implementing the switch instruction until it is convenient".<br>


</div><div><br></div><div>These three are very similar in function to PNaCl's IR simplification passes, since they reduce the set of language features that must be supported by a backend or by a stable IR format.</div>

<div>

<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">


<div class="gmail_quote"><div>You are talking about something totally different that deals with target-oriented lowering. The correct place to look for analogies is CodeGenPrep.</div></div></div></div></blockquote><div><br>


</div><div>CodeGenPrepare.cpp just contains optimisations, doesn't it?  It doesn't lower any language feature in a way that removes the feature from the module, so it doesn't seem analogous to PNaCl's IR simplification passes, which do exactly that.  For example, LowerAtomic strips out atomicrmw entirely, so that anything processing LowerAtomic's output doesn't have to handle atomicrmw at all.  Similarly, ExpandByVal expands out "byval" entirely.</div>


<div><br></div><div>If you're looking for backend IR-to-IR passes which lower language features, DwarfEHPrepare and SjLjEHPrepare are analogous to PNaCl's passes.  DwarfEHPrepare only lowers resume instructions, while SjLjEHPrepare handles more.</div>


<div><br></div><div>Cheers,</div><div>Mark</div><div><br></div></div></div></div>