<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jul 24, 2017 at 9:25 PM, River Riddle <span dir="ltr"><<a href="mailto:riddleriver@gmail.com" target="_blank">riddleriver@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hey Sean, </div><div>  The bit about the attributes is good to know. When LTO is enabled the early run will still run per-TU but the late run will be shifted to work on the full LTO bitcode. Also I don't have specific numbers on how often parameterization is utilized but I can assure you that it's a majority of the time.</div></div></blockquote><div><br></div><div>Interesting. It would be good to have some specific data on this (is "majority" actually 65% or 99%? How does it vary across benchmarks?), because this is something that can't be done in the current post-RA machine outliner (even in principle).</div><div><br></div><div>A pre-RA MIR-level outliner would have similar issues to your LLVM IR outliner w.r.t. estimating instruction sizes.</div><div>For example, pre-RA you might see a virtual register but it might actually end up spilled so there will be spill/fill code size that is unaccounted for (in fact, I wouldn't be surprised if this inaccuracy of the cost model was actually greater than the inaccuracy from assuming a fixed cost per instruction either at pre-RA MIR level or LLVM IR level; I don't have data supporting one way or the other though).</div><div>Also, outlining pre-RA inherently constrains the register allocator, so there will be indirect effects on code size. 
It's not clear that working at the MIR level pre-RA would avoid this any better than marking LLVM IR-level outlined functions with an appropriate calling convention (and maybe a target hook or two).</div><div><br></div><div>-- Sean Silva </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div> I'll look into transforming the data into a visual format since it's in Google Docs anyway.</div><div>Thanks,</div><div> River Riddle<div><div class="h5"><br><div class="gmail_quote"><div>On Mon, Jul 24, 2017 at 9:17 PM Sean Silva <<a href="mailto:chisophugis@gmail.com" target="_blank">chisophugis@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="gmail_extra"><div class="gmail_quote">On Thu, Jul 20, 2017 at 3:47 PM, River Riddle via llvm-dev <span><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><span id="m_-8416307079047273452m_-1663956593680671482m_-5274560702495700611gmail-m_1023322599362061652gmail-docs-internal-guid-128cceb0-622c-6ca0-9998-bc07500bb3bc"><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">  I’m River and I’m a compiler engineer at PlayStation. Recently, I’ve been working on an interprocedural outlining (code folding) pass for code size improvement at the IR level. We hit a couple of use cases that the current code size solutions didn’t handle well enough. 
Outlining is one of the avenues that seemed potentially beneficial.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">-- Algorithmic Approach --</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The general implementation can be explained in stages:</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Candidate Selection:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Each instruction in the module is mapped to a congruence class based upon relaxed equivalency constraints. For most instructions this is simply the type, opcode, and operand types. The constraints are tightened for instructions with special state that require more exact equivalence (e.g. ShuffleVector requires a constant mask vector). Candidates are then found by constructing a suffix array and an LCP (longest common prefix) array. Walking the LCP array yields congruent chains of instructions within the module.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Candidate Analysis:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">A verification step splits candidates that have different internal input sequences or incompatible parent function attributes between occurrences. 
An example of incompatible internal input sequences is:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">X = W + 6;   vs    X = W + 6;</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Y = X + 4;            Y = W + 4;</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The above two occurrences would need special control flow to exist within the same outlined function.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">During analysis, candidates have their inputs and outputs computed along with an estimated benefit from extraction. During input calculation, a constant-folding step removes all inputs that are the same amongst all occurrences.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Candidate Pruning:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Overlapping candidates are pruned with a generic greedy algorithm that picks occurrences starting from the most beneficial candidates.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Candidate Outlining:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Non-pruned candidates are then outlined. 
Outputs from a candidate are returned via an output parameter, except for one output that is promoted to a return value. During outlining, the inputs into the candidate are condensed by computing the equivalencies between the arguments at each occurrence. An example of this is:</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">outlinedFn(1,6,1);  ->  outlinedFn(1,6);</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">outlinedFn(2,4,2);  ->  outlinedFn(2,4);</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">In the above, parameters 1 and 3 were found to be equivalent for all occurrences, thus the number of inputs was reduced to 2.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Debug Info:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Debug information is preserved for the calls to functions which have been outlined, but all debug info from the original outlined portions is removed, making them harder to debug.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Profile Info:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">If the pass is running at Os, the outliner will only consider cold blocks, whereas Oz considers all blocks that are not marked as 
hot.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Location in Pipeline:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The pass is currently configured to run very late in the optimization pipeline. It is intended to run at Oz but will also run at Os if there is profile data available. The pass can optionally be run twice, once before function simplification and then again at the default location. This run is optional because you are gambling the potential benefits of redundancy elimination vs the potential benefits from function simplification. This can lead to large benefits or regressions depending on the benchmark (though the benefits tend to outnumber the regressions). The performance section contains data for both on a variety of benchmarks.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">-- Why at the IR level --</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The decision to place the outliner at the IR level comes from a couple of major factors:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap"> - Desire to be target independent</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap"> - Most opportunities for congruency</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The downside to having this type of transformation be at the IR level is it means there will be less accuracy in the cost model -  we can somewhat accurately model the cost per instruction but we can’t get information on how a window of instructions may lower. This can cause regressions depending on the platform/codebase, therefore to help alleviate this there are several tunable parameters for the cost model.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">-- Performance --</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">More results including clang, llvm-tblgen, and more specific numbers about benefits/regressions can be found in the notes section below.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Size Reduction:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Test Suite(X86_64):</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Early+Late outlining provides a geomean of 10.5% reduction over clang Oz, with a largest improvement of ~67% and largest regression of ~7.5%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Late outlining provides a geomean of 4.65% 
reduction, with a largest improvement of ~51% and largest regression of ~6.4%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Spec 2006(X86_64)</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Early+Late outlining provides a geomean reduction of 2.08%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Late outlining provides 2.09%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - CSiBE(AArch64)</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Early+Late outlining shows geomean reduction of around 3.5%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Late outlining shows 3.1%.</span></p></span></div></blockquote><div><br></div><div><br></div></div></div></div><div><div class="gmail_extra"><div class="gmail_quote"><div>It would be good to visualize these results. 
Maybe a bar chart like <a href="https://goo.gl/qN2HqA" target="_blank">https://goo.gl/qN2HqA</a> from <a href="http://blog.llvm.org/2016/06/thinlto-scalable-and-incremental-lto.html" target="_blank">http://blog.llvm.org/2016<wbr>/06/thinlto-scalable-and-incre<wbr>mental-lto.html</a> for SPEC?</div></div></div></div><div><div class="gmail_extra"><div class="gmail_quote"><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><span id="m_-8416307079047273452m_-1663956593680671482m_-5274560702495700611gmail-m_1023322599362061652gmail-docs-internal-guid-128cceb0-622c-6ca0-9998-bc07500bb3bc"><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Compile Time:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">Compile time was tested under test-suite with a multisample of 5.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Early+Late outlining</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Many improvements with > 40% compile time reduction.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Few regressions.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Late outlining</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Greatest improvement is ~7.8%</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Greatest regression is ~4% with a difference of <0.02s</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">Our explanation for the compile time reduction during early outlining is that due to the amount of redundancy reduction early in the optimization pipeline there is a reduction in the amount of instruction processing during the rest of the compilation.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Execution Time:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">Ran with a multisample of 5.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Test Suite:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Early+Late outlining has many regressions up to 97%. 
The greatest improvement was around 7.5%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Late outlining also has several regressions up to 44% and a greatest improvement of around 5.3%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Spec:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Early+Late has a geomean regression of 3.5%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - Late outlining has a geomean regression of 1.25%.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">The execution time results are to be expected given that the outliner, without profile data, will extract from whatever region it deems profitable. 
Extracting from the hot path can lead to a noticeable performance regression on any platform, which can be somewhat avoided by providing profile data during outlining.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">-- Tested Improvements --</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* MergeFunctions:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - No noticeable benefit.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* LTO:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - LTO doesn’t have a code size pipeline, but %reductions over LTO are comparable to non LTO.</span></p></span></div></blockquote><div><br></div></div></div></div><div><div class="gmail_extra"><div class="gmail_quote"><div>-Os/-Oz are communicated through the optsize and minsize attributes. There isn't a specific code size pipeline per se (I think this is true for per-TU compilation as well, though I would have to check).</div><div><br></div><div>Also, can you clarify what you mean by "LTO"? 
I assume this means that the outliner did not run during per-TU compilation but did run on the combined FullLTO bitcode, but I want to check to be sure.</div></div></div></div><div><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>-- Sean Silva</div><div> </div></div></div></div><div><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><span id="m_-8416307079047273452m_-1663956593680671482m_-5274560702495700611gmail-m_1023322599362061652gmail-docs-internal-guid-128cceb0-622c-6ca0-9998-bc07500bb3bc"><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Input/Output Partitioning:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - This identifies inputs/outputs that may be folded by splitting a candidate. 
The benefit is minimal for the computations required.</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Similar Candidate Merging:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">    - The benefit to be gained is currently not worth the large complexity required to catch the desired cases.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">-- Potential Improvements --</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Suffix and LCP array construction: The pass currently uses a very basic implementation that could be improved. Faster algorithms exist, and some construct both arrays simultaneously. We will investigate this as a potential benefit for compile time in the future.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Os performance tuning: Under -Os, the pass currently only runs on cold blocks. Ideally, we could expand this to be a little more lax on less frequently executed blocks that aren’t cold.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Candidate Selection: The algorithm currently focuses on the longest common sequences. 
More checks could be added to see if shortening the sequence length produces a larger benefit (e.g. fewer inputs/outputs). This would likely lead to an increase in compile time but should remain less than the baseline.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Non Exact Functions: The outliner currently ignores functions that do not have an exact definition.</span></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">-- Notes --</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* CSiBE (Code Size Benchmark):</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><a href="http://www.csibe.org/" style="text-decoration-line:none" target="_blank"><span style="font-size:11pt;font-family:Arial;background-color:transparent;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">www.csibe.org</span></a></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* More detailed performance data:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><a href="http://goo.gl/5k6wsP" style="text-decoration-line:none" target="_blank"><span style="font-size:11pt;font-family:Arial;background-color:transparent;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">goo.gl/5k6wsP</span></a></p><br><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* 
Implementation:</span></p><p style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="text-decoration-line:underline;font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap"><a href="https://github.com/River707/llvm/blob/outliner/lib/Transforms/IPO/CodeSizeOutliner.cpp" style="text-decoration-line:none" target="_blank">https://github.com/River707/ll<wbr>vm/blob/outliner/lib/Transform<wbr>s/IPO/CodeSizeOutliner.cpp</a></span></p><div><br></div></span></div>
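<div><br></div>
<div>To make the Candidate Selection stage described earlier in this message more concrete, here is a minimal Python sketch of the idea, not the code from the linked implementation. It assumes each instruction is modelled as a hypothetical (result type, opcode, operand types) tuple, uses a deliberately naive suffix array and LCP construction, and returns only the single longest chain that occurs twice. The names congruence_key, common_prefix_len and longest_congruent_chain are made up for this illustration, and the tightened constraints for instructions such as ShuffleVector are not modelled.</div>
<pre>
```python
def congruence_key(inst):
    """Relaxed equivalence: result type, opcode, and operand types only."""
    result_type, opcode, operand_types = inst
    return (result_type, opcode, tuple(operand_types))


def common_prefix_len(a, b):
    """Length of the longest common prefix of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_congruent_chain(insts):
    """Return (length, [start, start]) for the longest chain occurring twice.

    Overlap between the two occurrences is not checked here; the proposal
    handles overlaps in its separate pruning stage.
    """
    # Give every congruence class a small integer id.
    class_ids = {}
    seq = [class_ids.setdefault(congruence_key(i), len(class_ids)) for i in insts]

    # Naive suffix array: start positions ordered by the suffix they begin.
    suffixes = sorted(range(len(seq)), key=lambda i: seq[i:])

    # lcp[k] is the common prefix length of the suffixes ranked k-1 and k.
    lcp = [0] + [
        common_prefix_len(seq[a:], seq[b:])
        for a, b in zip(suffixes, suffixes[1:])
    ]

    # The largest LCP entry marks the longest chain that occurs twice.
    k = max(range(len(lcp)), key=lambda j: lcp[j])
    if lcp[k] == 0:
        return 0, []
    return lcp[k], sorted([suffixes[k - 1], suffixes[k]])


# Hypothetical instruction stream: add, mul, sub, add, mul (all on i32).
example = [
    ("i32", "add", ("i32", "i32")),
    ("i32", "mul", ("i32", "i32")),
    ("i32", "sub", ("i32", "i32")),
    ("i32", "add", ("i32", "i32")),
    ("i32", "mul", ("i32", "i32")),
]
print(longest_congruent_chain(example))  # (2, [0, 3])
```
</pre>
<div>In this example the add/mul pair at positions 0 and 3 is reported as a chain of length 2. A real pass would walk every LCP entry rather than only the maximum, and would then run the analysis and pruning stages on the resulting candidates.</div>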
<br></blockquote></div></div></div></blockquote></div></div></div></div></div>
</blockquote></div><br></div></div>