<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
</head>
<body>
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
<div id="divtagdefaultwrapper" style="font-size:11pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;" dir="ltr">
<p>Hi River,</p>
<p><br>
</p>
<p>Do you know LLVM has the MachineOutliner pass (<span>lib/CodeGen/MachineOutliner.cpp</span>)? </p>
<p>It would be interesting to compare your approach with it.</p>
<p><br>
</p>
<p>Thanks,</p>
<p>Evgeny Astigeevich</p>
</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> llvm-dev <llvm-dev-bounces@lists.llvm.org> on behalf of River Riddle via llvm-dev <llvm-dev@lists.llvm.org><br>
<b>Sent:</b> Thursday, July 20, 2017 11:47:57 PM<br>
<b>To:</b> llvm-dev@lists.llvm.org<br>
<b>Subject:</b> [llvm-dev] [RFC] Add IR level interprocedural outliner for code size.</font>
<div> </div>
</div>
<div>
<div dir="ltr"><span id="gmail-docs-internal-guid-128cceb0-622c-6ca0-9998-bc07500bb3bc">
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap"> I’m River and I’m a compiler engineer at PlayStation. Recently, I’ve been working on an interprocedural
outlining (code folding) pass for code size improvement at the IR level. We hit a couple of use cases that the current code size solutions didn’t handle well enough. Outlining is one of the avenues that seemed potentially beneficial.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">-- Algorithmic Approach --</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The general implementation can be explained in stages:</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Candidate Selection:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Each instruction in the module is mapped to a congruence class based upon relaxed equivalency
constraints. For most instructions this is simply: The type, opcode, and operand types. The constraints are tightened for instructions with special state that require more exact equivalence (e.g. ShuffleVector requires a constant mask vector). Candidates are
then found by constructing a suffix array and lcp(longest common prefix) array. Walking the lcp array gives us congruent chains of instructions within the module.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Candidate Analysis:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">A verification step splits candidates that have different internal input sequences or incompatible
parent function attributes between occurrences. An example of incompatible internal inputs sequences is:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">X = W + 6; vs X = W + 6;</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Y = X + 4; Y = W + 4;</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The above two occurrences would need special control flow to exist within the same outlined function.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">During analysis candidates have their inputs and outputs computed along with an estimated benefit
from extraction. During input calculation a constant folding step removes all inputs that are the same amongst all occurrences.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Candidate Pruning:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Overlapping candidates are pruned with a generic greedy algorithm that picks occurrences starting
from the most beneficial candidates.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Candidate Outlining:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Non pruned candidates are then outlined. Outputs from a candidate are returned via an output parameter,
except for one output that is promoted to a return value. During outlining the inputs into the candidate are condensed by computing the equivalencies between the arguments at each occurrence. An example of this is:</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">outlinedFn(1,6,1); -> outlinedFn(1,6);</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">outlinedFn(2,4,2); -> outlinedFn(2,4);</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">In the above, parameters 1 and 3 were found to be equivalent for all occurrences, thus the amount
of inputs was reduced to 2.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Debug Info:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">Debug information is preserved for the calls to functions which have been outlined but all debug
info from the original outlined portions is removed, making them harder to debug.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Profile Info:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">If the pass is running at Os the outliner will only consider cold blocks, whereas Oz considers
all blocks that are not marked as hot.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Location in Pipeline:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The pass is currently configured to run very late in the optimization pipeline. It is intended
to run at Oz but will also run at Os if there is profile data available. The pass can optionally be run twice, once before function simplification and then again at the default location. This run is optional because you are gambling the potential benefits
of redundancy elimination vs the potential benefits from function simplification. This can lead to large benefits or regressions depending on the benchmark (though the benefits tend to outnumber the regressions). The performance section contains data for both
on a variety of benchmarks.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">-- Why at the IR level --</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The decision to place the outliner at the IR level comes from a couple of major factors:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">- Desire to be target independent</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">- Most opportunities for congruency</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">The downside to having this type of transformation be at the IR level is it means there will be
less accuracy in the cost model - we can somewhat accurately model the cost per instruction but we can’t get information on how a window of instructions may lower. This can cause regressions depending on the platform/codebase, therefore to help alleviate
this there are several tunable parameters for the cost model.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">-- Performance --</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">More results including clang, llvm-tblgen, and more specific numbers
about benefits/regressions can be found in the notes section below.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Size Reduction:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">- Test Suite(X86_64):</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Early+Late outlining provides a geomean of 10.5% reduction over
clang Oz, with a largest improvement of ~67% and largest regression of ~7.5%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Late outlining provides a geomean of 4.65% reduction, with a
largest improvement of ~51% and largest regression of ~6.4%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">- Spec 2006(X86_64)</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Early+Late outlining provides a geomean reduction of 2.08%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Late outlining provides 2.09%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">- CSiBE(AArch64)</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Early+Late outlining shows geomean reduction of around 3.5%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Late outlining shows 3.1%.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Compile Time:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">Compile time was tested under test-suite with a multisample of 5.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">- Early+Late outlining</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Many improvements with > 40% compile time reduction.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Few regressions.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">- Late outlining</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Greatest improvement is ~7.8%</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Greatest regression is ~4% with a difference of <0.02s</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">Our explanation for the compile time reduction during early outlining
is that due to the amount of redundancy reduction early in the optimization pipeline there is a reduction in the amount of instruction processing during the rest of the compilation.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Execution Time:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">Ran with a multisample of 5.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">- Test Suite:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Early+Late outlining has many regressions up to 97%. The greatest
improvement was around 7.5%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Late outlining also has several regressions up to 44% and a
greatest improvement of around 5.3%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">- Spec:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Early+Late has a geomean regression of 3.5%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - Late outlining has a geomean regression of 1.25%.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">The execution time results are to be expected given that the outliner,
without profile data, will extract from whatever region it deems profitable. Extracting from the hot path can lead to a noticeable performance regression on any platform, which can be somewhat avoided by providing profile data during outlining.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">-- Tested Improvements --</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* MergeFunctions:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - No noticeable benefit.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* LTO:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - LTO doesn’t have a code size pipeline, but %reductions over
LTO are comparable to non LTO.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Input/Output Partitioning:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> -This identifies inputs/outputs that may be folded by splitting
a candidate. The benefit is minimal for the computations required.</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Similar Candidate Merging:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap"> - The benefit to be gained is currently not worth the large complexity
required to catch the desired cases.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">-- Potential Improvements --</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Suffix&LCP array construction: The pass currently uses a very basic
implementation that could be improved. There are definitely faster algorithms and some can construct both simultaneously. We will investigate this as a potential benefit for compile time in the future.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Os performance tuning: Under -Os the pass currently only runs on
cold blocks. Ideally we could expand this to be a little more lax on less frequently executed blocks that aren’t cold.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Candidate Selection: The algorithm currently focuses on the longest
common sequences. More checks could be added to see if shortening the sequence length produces a larger benefit(e.g less inputs/outputs). This would likely lead to an increase in compile time but should remain less than the baseline.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* Non Exact Functions: The outliner currently ignores functions that
do not have an exact definition.</span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">-- --</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* CSiBE(Code Size Benchmark):</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><a href="http://www.csibe.org/" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">www.csibe.org</span></a></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;background-color:transparent;vertical-align:baseline;white-space:pre-wrap">* More detailed performance data:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><a href="http://goo.gl/5k6wsP" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;background-color:transparent;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">goo.gl/5k6wsP</span></a></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap">* Implementation:</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="text-decoration-line:underline;font-size:11pt;font-family:Arial;vertical-align:baseline;white-space:pre-wrap"><a href="https://github.com/River707/llvm/blob/outliner/lib/Transforms/IPO/CodeSizeOutliner.cpp" style="text-decoration-line:none">https://github.com/River707/llvm/blob/outliner/lib/Transforms/IPO/CodeSizeOutliner.cpp</a></span></p>
<div><br>
</div>
</span></div>
</div>
</body>
</html>