<div dir="ltr">Hi everyone,<div><br></div><div>I'm currently working on a new function merging optimization that is more general than the current LLVM function merging optimization that works only on identical functions.</div><div><br></div><div>I would like to know if the community has any interest in having a more powerful function merging optimization.</div><div><br></div><div>---- More Details ----</div><div><br></div><div>Up until now, I have been focusing on the quality of the code reduction.</div><div><br></div><div>Some preliminary result on SPEC'06 in a full LTO fashion:</div><div>The baseline has no function merge but the optimization pipeline is the same, and I am comparing my function merge with LLVM's identical function merge, where everything else in the optimization pipeline is the same as the baseline.</div><div>Average reduction in the final exectuable file over the baseline: 5.55% compared to 0.49% of the identical merge.</div><div><div><span class="inbox-inbox-gr_ inbox-inbox-gr_366 inbox-inbox-gr-alert inbox-inbox-gr_gramm inbox-inbox-gr_inline_cards inbox-inbox-gr_run_anim inbox-inbox-Grammar inbox-inbox-only-ins inbox-inbox-doubleReplace inbox-inbox-replaceWithoutSep" id="inbox-inbox-366" style="display:inline;border-bottom:2px solid transparent;background-repeat:no-repeat">Average</span> reduction in the total number of instructions over the baseline: 7.04% compared to 0.47% of the identical merge.</div></div><div><br></div><div>The highest reduction on the executable file is of about 20% (both 429.mcf and 447.dealII) and the highest reduction on the total number of instructions is of about 37% (447.dealII).</div><div><br></div><div>It has an average slowdown of about 1%, but having no statistical difference from the baseline in most of the benchmarks in the SPEC'06 suite.<br></div><div><br></div><div><br></div><div>Because this new function merging technique is able to merge any pair of functions, except for a few restrictions, the exploration strategy is critical for having an acceptable compilation time.</div><div><br></div><div>At the moment I'm starting to focus more on simplifying the optimization and reducing the overhead in compilation time.</div><div>My optimization has an exploration threshold which can be tuned to trade-off compilation time overhead for a more aggressive merge.</div><div>It does *not* perform n^2 merge operations. Instead, I have a ranking strategy based on a similarity metric computed from the function's "fingerprint".</div><div>The threshold limits the exploration to focus on the top functions of the rank.</div><div>The idea is to make the ranking mechanism as lightweight as possible.</div><div><br></div><div>Cheers,</div><div><br></div><div>Rodrigo Rocha</div></div>