<div dir="ltr">Hi Hal,<div><br></div><div>Because my function merging strategy is able to merge any two function, allowing for different CFGs, different parameters, etc.</div><div>I am unable to use just a simple hash value to compare whether or not two functions are similar.</div><div><br>Therefore, the idea is to have an infrastructure which allows me to compare whether or not two functions are similar without having traverse the two function (basically performing a merge for all pairs).</div><div>I'm precomputing a fingerprint of all functions, which is then cached for later use (this might also be useful to enable this function merging with ThinLTO).</div><div>At the moment, this fingerprint is just a map of opcode -> number of occurrences in the function, which is just a ~64-int-array.</div><div><br></div><div>Then, for each functions being considered for a merge, I'm able to rank the candidates with a PriorityQueue.</div><div><br></div><div>Hopefully, we are able to do that in a very lightweight manner.</div><div><br></div><div>After that, the more expensive bit will be actually performing the merge and then checking for profitability, using the TTI for code-size.</div><div><br></div><div>I haven't given much thought about adapting this infrastructure for the SLP Vectorizer, but perhaps something similar could also work there.</div><div><br></div><div>Cheers,</div><div><br></div><div>Rodrigo Rocha<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, 2 Aug 2018 at 16:43 Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF">

    <br>

    <div class="m_-4787981321377026353moz-cite-prefix">On 08/02/2018 10:25 AM, Rodrigo Caetano

      Rocha via llvm-dev wrote:<br>

    </div>

    <blockquote type="cite">

      

      <div dir="ltr">Hi everyone,

        <div><br>

        </div>

        <div>I'm currently working on a new function merging

          optimization that is more general than the current LLVM

          function merging optimization that works only on identical

          functions.</div>

        <div><br>

        </div>

        <div>I would like to know if the community has any interest in

          having a more powerful function merging optimization.</div>

      </div>

    </blockquote>

    <br></div><div text="#000000" bgcolor="#FFFFFF">

    Yes, I think there is certainly interest in this space.</div><div text="#000000" bgcolor="#FFFFFF"><br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div><br>

        </div>

        <div>---- More Details ----</div>

        <div><br>

        </div>

        <div>Up until now, I have been focusing on the quality of the

          code reduction.</div>

        <div><br>

        </div>

        <div>Some preliminary result on SPEC'06 in a full LTO fashion:</div>

        <div>The baseline has no function merge but the optimization

          pipeline is the same, and I am comparing my function merge

          with LLVM's identical function merge, where everything else in

          the optimization pipeline is the same as the baseline.</div>

        <div>Average reduction in the final exectuable file over the

          baseline: 5.55% compared to 0.49% of the identical merge.</div>

        <div>

          <div><span class="m_-4787981321377026353inbox-inbox-gr_ m_-4787981321377026353inbox-inbox-gr_366 m_-4787981321377026353inbox-inbox-gr-alert m_-4787981321377026353inbox-inbox-gr_gramm m_-4787981321377026353inbox-inbox-gr_inline_cards m_-4787981321377026353inbox-inbox-gr_run_anim m_-4787981321377026353inbox-inbox-Grammar m_-4787981321377026353inbox-inbox-only-ins m_-4787981321377026353inbox-inbox-doubleReplace m_-4787981321377026353inbox-inbox-replaceWithoutSep" id="m_-4787981321377026353inbox-inbox-366" style="display:inline;border-bottom:2px solid transparent;background-repeat:no-repeat">Average</span> reduction

            in the total number of instructions over the baseline: 7.04%

            compared to 0.47% of the identical merge.</div>

        </div>

        <div><br>

        </div>

        <div>The highest reduction on the executable file is of about

          20% (both 429.mcf and 447.dealII) and the highest reduction on

          the total number of instructions is of about 37% (447.dealII).</div>

        <div><br>

        </div>

        <div>It has an average slowdown of about 1%, but having no

          statistical difference from the baseline in most of the

          benchmarks in the SPEC'06 suite.<br>

        </div>

        <div><br>

        </div>

        <div><br>

        </div>

        <div>Because this new function merging technique is able to

          merge any pair of functions, except for a few restrictions,

          the exploration strategy is critical for having an acceptable

          compilation time.</div>

        <div><br>

        </div>

        <div>At the moment I'm starting to focus more on simplifying the

          optimization and reducing the overhead in compilation time.</div>

        <div>My optimization has an exploration threshold which can be

          tuned to trade-off compilation time overhead for a more

          aggressive merge.</div>

        <div>It does *not* perform n^2 merge operations. Instead, I have

          a ranking strategy based on a similarity metric computed from

          the function's "fingerprint".</div>

      </div>

    </blockquote>

    <br></div><div text="#000000" bgcolor="#FFFFFF">

    Can you explain in more detail how this works?<br>

    <br>

    (I'm also, as a side note, keeping my eye out for things like this

    that might also help us efficiently do more-compile-time-efficient

    SLP vetorization).<br>

    <br>

     -Hal<br>

    <br>

    <blockquote type="cite"></blockquote></div><div text="#000000" bgcolor="#FFFFFF"><blockquote type="cite">

      <div dir="ltr">

        <div>The threshold limits the exploration to focus on the top

          functions of the rank.</div>

        <div>The idea is to make the ranking mechanism as lightweight as

          possible.</div>

        <div><br>

        </div>

        <div>Cheers,</div>

        <div><br>

        </div>

        <div>Rodrigo Rocha</div>

      </div>

      <br>

      <fieldset class="m_-4787981321377026353mimeAttachmentHeader"></fieldset>

      <br>

      </blockquote></div><div text="#000000" bgcolor="#FFFFFF"><blockquote type="cite"><pre>_______________________________________________

LLVM Developers mailing list

<a class="m_-4787981321377026353moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>

<a class="m_-4787981321377026353moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>

</pre>

    </blockquote></div><div text="#000000" bgcolor="#FFFFFF">

    <br>

    <pre class="m_-4787981321377026353moz-signature" cols="72">-- 

Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory</pre>

  </div></blockquote></div>