<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>I'd definitely support having a memcmp intrinsic for the reasons
      previously specified.  However, this is somewhat orthogonal to
      the original direction of the patch.  We can easily improve the
      lowering of the existing target function and then introduce the
      intrinsic.  Porting the existing lowering code over should be
      straightforward.  I'm only pointing this out so that we don't get
      blocked on the eventual end goal and fail to make progress.</p>
    <p>Philip<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 12/30/2016 02:27 AM, Martin J.
      O'Riordan wrote:<br>
    </div>
    <blockquote cite="mid:004201d26287$4834e9e0$d89ebda0$@movidius.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <meta name="Generator" content="Microsoft Word 15 (filtered
        medium)">
      <div class="WordSection1">
        <p class="MsoNormal"><span style="font-family:'Book Antiqua',serif;color:#002060;mso-fareast-language:EN-US">With
            the intrinsic support for ‘</span><span
            style="font-family:'Courier New';color:black;mso-fareast-language:EN-US">memcpy</span><span
            style="font-family:'Book Antiqua',serif;color:#002060;mso-fareast-language:EN-US">’
            and ‘</span><span
            style="font-family:'Courier New';color:black;mso-fareast-language:EN-US">memset</span><span
            style="font-family:'Book Antiqua',serif;color:#002060;mso-fareast-language:EN-US">’,
            the pointer operands also have associated alignment
            information.  I think that ‘</span><span
            style="font-family:'Courier New';color:black;mso-fareast-language:EN-US">memcmp</span><span
            style="font-family:'Book Antiqua',serif;color:#002060;mso-fareast-language:EN-US">’
            should also provide the alignment information for each of
            the source operands (when statically known).  In some cases
            this will allow better alignment-aware lowering, and
            on targets where unaligned access is costly or fatal,
            the comparison can still be lowered safely.<o:p></o:p></span></p>
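        <p class="MsoNormal">A minimal sketch of the idea in plain C, not anything LLVM emits today.  The helper name cmp_eq_aligned_sketch is hypothetical, and the run-time alignment test only stands in for alignment that the compiler would know statically:</p>
        <pre>#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

/* Hypothetical helper, not an LLVM API: returns 0 if the first n bytes
   of a and b are equal, nonzero otherwise. */
static int cmp_eq_aligned_sketch(const void *a, const void *b, size_t n) {
    const unsigned char *pa = a, *pb = b;
    size_t i = 0;
    /* Assumption: this run-time test stands in for statically known
       alignment. Only then are 8-byte loads used. */
    if ((uintptr_t)pa % 8 == 0 &amp;&amp; (uintptr_t)pb % 8 == 0) {
        for (; i + 8 &lt;= n; i += 8) {
            uint64_t wa, wb;
            memcpy(&amp;wa, pa + i, 8);  /* portable 8-byte load */
            memcpy(&amp;wb, pb + i, 8);
            if (wa != wb)
                return 1;
        }
    }
    /* Bytewise path: the remaining tail, or everything when the
       operands are not both 8-byte aligned. */
    for (; i &lt; n; ++i)
        if (pa[i] != pb[i])
            return 1;
    return 0;
}</pre>
        <p class="MsoNormal">With alignment carried on the call itself, the choice between the two paths could be made at compile time instead of by the test shown here.</p>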
        <p class="MsoNormal"><span style="font-family:'Book Antiqua',serif;color:#002060;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="font-family:'Book Antiqua',serif;color:#002060;mso-fareast-language:EN-US">
            MartinO<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-family:'Book Antiqua',serif;color:#002060;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
        <p class="MsoNormal"><b><span
              style="font-size:11.0pt;font-family:'Calibri',sans-serif"
              lang="EN-US">From:</span></b><span
            style="font-size:11.0pt;font-family:'Calibri',sans-serif"
            lang="EN-US"> llvm-dev
            [<a class="moz-txt-link-freetext" href="mailto:llvm-dev-bounces@lists.llvm.org">mailto:llvm-dev-bounces@lists.llvm.org</a>] <b>On Behalf Of </b>David
            Jones via llvm-dev<br>
            <b>Sent:</b> 30 December 2016 00:28<br>
            <b>To:</b> Philip Reames <a class="moz-txt-link-rfc2396E" href="mailto:listmail@philipreames.com">&lt;listmail@philipreames.com&gt;</a><br>
            <b>Cc:</b> llvm-dev <a class="moz-txt-link-rfc2396E" href="mailto:llvm-dev@lists.llvm.org">&lt;llvm-dev@lists.llvm.org&gt;</a>; Zaara
            Syeda <a class="moz-txt-link-rfc2396E" href="mailto:syzaara@ca.ibm.com">&lt;syzaara@ca.ibm.com&gt;</a><br>
            <b>Subject:</b> Re: [llvm-dev] RFC: Inline expansion of
            memcmp vs call to standard library<o:p></o:p></span></p>
        <p class="MsoNormal"><o:p> </o:p></p>
        <div>
          <div>
            <p class="MsoNormal" style="margin-bottom:12.0pt">Can I make
              another suggestion: create an intrinsic for memory
              equality, e.g. llvm.memcmp_eq.p0i8.p0i8.i64(i8*a, i8*b,
              i64 len).  This intrinsic would return zero if the memory
              regions are equal, and nonzero otherwise. However, it does
              NOT return any notion of "greater" or "less".<o:p></o:p></p>
          </div>
          <p class="MsoNormal" style="margin-bottom:12.0pt">Many
            applications require only determining equality, rather than
            a total ordering. Given that "greater" and "less" also
            require some knowledge of endianness, even a fancy lowered
            version of memcmp can be slower than an equality-only
            compare.<o:p></o:p></p>
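          <p class="MsoNormal" style="margin-bottom:12.0pt">To illustrate the difference, here is a rough sketch in plain C.  All function names are hypothetical and none of this is an existing LLVM interface.  The equality-only version never looks at byte order, while the ordered version has to assemble each word most-significant byte first so that the first differing byte decides the result:</p>
          <pre>#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

/* Hypothetical equality-only compare of 16 bytes:
   returns 0 if equal, nonzero otherwise. */
static int memcmp_eq16_sketch(const void *a, const void *b) {
    uint64_t a0, a1, b0, b1;
    memcpy(&amp;a0, a, 8);
    memcpy(&amp;a1, (const unsigned char *)a + 8, 8);
    memcpy(&amp;b0, b, 8);
    memcpy(&amp;b1, (const unsigned char *)b + 8, 8);
    /* XOR marks differing bits, OR merges both halves.
       Host byte order does not matter here. */
    return ((a0 ^ b0) | (a1 ^ b1)) != 0;
}

/* Load 8 bytes most-significant byte first, independent of host order. */
static uint64_t load_be64_sketch(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 0; i &lt; 8; ++i)
        v = (v &lt;&lt; 8) | p[i];
    return v;
}

/* Hypothetical ordered compare of 8 bytes: negative, zero, or positive,
   with the same sign as memcmp(a, b, 8). */
static int memcmp8_ordered_sketch(const void *a, const void *b) {
    uint64_t x = load_be64_sketch(a), y = load_be64_sketch(b);
    return (x &gt; y) - (x &lt; y);
}</pre>
          <p class="MsoNormal" style="margin-bottom:12.0pt">On a little-endian target the ordered form needs the equivalent of a byte swap per word, which the equality-only form avoids entirely.</p>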
        </div>
        <div>
          <p class="MsoNormal"><o:p> </o:p></p>
          <div>
            <p class="MsoNormal">On Thu, Dec 29, 2016 at 4:14 PM, Philip
              Reames via llvm-dev &lt;<a moz-do-not-send="true"
                href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;
              wrote:<o:p></o:p></p>
            <blockquote style="border:none;border-left:solid #CCCCCC
              1.0pt;padding:0cm 0cm 0cm
              6.0pt;margin-left:4.8pt;margin-right:0cm">
              <div>
                <div>
                  <p class="MsoNormal">Improving lowering for memcmp is
                    definitely something we should do for all targets. 
                    Doing it in a target-specific way is decidedly
                    non-ideal.  <br>
                    <br>
                    It looks like we already have some code in
                    SelectionDAGBuilder which tries to optimize the
                    lowering for the memcmp library call.  I am a bit
                    confused by the problem you are trying to solve. 
                    Are you specifically interested in lowering for
                    constant lengths greater than a legal size?  (i.e.
                    do you need the loop?)<br>
                    <br>
                    If so, there are two approaches you might consider:<br>
                    - Expand the memcmp call into the loop form in
                    CodeGenPrep (or a similarly timed pass) where working
                    with multiple basic blocks is much easier.  Long
                    term, the "right place" for this type of thing is
                    clearly GlobalISel, but we have a number of other
                    such hacks in lowering today and continuing to build
                    off of that seems reasonable.<br>
                    - Emit the non-early-exit form for small constant
                    values (a[0] == b[0] &amp;&amp; a[1] == b[1] ...). 
                    Assuming your backend has handling for efficiently
                    lowering "and" chains using branches, you may very
                    well get the code you want.  <br>
                    <br>
                    Using the pseudo instruction here feels messy.  In
                    particular, I don't like the fact it basically opts
                    out of all of the combines which might further
                    improve the lowering.<br>
                    <br>
                    Philip<br>
                    <br>
                    <br>
                    On 12/29/2016 11:35 AM, Zaara Syeda via llvm-dev
                    wrote:<o:p></o:p></p>
                </div>
                <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
                  <p><span
                      style="font-family:'Calibri',sans-serif">Currently
                      on PowerPC, calls to memcmp are not expanded and
                      are left as library calls. In certain conditions,
                      expansion can perform better than calling the
                      library function, as is already done for functions
                      like memcpy, memmove, etc. This patch <b>(</b></span><a
                      moz-do-not-send="true"
                      href="https://reviews.llvm.org/D28163"
                      target="_blank"><b><span
                          style="font-family:'Calibri',sans-serif">https://reviews.llvm.org/D28163</span></b></a><b><span
                      style="font-family:'Calibri',sans-serif">)</span></b><span
                      style="font-family:'Calibri',sans-serif">
                      is an initial implementation for PowerPC to expand
                      memcmp when the size is a multiple of 8 bytes.</span><o:p></o:p></p>
                  <p><span
                      style="font-family:'Calibri',sans-serif">The
                      approach currently added for this expansion tries
                      to use the existing infrastructure by overriding
                      the virtual function EmitTargetCodeForMemcmp. This
                      function works on the SelectionDAG, but the
                      expansion requires control flow for early exit.
                      So, instead of implementing the expansion within
                      EmitTargetCodeForMemcmp, a new pseudo instruction
                      is added for memcmp and a SelectionDAG node for
                      this new pseudo is created in
                      EmitTargetCodeForMemcmp. This pseudo instruction
                      is then expanded during lowering in
                      EmitInstrWithCustomInserter.<br>
                      <br>
                      The advantage of this approach is that it uses the
                      existing infrastructure and does not impact other
                      targets. If other targets would like to expand
                      memcmp, they can also override
                      EmitTargetCodeForMemcmp and create their own
                      expansion. </span><o:p></o:p></p>
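                  <p>For readers following along, the shape of such an early-exit expansion can be sketched in plain C.  This is only an illustration under stated assumptions, not the code from the patch: the function names are hypothetical, the length is assumed to be a multiple of 8, and the portable byte-by-byte load stands in for a single 8-byte load on a big-endian target.</p>
                  <pre>#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Load 8 bytes most-significant byte first, independent of host order. */
static uint64_t load_msb_first64(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 0; i &lt; 8; ++i)
        v = (v &lt;&lt; 8) | p[i];
    return v;
}

/* Hypothetical expansion of memcmp(a, b, n); assumes n % 8 == 0. */
static int memcmp_mult8_sketch(const unsigned char *a,
                               const unsigned char *b, size_t n) {
    for (size_t i = 0; i &lt; n; i += 8) {
        uint64_t x = load_msb_first64(a + i);
        uint64_t y = load_msb_first64(b + i);
        /* Early exit: the first differing 8-byte chunk decides the
           result, which is why the expansion needs control flow. */
        if (x != y)
            return x &lt; y ? -1 : 1;
    }
    return 0;
}</pre>
                  <p>The branch inside the loop is the control flow that a single SelectionDAG cannot express, since a DAG is built per basic block.</p>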
                  <p><span
                      style="font-family:'Calibri',sans-serif">Another
                      option to consider is adding a new optimization
                      pass for this expansion that isn’t target-specific,
                      if other targets would benefit from a more general
                      infrastructure. </span><o:p></o:p></p>
                  <p><span
                      style="font-family:'Calibri',sans-serif">Please
                      provide feedback on whether this approach should be
                      continued for the PowerPC-specific memcmp
                      expansions, or whether the community is interested
                      in devising a more general approach</span>.<br>
                    <br>
                    Thanks,<o:p></o:p></p>
                  <p style="margin-bottom:12.0pt">Zaara Syeda<o:p></o:p></p>
                  <p class="MsoNormal"><o:p> </o:p></p>
                  <pre>_______________________________________________<o:p></o:p></pre>
                  <pre>LLVM Developers mailing list<o:p></o:p></pre>
                  <pre><a moz-do-not-send="true" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><o:p></o:p></pre>
                  <pre><a moz-do-not-send="true" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><o:p></o:p></pre>
                </blockquote>
                <p><o:p> </o:p></p>
              </div>
            </blockquote>
          </div>
          <p class="MsoNormal"><o:p> </o:p></p>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>