<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <div class="moz-cite-prefix">On 08/05/2014 03:46 PM, Robin Morisset
      wrote:<br>
    </div>
    <blockquote
cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Hello
            everyone,</span></p>
        <br style="font-family:arial,sans-serif;font-size:13px">
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">I
            have recently started on optimizing C11/C++11 atomics in
            LLVM, and plan to focus on that for the next two months as
            an intern in the PNaCl team. I’ve sent two patches on this
            topic to Phabricator that fix </span><a
            moz-do-not-send="true"
            href="http://llvm.org/bugs/show_bug.cgi?id=17281"
            target="_blank" style="text-decoration:none"><span
style="font-size:15px;font-family:Arial;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">http://llvm.org/bugs/show_bug.cgi?id=17281</span></a><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">:</span></p>
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;margin-top:0pt;margin-bottom:0pt"><span
            style="vertical-align:baseline;background-color:transparent"><font
              color="#000000" face="Arial"><span
                style="font-size:15px;line-height:17.25px;white-space:pre-wrap"><a
                  moz-do-not-send="true"
                  href="http://reviews.llvm.org/D4796" target="_blank">http://reviews.llvm.org/D4796</a></span></font><br>
          </span></p>
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;margin-top:0pt;margin-bottom:0pt"><span
            style="vertical-align:baseline;background-color:transparent"><font
              color="#000000" face="Arial"><span
                style="font-size:15px;line-height:17.25px;white-space:pre-wrap"><a
                  moz-do-not-send="true"
                  href="http://reviews.llvm.org/D4797" target="_blank">http://reviews.llvm.org/D4797</a></span><br>
            </font></span></p>
      </div>
    </blockquote>
    <font face="Arial">First, welcome!  This is definitely an area which
      needs some attention.  As you've already seen, I'll be happy to
      help review some of these patches.  We will need to find other
      qualified reviewers though.  As you've already seen, some of these
      issues are *extremely* subtle.  <br>
      <br>
      Fair warning, given how challenging this is to get right, and how
      hard it is to debug when wrong, you should expect very slow
      progress and quite a bit of time required for reviews.  Split your
      reviews into the smallest possible pieces you can to give reviewers
      a chance at understanding them.  <br>
    </font>
    <blockquote
cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;margin-top:0pt;margin-bottom:0pt"><span
            style="vertical-align:baseline;background-color:transparent"><font
              color="#000000" face="Arial">
            </font></span></p>
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br>
          </span></p>
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">The
            first patch is X86-specific, and tries to apply operations
            with immediates to atomics without going through a register.
            The main trouble here is that the X86 backend appears to
            respect LLVM memory model instead of the x86-TSO memory
            model, and may reorder instructions. In order to prevent
            illegal reordering of atomic accesses, the backend converts
            atomic accesses to pseudo-instructions in
            X86InstrCompiler.td (RELEASE_MOV* and ACQUIRE_MOV*) that are
            opaque to most of the rest of the backend, and only lowers
            those at the very end of the pipeline. I have decided to
            follow the same approach, just adding some more RELEASE_*
            pseudo-instructions rather than trying to find every
            possibly misbehaving part of the backend in order to do
            early lowering. This lowers the risk and complexity of the
            patch, but at the cost of possibly missing some optimization
            possibilities.</span></p>
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Another
            trouble I had with this patch is a failure of TableGen type
            inference when adding negate/not to the scheme. As a result
            I have left these two instructions commented out in this
            patch. Does anyone have an idea for how to proceed with this
            ?</span></p>
        <br style="font-family:arial,sans-serif;font-size:13px">
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">The
            second patch is more straightforward: in the C11/C++11
            memory model (that LLVM basically borrows), optimizations
            like DSE can safely fire across atomic accesses in many
            cases, basically as long as they are not operating across a
            release-acquire pair (see paper referenced in the comments).
            So I tweaked MemoryDependenceAnalysis to track such pairs
            and only return a clobber result in such cases.</span></p>
      </div>
    </blockquote>
    I've already made comments on these in the respective review
    threads.  <br>
    <br>
    Other readers - Please, we need extra reviewers on these!  If you
    have experience reasoning about memory model issues, please chime
    in.  <br>
    <blockquote
cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <br style="font-family:arial,sans-serif;font-size:13px">
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">My
            next steps will probably be to improve the compilation of
            acquire atomics in the ARM backend. In particular, they are
            currently compiled by a load + dmb, while a load + dependant
            useless branch + isb is also valid (see </span><a
            moz-do-not-send="true"
            href="http://www.cl.cam.ac.uk/%7Epes20/cpp/cpp0xmappings.html"
            target="_blank" style="text-decoration:none"><span
style="font-size:15px;font-family:Arial;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html</span></a><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">
            for example) and may be faster. Even better: if there is
            already a dependant branch (such as the loop for the
            lowering of CAS), it is just a cheap isb. The main step will
            be switching off the InsertFencesForAtomic flag, and do the
            lowering of atomics in the backend, because once an acquire
            load has been transformed in an acquire fence, too much
            information has been lost to apply this mapping.</span></p>
      </div>
    </blockquote>
    I'm unfamiliar with the details of the ARM architecture, so I won't
    be able to help you much here.  <br>
    <blockquote
cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Longer
            term, I hope to improve the fence elimination of the ARM
            backend with a kind of PRE algorithm. Both of these
            improvements to the ARM backend should be fairly
            straightforward to port to the POWER architecture later, and
            I hope to also do that.</span></p>
      </div>
    </blockquote>
    Any reason these couldn't be done at the IR level? 
    <blockquote
cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"
      type="cite">
      <div dir="ltr"><br
          style="font-family:arial,sans-serif;font-size:13px">
        <p dir="ltr"
style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Does
            this approach seem worthwhile to you ? Can I do anything to
            help the review process ?</span></p>
      </div>
    </blockquote>
    Small patches.  Lots of justification.  Many test cases.  Ping folks
    periodically.  <br>
    <br>
    Philip<br>
  </body>
</html>