<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <div class="moz-cite-prefix">On 08/05/2014 03:46 PM, Robin Morisset

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Hello

            everyone,</span></p>

        <br style="font-family:arial,sans-serif;font-size:13px">

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">I

            have recently started on optimizing C11/C++11 atomics in

            LLVM, and plan to focus on that for the next two months as

            an intern in the PNaCl team. I’ve sent two patches on this

            topic to Phabricator that fix </span><a

            moz-do-not-send="true"

            href="http://llvm.org/bugs/show_bug.cgi?id=17281"

            target="_blank" style="text-decoration:none"><span

style="font-size:15px;font-family:Arial;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">http://llvm.org/bugs/show_bug.cgi?id=17281</span></a><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">:</span></p>

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;margin-top:0pt;margin-bottom:0pt"><span

            style="vertical-align:baseline;background-color:transparent"><font

              color="#000000" face="Arial"><span

                style="font-size:15px;line-height:17.25px;white-space:pre-wrap"><a

                  moz-do-not-send="true"

                  href="http://reviews.llvm.org/D4796" target="_blank">http://reviews.llvm.org/D4796</a></span></font><br>

          </span></p>

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;margin-top:0pt;margin-bottom:0pt"><span

            style="vertical-align:baseline;background-color:transparent"><font

              color="#000000" face="Arial"><span

                style="font-size:15px;line-height:17.25px;white-space:pre-wrap"><a

                  moz-do-not-send="true"

                  href="http://reviews.llvm.org/D4797" target="_blank">http://reviews.llvm.org/D4797</a></span><br>

            </font></span></p>

      </div>

    </blockquote>

    <font face="Arial">First, welcome!  This is definitely an area which

      needs some attention.  As you've already seen, I'll be happy to

      help review some of these patches.  We will need to find other

      qualified reviewers though.  As you've already seen, some of these

      issues are *extremely* subtle.  <br>

      <br>

      Fair warning, given how challenging this is to get right, and how

      hard it is to debug when wrong, you should expect very slow

      progress and quite a bit of time required for reviews.  Split your

      reviews into the smallest possible pieces you can to give reviewers

      a chance at understanding them.  <br>

    </font>

    <blockquote

cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;margin-top:0pt;margin-bottom:0pt"><span

            style="vertical-align:baseline;background-color:transparent"><font

              color="#000000" face="Arial">

            </font></span></p>

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br>

          </span></p>

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">The

            first patch is X86-specific, and tries to apply operations

            with immediates to atomics without going through a register.

            The main trouble here is that the X86 backend appears to

            respect LLVM memory model instead of the x86-TSO memory

            model, and may reorder instructions. In order to prevent

            illegal reordering of atomic accesses, the backend converts

            atomic accesses to pseudo-instructions in

            X86InstrCompiler.td (RELEASE_MOV* and ACQUIRE_MOV*) that are

            opaque to most of the rest of the backend, and only lowers

            those at the very end of the pipeline. I have decided to

            follow the same approach, just adding some more RELEASE_*

            pseudo-instructions rather than trying to find every

            possibly misbehaving part of the backend in order to do

            early lowering. This lowers the risk and complexity of the

            patch, but at the cost of possibly missing some optimization

            possibilities.</span></p>

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Another

            trouble I had with this patch is a failure of TableGen type

            inference when adding negate/not to the scheme. As a result

            I have left these two instructions commented out in this

            patch. Does anyone have an idea for how to proceed with this

            ?</span></p>

        <br style="font-family:arial,sans-serif;font-size:13px">

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">The

            second patch is more straightforward: in the C11/C++11

            memory model (that LLVM basically borrows), optimizations

            like DSE can safely fire across atomic accesses in many

            cases, basically as long as they are not operating across a

            release-acquire pair (see paper referenced in the comments).

            So I tweaked MemoryDependenceAnalysis to track such pairs

            and only return a clobber result in such cases.</span></p>

      </div>

    </blockquote>

    I've already made comments on these in the respective review

    threads.  <br>

    <br>

    Other readers - Please, we need extra reviewers on these!  If you

    have experience reasoning about memory model issues, please chime

    in.  <br>

    <blockquote

cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <br style="font-family:arial,sans-serif;font-size:13px">

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">My

            next steps will probably be to improve the compilation of

            acquire atomics in the ARM backend. In particular, they are

            currently compiled by a load + dmb, while a load + dependant

            useless branch + isb is also valid (see </span><a

            moz-do-not-send="true"

            href="http://www.cl.cam.ac.uk/%7Epes20/cpp/cpp0xmappings.html"

            target="_blank" style="text-decoration:none"><span

style="font-size:15px;font-family:Arial;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html</span></a><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">

            for example) and may be faster. Even better: if there is

            already a dependant branch (such as the loop for the

            lowering of CAS), it is just a cheap isb. The main step will

            be switching off the InsertFencesForAtomic flag, and do the

            lowering of atomics in the backend, because once an acquire

            load has been transformed in an acquire fence, too much

            information has been lost to apply this mapping.</span></p>

      </div>

    </blockquote>

    I'm unfamiliar with the details of the ARM architecture, so I won't

    be able to help you much here.  <br>

    <blockquote

cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Longer

            term, I hope to improve the fence elimination of the ARM

            backend with a kind of PRE algorithm. Both of these

            improvements to the ARM backend should be fairly

            straightforward to port to the POWER architecture later, and

            I hope to also do that.</span></p>

      </div>

    </blockquote>

    Any reason these couldn't be done at the IR level? 

    <blockquote

cite="mid:CAKUVmHotXx6AvUry+1AG2bFUzLwcxiiF1wR7muATMBu1G8F-1Q@mail.gmail.com"

      type="cite">

      <div dir="ltr"><br

          style="font-family:arial,sans-serif;font-size:13px">

        <p dir="ltr"

style="font-family:arial,sans-serif;font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span

style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Does

            this approach seem worthwhile to you ? Can I do anything to

            help the review process ?</span></p>

      </div>

    </blockquote>

    Small patches.  Lots of justification.  Many test cases.  Ping folks

    periodically.  <br>

    <br>

    Philip<br>

  </body>

</html>