<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 3/20/21 7:26 PM, Juneyoung Lee

      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAGwnbJRKs7Sjb-9X4HEpDjWZ=nkVaGsvD8DoTKAXaGYWYew8yg@mail.gmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <div dir="ltr">+1<br>

        <div><br>

        </div>

        <div><br>

        </div>

        <div>I have one minor question: according to <a

            href="https://llvm.org/docs/Atomics.html#optimization-outside-atomic"

            moz-do-not-send="true">https://llvm.org/docs/Atomics.html#optimization-outside-atomic</a>

          , introducing a store is problematic even if the pointer is

          known to be dereferenceable. Does this also apply to

          dereferenceable+nosync+nofree?</div>

      </div>

    </blockquote>

    <p>This proposal is *only* discussing dereferenceability.  This

      question involves concurrency and the memory model, which is not

      changing at all.  So, no, there's no change to when it's safe to

      insert a store to a potentially shared location.  <br>

    </p>

    <blockquote type="cite"

cite="mid:CAGwnbJRKs7Sjb-9X4HEpDjWZ=nkVaGsvD8DoTKAXaGYWYew8yg@mail.gmail.com">

      <div dir="ltr">

        <div><br>

        </div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Thu, Mar 18, 2021 at 6:22

          AM Philip Reames via llvm-dev &lt;<a

            href="mailto:llvm-dev@lists.llvm.org" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>&gt;

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

          <div>

            <p>TLDR: We should change the existing dereferenceability

              related attributes to imply point in time facts only, and

              re-infer stronger global dereferenceability facts where

              needed.</p>

            <h2><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1"

                target="_blank" moz-do-not-send="true">Meta</a></h2>

            <p>If you prefer to read proposals in a browser, you can

              read this email <a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst"

                target="_blank" moz-do-not-send="true">here</a>.</p>

            <p>This proposal greatly benefited from multiple rounds of

              feedback from Johannes, Artur, and Nick. All remaining

              mistakes are my own.</p>

            <p>Johannes deserves a lot of credit for driving previous

              iterations on this design. In particular, I want to note

              that we've basically returned to something Johannes first

              proposed several years ago, before we had specified the

              nofree attribute family.</p>

            <h2><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2"

                target="_blank" moz-do-not-send="true">The Basic Problem</a></h2>

            <p>We have a long-standing semantic problem with the way we

              define dereferenceability facts which makes it difficult

              to express C++ references, or more generally,

              dereferenceability on objects which may be freed at some

              point in the program. The current structure does lend

              itself well to memory which can't be freed. As discussed

              in detail a bit later, we want to seamlessly support both

              use cases.</p>

            <p>The basic statement of the problem is that a piece of

              memory marked with deref(N) is assumed to remain

              dereferenceable indefinitely. For an object which can be

              freed, marking it as deref can enable unsound

              transformations in cases like the following:</p>

            <pre>o = deref(N) alloc();

if (c) free(o);

while(true) {

  if (c) break;

  // With the current semantics, we will hoist o.f above the loop

  v = o.f;

}

</pre>

            <p>Despite this, Clang does emit the existing

              dereferenceable attribute in some problematic cases. We

              have observed miscompiles as a result, and the optimizer has

              an assortment of hacks to try not to be too aggressive and

              miscompile too widely.</p>
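            <p>The hazard in the example above can be sketched in C++ without executing a real use-after-free. In this illustrative model, a cleared flag stands in for free(o) and a bounded loop stands in for while(true). The Obj type, its live flag, and both function names are assumptions made for the sketch, not LLVM code. Each function counts the loads that would touch freed memory:</p>
<pre>
```cpp
// Hypothetical model of the object `o`: `live` stands in for "not yet
// freed", so the sketch can count bad loads instead of performing one.
struct Obj {
  bool live;
  int f;
};

// Original shape: the load of o.f is guarded by the same condition as the
// free. Returns how many loads touched a freed object.
constexpr int bad_reads_original(bool c) {
  Obj o{true, 42};
  int bad = 0;
  if (c) o.live = false;            // models free(o)
  for (int i = 0; i != 3; ++i) {    // bounded stand-in for while(true)
    if (c) break;
    if (!o.live) ++bad;             // models v = o.f
  }
  return bad;
}

// Shape after the load has been hoisted above the loop.
constexpr int bad_reads_hoisted(bool c) {
  Obj o{true, 42};
  int bad = 0;
  if (c) o.live = false;            // models free(o)
  if (!o.live) ++bad;               // hoisted load: now runs unconditionally
  for (int i = 0; i != 3; ++i) {
    if (c) break;
  }
  return bad;
}
```
</pre>
            <p>With c true, the original shape performs no bad loads, while the hoisted shape performs exactly one. That single load is the miscompile that a globally scoped deref(N) permits.</p>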

            <h2><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3"

                target="_blank" moz-do-not-send="true">Haven't we

                already solved this?</a></h2>

            <p>This has been discussed relatively extensively in the

              past, including an accepted review (<a

                href="https://reviews.llvm.org/D61652" rel="nofollow"

                target="_blank" moz-do-not-send="true">https://reviews.llvm.org/D61652</a>)

              which proposed splitting the dereferenceable attribute

              into two to address this. However, this change never landed

              and recent findings reveal that we both need a broader

              solution, and have an interesting opportunity to take

              advantage of other recent work.</p>

            <p>The need for a broader solution comes from the

              observation that deref(N) is not the only attribute with

              this problem. deref_or_null(N) is a fairly obvious case

              we'd known about with the previous proposal, but it was

              recently realized that other allocation related facts have

              this problem as well. We now have specific examples with

              allocsize(N,M) - and the baked in variants in

              MemoryBuiltins - and suspect there are other attributes,

              either current or future, with the same challenge.</p>

            <p>The opportunity comes from the addition of the "nofree"

              attribute. Up until recently, we really didn't have a good

              notion of "free"ing an allocation in the abstract machine

              model. We used to comingle this with our notion of

              capture. (i.e. We'd assume that functions which could free

              must also capture.) With the explicit notion of "nofree",

              we have an approach available to us we didn't before.</p>

            <h2><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4"

                target="_blank" moz-do-not-send="true">The Proposal

                Itself</a></h2>

            <p>The basic idea is that we're going to redefine the

              currently globally scoped attributes (deref,

              deref_or_null, and allocsize) such that they imply a point

              in time fact only and then combine that with nofree to

              recover the previous global semantics.</p>

            <p>More specifically:</p>

            <ul>

              <li>A deref attribute on a function parameter will imply

                that the memory is dereferenceable for a specified

                number of bytes at the instant the function call occurs.</li>

              <li>A deref attribute on a function return will imply that

                the memory is dereferenceable at the moment of return.</li>

            </ul>

            <p>We will then use the point in time fact combined with

              other information to drive inference of the global facts.

              While in principle we may lose optimization potential, we

              believe this is sufficient to infer the global facts in

              all practical cases we care about.</p>

            <p>Sample inference cases:</p>

            <ul>

              <li>A deref(N) argument to a function with the nofree and

                nosync function attributes is known to be globally

                dereferenceable within the scope of the function call.

                We need the nosync to ensure that no other thread is

                freeing the memory on behalf of the callee in a

                coordinated manner.</li>

              <li>An argument with the attributes deref(N), noalias, and

                nofree is known to be globally dereferenceable within

                the scope of the function call. This relies on the fact

                that free is modeled as writing to the memory freed, and

                thus noalias ensures there is no other argument which

                can be freed. (See discussion below.)</li>

              <li>A memory allocation in a function with a garbage

                collector which guarantees collection occurs only at

                explicit safepoints and uses the gc.statepoint

                infrastructure, is known to be globally dereferenceable

                if there are no calls to gc.statepoint anywhere in the

                module. This effectively refines the abstract machine

                model used for garbage collection before lowering by

                RS4GC to disallow explicit deallocation (for collectors

                which opt in).</li>

            </ul>

            <p>The items above are described in terms of deref(N) for

              ease of description. The other attributes are handled

              analogously.</p>
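            <p>The first two sample rules can be read as a small predicate. The sketch below is only an illustration under assumed names: ArgFacts, its fields, and deref_through_call are hypothetical and do not correspond to any LLVM API. It does not model the gc.statepoint rule, which depends on module-wide information.</p>
<pre>
```cpp
// Hypothetical summary of the facts known about one pointer argument and
// the function it is passed to. None of these names are LLVM APIs.
struct ArgFacts {
  bool deref_at_call;  // deref(N) on the argument (point in time fact)
  bool arg_noalias;    // noalias on the argument
  bool arg_nofree;     // nofree on the argument
  bool fn_nofree;      // nofree on the function
  bool fn_nosync;      // nosync on the function
};

// True when the argument can be treated as dereferenceable for the whole
// scope of the call under either of the first two sample rules.
constexpr bool deref_through_call(ArgFacts a) {
  return a.deref_at_call &&
         ((a.fn_nofree && a.fn_nosync) ||    // rule 1: nofree + nosync function
          (a.arg_noalias && a.arg_nofree));  // rule 2: noalias + nofree argument
}
```
</pre>
            <p>For example, deref(N) on an argument of a nofree function that lacks nosync is not enough on its own, because another thread could free the memory on behalf of the callee.</p>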

            <p><strong>Explanation</strong></p>

            <p>The "deref(N), noalias, + nofree" argument case requires

              a bit of explanation as it involves a bunch of subtleties.</p>

            <p>First, the current wording of nofree argument attribute

              implies that the callee can not arrange for another thread

              to free the object on its behalf. This is different from

              the specification of the nofree function attribute. There

              is no "nosync" equivalent for function attributes.</p>

            <p>Second, the noalias argument attribute is subtle. There's

              a couple of sub-cases worth discussing:</p>

            <ul>

              <li>If the noalias argument is written to (reminder: free

                is modeled as a write), then it must be the only copy of

                the pointer passed to the function and there can be no

                copies passed through memory used in the scope of the

                function.</li>

              <li>If the noalias argument is only read from, then there

                may be other copies of the pointer. However, all of

                those copies must also be read only. If the object was

                freed through one of those other copies, then we must

                have at least one writeable copy and having the noalias

                on the read copy was undefined behavior to begin with.</li>

            </ul>

            <p>Essentially, what we're doing with noalias is using it to

              promote a fact about the pointer to a fact about the

              object being pointed to. Code structure wise, we should

              probably write it exactly that way.</p>

            <p><strong>Result</strong></p>

            <p>It's important to acknowledge that with this change, we

              will lose the ability to specify global dereferenceability

              of arguments and return values in the general case. We

              believe the current proposal allows us to recover that

              fact for all interesting cases, but if we've missed an

              important use case we may need to iterate a bit.</p>

            <p>We've discussed a few alternatives (below) which could be

              revisited if it turns out we are missing an important use

              case.</p>

            <h2><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5"

                target="_blank" moz-do-not-send="true">Use Cases</a></h2>

            <p><strong>C++ References</strong> -- A C++ reference

              implies that the value pointed to is dereferenceable at the

              point of declaration, and that the reference itself is

              non-null. Of particular note, an object pointed to through

              a reference can be freed without introducing UB.</p>

            <div>

              <pre><span>class</span> <span>A</span> { <span>int</span> f; };

<span>void</span> <span>ugly_delete</span>(A &a) { <span>delete</span> &a; }

<span>ugly_delete</span>(*<span>new</span> A());

<span>void</span> <span>ugly_delete2</span>(A &a, A *a2) {

  <span>if</span> (unknown)

    <span><span>//</span> a.f can be *proven* deref here as it's deref on entry,</span>

    <span><span>//</span> and no free on path from entry to here.</span>

    x = a.<span>f</span>;

  <span>delete</span> a2;

}

<span>auto</span> *a = <span>new</span> A();

<span>ugly_delete2</span>(*a, a);

A &<span>foo</span>() {...}

A &a = foo();

<span>if</span> (unknown)

  <span>delete</span> b;

<span><span>//</span> If a and b point to the same object, a.f may not be deref here</span>

<span>if</span> (unknown2)

  a.f;</pre>

            </div>

            <p><strong>Garbage Collected Objects (Java)</strong> -- LLVM

              supports two models of GCed objects, the abstract machine

              and the physical machine model. The latter is essentially

              the same as that for C++ as deallocation points (at

              safepoints) are explicit. The former has objects

              conceptually live forever (i.e. reclamation is handled

              outside the model).</p>

            <div>

              <pre><span>class</span> <span>A</span> { <span>int</span> f; }

<span>void</span> foo(<span>A</span> a) {

  <span>...</span>

  <span><span>//</span> a.f is trivially deref anywhere in foo</span>

  x <span>=</span> a<span>.</span>f;

}

<span>A</span> a <span>=</span> <span>new</span> <span>A</span>();

<span>...</span>

<span><span>//</span> a.f is trivially deref following its definition</span>

x <span>=</span> a<span>.</span>f;

<span>A</span> foo();

a <span>=</span> foo();

<span>...</span>

<span><span>//</span> a.f is (still) trivially deref</span>

x <span>=</span> a<span>.</span>f;</pre>

            </div>

            <p><strong>Rust Borrows</strong> -- A Rust reference

              argument (e.g. "borrow") points to an object whose

              lifetime is guaranteed to be longer than the reference's

              defining scope. As such, the object is dereferenceable

              through the scope of the function. Today, rustc does emit

              a dereferenceable attribute using the current globally

              dereferenceable semantic.</p>

            <div>

              <pre><span>pub</span> <span>fn</span> <span>square</span>(num: <span>&</span><span>i32</span>) -> <span>i32</span> {

  num <span>*</span> num

}

<span>square</span>(<span>&</span><span>5</span>);

<span>// a could be noalias, but isn't today</span>

<span>pub</span> <span>fn</span> <span>bar</span>(a: <span>&</span><span>mut</span> <span>i32</span>, b: <span>&</span><span>i32</span>) {

  <span>*</span>a <span>=</span> <span>*</span>a <span>*</span> b

}

<span>bar</span>(<span>&</span><span>mut</span> <span>5</span>, <span>&</span><span>2</span>);

<span>// At first appearance, rust does not allow returning references.  So return</span>

<span>// attributes are not relevant.  This seems like a major language hole, so this</span>

<span>// should probably be checked with a language expert.</span></pre>

            </div>

            <h2><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6"

                target="_blank" moz-do-not-send="true">Migration</a></h2>

            <p>Existing bitcode will be upgraded to the weaker

              non-global semantics. This provides forward compatibility,

              but does lose optimization potential for previously

              compiled bitcode.</p>

            <p>C++ and GC'd language frontends don't change.</p>

            <p>Rustc should emit noalias where possible. In particular,

              'a' in the case 'bar' above is currently not marked

              noalias and results in lost optimization potential as a

              result of this change. According to the rustc code, this

              is legal, but currently blocked on a noalias related

              miscompile. See <a

                href="https://github.com/rust-lang/rust/issues/54462"

                target="_blank" moz-do-not-send="true">https://github.com/rust-lang/rust/issues/54462</a>

              and <a

                href="https://github.com/rust-lang/rust/issues/54878"

                target="_blank" moz-do-not-send="true">https://github.com/rust-lang/rust/issues/54878</a>

              for further details. (My current belief is that all LLVM

              side blockers have been resolved.)</p>

            <p>Frontends which want the global semantics should emit

              noalias, nofree, and nosync where appropriate. If this is

              not enough to recover optimizations in common cases,

              please explain why not. It's possible we've failed to

              account for something.</p>

            <h2><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7"

                target="_blank" moz-do-not-send="true">Alternative

                Designs</a></h2>

            <p>All of the alternate designs listed focus on recovering

              the full global deref semantics. Our hope is that any

              common case we've missed can be resolved with additional

              inference rules instead.</p>

            <h3><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8"

                target="_blank" moz-do-not-send="true">Extend nofree to

                object semantics</a></h3>

            <p>The nofree argument attribute currently describes whether

              an object can be freed through some particular copy of the

              pointer. We could strengthen the semantics to imply that the

              object is not freed through any copy of the pointer in the

              specified scope.</p>

            <p>Doing so greatly weakens our ability to infer the nofree

              property. The current nofree property when combined with

              capture tracking in the caller is enough to prove interesting

              deref facts over calls. We don't want to lose the ability

              to infer that since it enables interesting transforms

              (such as code reordering over calls).</p>

            <h3><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9"

                target="_blank" moz-do-not-send="true">Add a separate

                nofreeobj attribute</a></h3>

            <p>Rather than change nofree, we could add a parallel

              attribute with the stronger object property. This -

              combined with deref(N) as a point in time fact - would be

              enough to recover the current globally dereferenceable

              semantics.</p>

            <p>The downsides of this alternative are a) possible overkill,

              and b) the "ugly" factor of having two similar but not

              quite identical attributes.</p>

            <h3><a

href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10"

                target="_blank" moz-do-not-send="true">Add an orthogonal

                attribute to promote pointer facts to object ones</a></h3>

            <p>To address the weakness of the former alternative, we

              could specify an attribute which strengthens arbitrary

              pointer facts to object facts. Examples of current pointer

              facts are attributes such as readonly, and writeonly.</p>

            <p>This has not been well explored; there's a huge possible

              design space here.</p>

          </div>


        </blockquote>

      </div>

      <br clear="all">

      <div><br>

      </div>

      -- <br>

      <div dir="ltr" class="gmail_signature">

        <div dir="ltr">

          <div><br>

          </div>

          <font size="1">Juneyoung Lee</font>

          <div><font size="1">Software Foundation Lab, Seoul National

              University</font></div>

        </div>

      </div>

    </blockquote>

  </body>

</html>