[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

Wed Jul 14 01:09:41 PDT 2021

On Tue, Jul 13, 2021 at 12:15 PM Nuno Lopes via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Regarding nofree, I think one of the confusions was around implementation.
> There was a mix between semantics and static analyses. These are different
> things and static analyses implement a sound approximation of the
> semantics. Of course it’s useful to define the semantics such that we can
> develop a useful approximation, but that’s it.
>
> For nofree, I believe we concluded that it’s strictly more expressive to
> not have to rely on nosync. Nosync is all or nothing; we couldn’t mark a
> function that uses a mutex as nofree if relying on nosync. Keeping things
> separated strictly allows more functions to be marked nofree. Which doesn’t
> mean the first implementation will. But the semantics isn’t written to
> track the first implementation, but to be future proof.
>
> That’s not to say nosync isn’t useful. I can totally see clang exposing a
> -fsingle-threaded flag or whatever so we get it for free.
>

That is mostly my recollection as well, except that I ended up feeling less
certain about how nofree+nosync should work together. My intuition agrees
with Nuno, but I felt like the part of the discussion with Johannes hadn't
reached a conclusion. We definitely need a decision on this semantics
question either way, and I would want it to be consistent across all
affected attributes, to reduce the complexity of LLVM IR overall.
argmemonly and perhaps others have the same problem, and last time I
checked, miscompilations can be triggered on main because of this confusion.

I don't have much time to spend on this, but I'm willing to put in at least
some work to try to drive a consensus on this if needed. I'm curious to
hear from Johannes.

Until we can disentangle implementation (of static analyses) and semantics
> I don’t think we will be able to reach any consensus around attributes.
>
>
>
> I don’t think there’s anything to rollback. The changes that you did were
> a step in the right direction. It’s half backed, yes, but hopefully that
> work will be finished at some point. So option 2. seems the way to go.
> Thank you!
>

Agreed.

Cheers,
Nicolai

>
>
> Nuno
>
>
>
>
>
> *From:* Philip Reames via llvm-dev
> *Sent:* 12 July 2021 17:05
> *Subject:* Re: [llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree
>
>
>
> At this point, I find myself needing to declare that the proposal below is
> a failure, and ask the community what next steps we'd prefer.
>
> This effort stumbled into the fact that we don't seem to have any actual
> agreement on what the semantics of various attributes are.  In particular,
> the semantics of nofree don't appear to be in a usable state, and my
> attempts at driving consensus have failed.  I am not willing to continue
> investing effort in that direction.
>
> Given that, I see three options, and need input from the community as to
> which we should chose.
>
> Option 1 - Back out the couple of changes which have landed, update
> LangRef to be explicit about the scoped dereferenceability we had
> historically, and consider this effort a failure.
>
> Option 2 - Change the semantic of the attributes to the point in time
> semantic *without* attempting any further inference of the scoped
> semantics.  At the current moment, the Java use case is covered (via the GC
> rule), no one seems to care about the lost optimization power for C/C++,
> and I am unclear on the practical impact (if any) on rust.
>
> Option 3 - Introduce a new 'nofreeobj' attribute whose semantics would be
> specifically that an object is not freed in the dynamic scope of the
> function through any mechanism (including concurrency).  This attribute
> would be basically uninferrable, and would exist only to support language
> guarantees being encoded by frontends.
>
> My recommendation would be for option 2, than 3, than 1.  It's worth
> noting that we could also chose option 2, then implement option 3 lazily if
> anyone reports a practical performance regression.
>
> Philip
>
> On 3/17/21 2:22 PM, Philip Reames via llvm-dev wrote:
>
> TLDR: We should change the existing dereferenceability related attributes
> to imply point in time facts only, and re-infer stronger global
> dereferenceability facts where needed.
> Meta
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1>
>
> If you prefer to read proposals in a browser, you can read this email here
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst>.
>
> This proposal greatly benefited from multiple rounds of feedback from
> Johannes, Artur, and Nick. All remaining mistakes are my own.
>
> Johannes deserves a lot of credit for driving previous iterations on this
> design. In particular, I want to note that we've basically returned to
> something Johannes first proposed several years ago, before we had
> specified the nofree attribute family.
> The Basic Problem
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2>
>
> We have a long standing semantic problem with the way we define
> dereferenceability facts which makes it difficult to express C++
> references, or more generally, dereferenceability on objects which may be
> freed at some point in the program. The current structure does lend itself
> well to memory which can't be freed. As discussed in detail a bit later, we
> want to seamlessly support both use cases.
>
> The basic statement of the problem is that a piece of memory marked with
> deref(N) is assumed to remain dereferenceable indefinitely. For an object
> which can be freed, marking it as deref can enable unsound transformations
> in cases like the following:
>
> o = deref(N) alloc();
>
> if (c) free(o)
>
> while(true) {
>
>   if (c) break;
>
>   // With the current semantics, we will hoist o.f above the loop
>
>   v = o.f;
>
> }
>
> Despite this, Clang does emit the existing dereferenceable attribute in
> some problematic cases. We have observed miscompiles as a result, and
> optimizer has an assortment of hacks to try not to be too aggressive and
> miscompile too widely.
> Haven't we already solved this?
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3>
>
> This has been discussed relatively extensively in the past, included an
> accepted review (https://reviews.llvm.org/D61652) which proposed
> splitting the dereferenceable attribute into two to adress this. However,
> this change never landed and recent findings reveal that we both need a
> broader solution, and have an interesting oppurtunity to take advantage of
> other recent work.
>
> The need for a broader solution comes from the observation that deref(N)
> is not the only attribute with this problem. deref_or_null(N) is a fairly
> obvious case we'd known about with the previous proposal, but it was
> recently realized that other allocation related facts have this problem as
> well. We now have specific examples with allocsize(N,M) - and the baked in
> variants in MemoryBuiltins - and suspect there are other attributes, either
> current or future, with the same challenge.
>
> The opportunity comes from the addition of "nofree" attribute. Up until
> recently, we really didn't have a good notion of "free"ing an allocation in
> the abstract machine model. We used to comingle this with our notion of
> capture. (i.e. We'd assume that functions which could free must also
> capture.) With the explicit notion of "nofree", we have an approach
> available to us we didn't before.
> The Proposal Itself
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4>
>
> The basic idea is that we're going to redefine the currently globally
> scoped attributes (deref, deref_or_null, and allocsize) such that they
> imply a point in time fact only and then combine that with nofree to
> recover the previous global semantics.
>
> More specifically:
>
>    - A deref attribute on a function parameter will imply that the memory
>    is dereferenceable for a specified number of bytes at the instant the
>    function call occurs.
>    - A deref attribute on a function return will imply that the memory is
>    dereferenceable at the moment of return.
>
> We will then use the point in time fact combined with other information to
> drive inference of the global facts. While in principle we may loose
> optimization potential, we believe this is sufficient to infer the global
> facts in all practical cases we care about.
>
> Sample inference cases:
>
>    - A deref(N) argument to a function with the nofree and nosync
>    function attribute is known to be globally dereferenceable within the scope
>    of the function call. We need the nosync to ensure that no other thread is
>    freeing the memory on behalf of the callee in a coordinated manner.
>    - An argument with the attributes deref(N), noalias, and nofree is
>    known to be globally dereferenceable within the scope of the function call.
>    This relies on the fact that free is modeled as writing to the memory
>    freed, and thus noalias ensures there is no other argument which can be
>    freed. (See discussion below.)
>    - A memory allocation in a function with a garbage collector which
>    guarantees collection occurs only at explicit safepoints and uses the
>    gc.statepoint infrastructure, is known to be globally dereferenceable if
>    there are no calls to gc.statepoint anywhere in the module. This
>    effectively refines the abstract machine model used for garbage collection
>    before lowering by RS4GC to disallow explicit deallocation (for collectors
>    which opt in).
>
> The items above are described in terms of deref(N) for ease of
> description. The other attributes are handle analogously.
>
> *Explanation*
>
> The "deref(N), noalias, + nofree" argument case requires a bit of
> explanation as it involves a bunch of subtleties.
>
> First, the current wording of nofree argument attribute implies that the
> callee can not arrange for another thread to free the object on it's
> behalf. This is different than the specification of the nofree function
> attribute. There is no "nosync" equivalent for function attributes.
>
> Second, the noalias argument attribute is subtle. There's a couple of
> sub-cases worth discussing:
>
>    - If the noalias argument is written to (reminder: free is modeled as
>    a write), then it must be the only copy of the pointer passed to the
>    function and there can be no copies passed through memory used in the scope
>    of function.
>    - If the noalias argument is only read from, then there may be other
>    copies of the pointer. However, all of those copies must also be read only.
>    If the object was freed through one of those other copies, then we must
>    have at least one writeable copy and having the noalias on the read copy
>    was undefined behavior to begin with.
>
> Essentially, what we're doing with noalias is using it to promote a fact
> about the pointer to a fact about the object being pointed to. Code
> structure wise, we should probably write it exactly that way.
>
> *Result*
>
> It's important to acknowledge that with this change, we will lose the
> ability to specify global dereferenceability of arguments and return values
> in the general case. We believe the current proposal allows us to recover
> that fact for all interesting cases, but if we've missed an important use
> case we may need to iterate a bit.
>
> We've discussed a few alternatives (below) which could be revisited if it
> turns out we are missing an important use case.
> Use Cases
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5>
>
> *C++ References* -- A C++ reference implies that the value pointed to is
> dereferenceable at point of declaration, and that the reference itself is
> non-null. Of particular note, an object pointed to through a reference can
> be freed without introducing UB.
>
> class A { int f; };
>
>
>
> void ugly_delete(A &a) { delete &a; }
>
> ugly_delete(*new A());
>
>
>
> void ugly_delete2(A &a, A *a2) {
>
>   if (unknown)
>
>     // a.f can be *proven* deref here as it's deref on entry,
>
>     // and no free on path from entry to here.
>
>     x = a.f;
>
>   delete a2;
>
> }
>
> auto *a = new A();
>
> ugly_delete2(*a, a);
>
>
>
> A &foo() {...}
>
> A &a = foo();
>
> if (unknown)
>
>   delete b;
>
> // If a and b point to the same object, a.f may not be deref here
>
> if (unknown2)
>
>   a.f;
>
> *Garbage Collected Objects (Java)* -- LLVM supports two models of GCed
> objects, the abstract machine and the physical machine model. The later is
> essentially the same as that for c++ as deallocation points (at safepoints)
> are explicit. The former has objects conceptually live forever (i.e.
> reclaimation is handled outside the model).
>
> class A { int f; }
>
>
>
> void foo(A a) {
>
>   ...
>
>   // a.f is trivially deref anywhere in foo
>
>   x = a.f;
>
> }
>
>
>
> A a = new A();
>
> ...
>
> // a.f is trivially deref following it's definition
>
> x = a.f;
>
>
>
> A foo();
>
> a = foo();
>
> ...
>
> // a.f is (still) trivially deref
>
> x = a.f;
>
> *Rust Borrows* -- A rust reference argument (e.g. "borrow") points to an
> object whose lifetime is guaranteed to be longer than the reference's
> defining scope. As such, the object is dereferenceable through the scope of
> the function. Today, rustc does emit a dereferenceable attribute using the
> current globally dereferenceable semantic.
>
> pub fn square(num: &i32) -> i32 {
>
>   num * num
>
> }
>
> square(&5);
>
>
>
> // a could be noalias, but isn't today
>
> pub fn bar(a: &mut i32, b: &i32) {
>
>   *a = a * b
>
> }
>
>
>
> bar(&mut 5, &2);
>
>
>
> // At first appearance, rust does not allow returning references.  So return
>
> // attributes are not relevant.  This seems like a major language hole, so this
>
> // should probably be checked with a language expert.
>
> Migration
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6>
>
> Existing bytecode will be upgraded to the weaker non-global semantics.
> This provides forward compatibility, but does lose optimization potential
> for previously compiled bytecode.
>
> C++ and GC'd language frontends don't change.
>
> Rustc should emit noalias where possible. In particular, 'a' in the case
> 'bar' above is currently not marked noalias and results in lost
> optimization potential as a result of this change. According to the rustc
> code, this is legal, but currently blocked on a noalias related miscompile.
> See https://github.com/rust-lang/rust/issues/54462 and
> https://github.com/rust-lang/rust/issues/54878 for further details. (My
> current belief is that all llvm side blockers have been resolved.)
>
> Frontends which want the global semantics should emit noalias, nofree, and
> nosync where appropriate. If this is not enough to recover optimizations in
> common cases, please explain why not. It's possible we've failed to account
> for something.
> Alternative Designs
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7>
>
> All of the alternate designs listed focus on recovering the full global
> deref semantics. Our hope is that any common case we've missed can be
> resolved with additional inference rules instead.
> Extend nofree to object semantics
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8>
>
> The nofree argument attribute current describes whether an object can
> freed through some particular copy of the pointer. We could strength the
> semantics to imply that the object is not freed through any copy of the
> pointer in the specified scope.
>
> Doing so greatly weakens our ability to infer the nofree property. The
> current nofree property when combined with capture tracking in the caller
> is enough to prove interest deref facts over calls. We don't want to loose
> the ability to infer that since it enables interesting transforms (such as
> code reordering over calls).
> Add a separate nofreeobj attribute
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9>
>
> Rather than change nofree, we could add a parallel attribute with the
> stronger object property. This - combined with deref(N) as a point in time
> fact - would be enough to recover the current globally deferenceable
> semantics.
>
> The downside of this alternative is a) possible overkill, and b) the
> "ugly" factor of having two similar but not quite identical attributes.
> Add an orthogonal attribute to promote pointer facts to object ones
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10>
>
> To address the weakness of the former alternative, we could specify an
> attribute which strengthens arbitrary pointer facts to object facts.
> Examples of current pointer facts are attributes such as readonly, and
> writeonly.
>
> This has not been well explored; there's a huge possible design space here.
>
>
>
> _______________________________________________
>
> LLVM Developers mailing list
>
> llvm-dev at lists.llvm.org
>
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

-- 
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210714/5859e226/attachment-0001.html>