[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Sun Mar 21 11:13:53 PDT 2021
On 3/20/21 7:26 PM, Juneyoung Lee wrote:
> +1
>
>
> I have one minor question: according to
> https://llvm.org/docs/Atomics.html#optimization-outside-atomic
> <https://llvm.org/docs/Atomics.html#optimization-outside-atomic> ,
> introducing a store is problematic even if the pointer is known to be
> dereferenceable. Does this also apply to dereferenceable+nosync+nofree?
This proposal is *only* discussing dereferenceability. This question
involves concurrency and the memory model which is not changing at all.
So, no, there's no change to when it's safe to insert a store to a
potentially shared location.
>
>
> On Thu, Mar 18, 2021 at 6:22 AM Philip Reames via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
> TLDR: We should change the existing dereferenceability related
> attributes to imply point in time facts only, and re-infer
> stronger global dereferenceability facts where needed.
>
>
> Meta
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1>
>
> If you prefer to read proposals in a browser, you can read this
> email here
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst>.
>
> This proposal greatly benefited from multiple rounds of feedback
> from Johannes, Artur, and Nick. All remaining mistakes are my own.
>
> Johannes deserves a lot of credit for driving previous iterations
> on this design. In particular, I want to note that we've basically
> returned to something Johannes first proposed several years ago,
> before we had specified the nofree attribute family.
>
>
> The Basic Problem
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2>
>
> We have a long standing semantic problem with the way we define
> dereferenceability facts which makes it difficult to express C++
> references, or more generally, dereferenceability on objects which
> may be freed at some point in the program. The current structure
> does lend itself well to memory which can't be freed. As discussed
> in detail a bit later, we want to seamlessly support both use cases.
>
> The basic statement of the problem is that a piece of memory
> marked with deref(N) is assumed to remain dereferenceable
> indefinitely. For an object which can be freed, marking it as
> deref can enable unsound transformations in cases like the following:
>
> o = deref(N) alloc();
> if (c) free(o)
> while(true) {
> if (c) break;
> // With the current semantics, we will hoist o.f above the loop
> v = o.f;
> }
>
> Despite this, Clang does emit the existing dereferenceable
> attribute in some problematic cases. We have observed miscompiles
> as a result, and optimizer has an assortment of hacks to try not
> to be too aggressive and miscompile too widely.
>
>
> Haven't we already solved this?
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3>
>
> This has been discussed relatively extensively in the past,
> included an accepted review (https://reviews.llvm.org/D61652
> <https://reviews.llvm.org/D61652>) which proposed splitting the
> dereferenceable attribute into two to adress this. However, this
> change never landed and recent findings reveal that we both need a
> broader solution, and have an interesting oppurtunity to take
> advantage of other recent work.
>
> The need for a broader solution comes from the observation that
> deref(N) is not the only attribute with this problem.
> deref_or_null(N) is a fairly obvious case we'd known about with
> the previous proposal, but it was recently realized that other
> allocation related facts have this problem as well. We now have
> specific examples with allocsize(N,M) - and the baked in variants
> in MemoryBuiltins - and suspect there are other attributes, either
> current or future, with the same challenge.
>
> The opportunity comes from the addition of "nofree" attribute. Up
> until recently, we really didn't have a good notion of "free"ing
> an allocation in the abstract machine model. We used to comingle
> this with our notion of capture. (i.e. We'd assume that functions
> which could free must also capture.) With the explicit notion of
> "nofree", we have an approach available to us we didn't before.
>
>
> The Proposal Itself
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4>
>
> The basic idea is that we're going to redefine the currently
> globally scoped attributes (deref, deref_or_null, and allocsize)
> such that they imply a point in time fact only and then combine
> that with nofree to recover the previous global semantics.
>
> More specifically:
>
> * A deref attribute on a function parameter will imply that the
> memory is dereferenceable for a specified number of bytes at
> the instant the function call occurs.
> * A deref attribute on a function return will imply that the
> memory is dereferenceable at the moment of return.
>
> We will then use the point in time fact combined with other
> information to drive inference of the global facts. While in
> principle we may loose optimization potential, we believe this is
> sufficient to infer the global facts in all practical cases we
> care about.
>
> Sample inference cases:
>
> * A deref(N) argument to a function with the nofree and nosync
> function attribute is known to be globally dereferenceable
> within the scope of the function call. We need the nosync to
> ensure that no other thread is freeing the memory on behalf of
> the callee in a coordinated manner.
> * An argument with the attributes deref(N), noalias, and nofree
> is known to be globally dereferenceable within the scope of
> the function call. This relies on the fact that free is
> modeled as writing to the memory freed, and thus noalias
> ensures there is no other argument which can be freed. (See
> discussion below.)
> * A memory allocation in a function with a garbage collector
> which guarantees collection occurs only at explicit safepoints
> and uses the gc.statepoint infrastructure, is known to be
> globally dereferenceable if there are no calls to
> gc.statepoint anywhere in the module. This effectively refines
> the abstract machine model used for garbage collection before
> lowering by RS4GC to disallow explicit deallocation (for
> collectors which opt in).
>
> The items above are described in terms of deref(N) for ease of
> description. The other attributes are handle analogously.
>
> *Explanation*
>
> The "deref(N), noalias, + nofree" argument case requires a bit of
> explanation as it involves a bunch of subtleties.
>
> First, the current wording of nofree argument attribute implies
> that the callee can not arrange for another thread to free the
> object on it's behalf. This is different than the specification of
> the nofree function attribute. There is no "nosync" equivalent for
> function attributes.
>
> Second, the noalias argument attribute is subtle. There's a couple
> of sub-cases worth discussing:
>
> * If the noalias argument is written to (reminder: free is
> modeled as a write), then it must be the only copy of the
> pointer passed to the function and there can be no copies
> passed through memory used in the scope of function.
> * If the noalias argument is only read from, then there may be
> other copies of the pointer. However, all of those copies must
> also be read only. If the object was freed through one of
> those other copies, then we must have at least one writeable
> copy and having the noalias on the read copy was undefined
> behavior to begin with.
>
> Essentially, what we're doing with noalias is using it to promote
> a fact about the pointer to a fact about the object being pointed
> to. Code structure wise, we should probably write it exactly that way.
>
> *Result*
>
> It's important to acknowledge that with this change, we will lose
> the ability to specify global dereferenceability of arguments and
> return values in the general case. We believe the current proposal
> allows us to recover that fact for all interesting cases, but if
> we've missed an important use case we may need to iterate a bit.
>
> We've discussed a few alternatives (below) which could be
> revisited if it turns out we are missing an important use case.
>
>
> Use Cases
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5>
>
> *C++ References* -- A C++ reference implies that the value pointed
> to is dereferenceable at point of declaration, and that the
> reference itself is non-null. Of particular note, an object
> pointed to through a reference can be freed without introducing UB.
>
> class A {int f; };
>
> void ugly_delete(A &a) {delete &a; }
> ugly_delete(*new A());
>
> void ugly_delete2(A &a, A *a2) {
> if (unknown)
> // a.f can be *proven* deref here as it's deref on entry,
> // and no free on path from entry to here.
> x = a.f;
> delete a2;
> }
> auto *a =new A();
> ugly_delete2(*a, a);
>
> A &foo() {...}
> A &a = foo();
> if (unknown)
> delete b;
> // If a and b point to the same object, a.f may not be deref here
> if (unknown2)
> a.f;
>
> *Garbage Collected Objects (Java)* -- LLVM supports two models of
> GCed objects, the abstract machine and the physical machine model.
> The later is essentially the same as that for c++ as deallocation
> points (at safepoints) are explicit. The former has objects
> conceptually live forever (i.e. reclaimation is handled outside
> the model).
>
> class A {int f; }
>
> void foo(A a) {
> ...
> // a.f is trivially deref anywhere in foo
> x= a.f;
> }
>
> A a= new A();
> ...
> // a.f is trivially deref following it's definition
> x= a.f;
>
> A foo();
> a= foo();
> ...
> // a.f is (still) trivially deref
> x= a.f;
>
> *Rust Borrows* -- A rust reference argument (e.g. "borrow") points
> to an object whose lifetime is guaranteed to be longer than the
> reference's defining scope. As such, the object is dereferenceable
> through the scope of the function. Today, rustc does emit a
> dereferenceable attribute using the current globally
> dereferenceable semantic.
>
> pub fn square(num:&i32) ->i32 {
> num* num
> }
> square(&5);
>
> // a could be noalias, but isn't today
> pub fn bar(a:&mut i32, b:&i32) {
> *a= a* b
> }
>
> bar(&mut 5,&2);
>
> // At first appearance, rust does not allow returning references.
> So return
> // attributes are not relevant. This seems like a major language
> hole, so this
> // should probably be checked with a language expert.
>
>
> Migration
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6>
>
> Existing bytecode will be upgraded to the weaker non-global
> semantics. This provides forward compatibility, but does lose
> optimization potential for previously compiled bytecode.
>
> C++ and GC'd language frontends don't change.
>
> Rustc should emit noalias where possible. In particular, 'a' in
> the case 'bar' above is currently not marked noalias and results
> in lost optimization potential as a result of this change.
> According to the rustc code, this is legal, but currently blocked
> on a noalias related miscompile. See
> https://github.com/rust-lang/rust/issues/54462
> <https://github.com/rust-lang/rust/issues/54462> and
> https://github.com/rust-lang/rust/issues/54878
> <https://github.com/rust-lang/rust/issues/54878> for further
> details. (My current belief is that all llvm side blockers have
> been resolved.)
>
> Frontends which want the global semantics should emit noalias,
> nofree, and nosync where appropriate. If this is not enough to
> recover optimizations in common cases, please explain why not.
> It's possible we've failed to account for something.
>
>
> Alternative Designs
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7>
>
> All of the alternate designs listed focus on recovering the full
> global deref semantics. Our hope is that any common case we've
> missed can be resolved with additional inference rules instead.
>
>
> Extend nofree to object semantics
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8>
>
> The nofree argument attribute current describes whether an object
> can freed through some particular copy of the pointer. We could
> strength the semantics to imply that the object is not freed
> through any copy of the pointer in the specified scope.
>
> Doing so greatly weakens our ability to infer the nofree property.
> The current nofree property when combined with capture tracking in
> the caller is enough to prove interest deref facts over calls. We
> don't want to loose the ability to infer that since it enables
> interesting transforms (such as code reordering over calls).
>
>
> Add a separate nofreeobj attribute
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9>
>
> Rather than change nofree, we could add a parallel attribute with
> the stronger object property. This - combined with deref(N) as a
> point in time fact - would be enough to recover the current
> globally deferenceable semantics.
>
> The downside of this alternative is a) possible overkill, and b)
> the "ugly" factor of having two similar but not quite identical
> attributes.
>
>
> Add an orthogonal attribute to promote pointer facts to
> object ones
> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10>
>
> To address the weakness of the former alternative, we could
> specify an attribute which strengthens arbitrary pointer facts to
> object facts. Examples of current pointer facts are attributes
> such as readonly, and writeonly.
>
> This has not been well explored; there's a huge possible design
> space here.
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> <https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
>
>
> --
>
> Juneyoung Lee
> Software Foundation Lab, Seoul National University
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210321/a009d0c9/attachment-0001.html>
More information about the llvm-dev
mailing list