[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Wed Sep 29 11:28:21 PDT 2021
The consensus in the responses to my previous email was that we should
go ahead and redefine dereferenceable to mean dereferenceable at the
point-in-time the instruction executes (i.e. my option 2 below). I
finally got the patch which does this posted for review. Interested
readers should see https://reviews.llvm.org/D110745.
Philip
On 7/12/21 9:04 AM, Philip Reames wrote:
>
> At this point, I find myself needing to declare that the proposal
> below is a failure, and ask the community what next steps we'd prefer.
>
> This effort stumbled into the fact that we don't seem to have any
> actual agreement on what the semantics of various attributes are. In
> particular, the semantics of nofree don't appear to be in a usable
> state, and my attempts at driving consensus have failed. I am not
> willing to continue investing effort in that direction.
>
> Given that, I see three options, and need input from the community as
> to which we should chose.
>
> Option 1 - Back out the couple of changes which have landed, update
> LangRef to be explicit about the scoped dereferenceability we had
> historically, and consider this effort a failure.
>
> Option 2 - Change the semantic of the attributes to the point in time
> semantic *without* attempting any further inference of the scoped
> semantics. At the current moment, the Java use case is covered (via
> the GC rule), no one seems to care about the lost optimization power
> for C/C++, and I am unclear on the practical impact (if any) on rust.
>
> Option 3 - Introduce a new 'nofreeobj' attribute whose semantics would
> be specifically that an object is not freed in the dynamic scope of
> the function through any mechanism (including concurrency). This
> attribute would be basically uninferrable, and would exist only to
> support language guarantees being encoded by frontends.
>
> My recommendation would be for option 2, than 3, than 1. It's worth
> noting that we could also chose option 2, then implement option 3
> lazily if anyone reports a practical performance regression.
>
> Philip
>
> On 3/17/21 2:22 PM, Philip Reames via llvm-dev wrote:
>>
>> TLDR: We should change the existing dereferenceability related
>> attributes to imply point in time facts only, and re-infer stronger
>> global dereferenceability facts where needed.
>>
>>
>> Meta
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1>
>>
>> If you prefer to read proposals in a browser, you can read this email
>> here
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst>.
>>
>> This proposal greatly benefited from multiple rounds of feedback from
>> Johannes, Artur, and Nick. All remaining mistakes are my own.
>>
>> Johannes deserves a lot of credit for driving previous iterations on
>> this design. In particular, I want to note that we've basically
>> returned to something Johannes first proposed several years ago,
>> before we had specified the nofree attribute family.
>>
>>
>> The Basic Problem
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2>
>>
>> We have a long standing semantic problem with the way we define
>> dereferenceability facts which makes it difficult to express C++
>> references, or more generally, dereferenceability on objects which
>> may be freed at some point in the program. The current structure does
>> lend itself well to memory which can't be freed. As discussed in
>> detail a bit later, we want to seamlessly support both use cases.
>>
>> The basic statement of the problem is that a piece of memory marked
>> with deref(N) is assumed to remain dereferenceable indefinitely. For
>> an object which can be freed, marking it as deref can enable unsound
>> transformations in cases like the following:
>>
>> o = deref(N) alloc();
>> if (c) free(o)
>> while(true) {
>> if (c) break;
>> // With the current semantics, we will hoist o.f above the loop
>> v = o.f;
>> }
>>
>> Despite this, Clang does emit the existing dereferenceable attribute
>> in some problematic cases. We have observed miscompiles as a result,
>> and optimizer has an assortment of hacks to try not to be too
>> aggressive and miscompile too
>> widely.<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#havent-we-already-solved-this>
>>
>>
>> Haven't we already solved this?
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3>
>>
>> This has been discussed relatively extensively in the past, included
>> an accepted review (https://reviews.llvm.org/D61652
>> <https://reviews.llvm.org/D61652>) which proposed splitting the
>> dereferenceable attribute into two to adress this. However, this
>> change never landed and recent findings reveal that we both need a
>> broader solution, and have an interesting oppurtunity to take
>> advantage of other recent work.
>>
>> The need for a broader solution comes from the observation that
>> deref(N) is not the only attribute with this problem.
>> deref_or_null(N) is a fairly obvious case we'd known about with the
>> previous proposal, but it was recently realized that other allocation
>> related facts have this problem as well. We now have specific
>> examples with allocsize(N,M) - and the baked in variants in
>> MemoryBuiltins - and suspect there are other attributes, either
>> current or future, with the same challenge.
>>
>> The opportunity comes from the addition of "nofree" attribute. Up
>> until recently, we really didn't have a good notion of "free"ing an
>> allocation in the abstract machine model. We used to comingle this
>> with our notion of capture. (i.e. We'd assume that functions which
>> could free must also capture.) With the explicit notion of "nofree",
>> we have an approach available to us we didn't before.
>>
>>
>> The Proposal Itself
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4>
>>
>> The basic idea is that we're going to redefine the currently globally
>> scoped attributes (deref, deref_or_null, and allocsize) such that
>> they imply a point in time fact only and then combine that with
>> nofree to recover the previous global semantics.
>>
>> More specifically:
>>
>> * A deref attribute on a function parameter will imply that the
>> memory is dereferenceable for a specified number of bytes at the
>> instant the function call occurs.
>> * A deref attribute on a function return will imply that the memory
>> is dereferenceable at the moment of return.
>>
>> We will then use the point in time fact combined with other
>> information to drive inference of the global facts. While in
>> principle we may loose optimization potential, we believe this is
>> sufficient to infer the global facts in all practical cases we care
>> about.
>>
>> Sample inference cases:
>>
>> * A deref(N) argument to a function with the nofree and nosync
>> function attribute is known to be globally dereferenceable within
>> the scope of the function call. We need the nosync to ensure that
>> no other thread is freeing the memory on behalf of the callee in
>> a coordinated manner.
>> * An argument with the attributes deref(N), noalias, and nofree is
>> known to be globally dereferenceable within the scope of the
>> function call. This relies on the fact that free is modeled as
>> writing to the memory freed, and thus noalias ensures there is no
>> other argument which can be freed. (See discussion below.)
>> * A memory allocation in a function with a garbage collector which
>> guarantees collection occurs only at explicit safepoints and uses
>> the gc.statepoint infrastructure, is known to be globally
>> dereferenceable if there are no calls to gc.statepoint anywhere
>> in the module. This effectively refines the abstract machine
>> model used for garbage collection before lowering by RS4GC to
>> disallow explicit deallocation (for collectors which opt in).
>>
>> The items above are described in terms of deref(N) for ease of
>> description. The other attributes are handle analogously.
>>
>> *Explanation*
>>
>> The "deref(N), noalias, + nofree" argument case requires a bit of
>> explanation as it involves a bunch of subtleties.
>>
>> First, the current wording of nofree argument attribute implies that
>> the callee can not arrange for another thread to free the object on
>> it's behalf. This is different than the specification of the nofree
>> function attribute. There is no "nosync" equivalent for function
>> attributes.
>>
>> Second, the noalias argument attribute is subtle. There's a couple of
>> sub-cases worth discussing:
>>
>> * If the noalias argument is written to (reminder: free is modeled
>> as a write), then it must be the only copy of the pointer passed
>> to the function and there can be no copies passed through memory
>> used in the scope of function.
>> * If the noalias argument is only read from, then there may be
>> other copies of the pointer. However, all of those copies must
>> also be read only. If the object was freed through one of those
>> other copies, then we must have at least one writeable copy and
>> having the noalias on the read copy was undefined behavior to
>> begin with.
>>
>> Essentially, what we're doing with noalias is using it to promote a
>> fact about the pointer to a fact about the object being pointed to.
>> Code structure wise, we should probably write it exactly that way.
>>
>> *Result*
>>
>> It's important to acknowledge that with this change, we will lose the
>> ability to specify global dereferenceability of arguments and return
>> values in the general case. We believe the current proposal allows us
>> to recover that fact for all interesting cases, but if we've missed
>> an important use case we may need to iterate a bit.
>>
>> We've discussed a few alternatives (below) which could be revisited
>> if it turns out we are missing an important use case.
>>
>>
>> Use Cases
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5>
>>
>> *C++ References* -- A C++ reference implies that the value pointed to
>> is dereferenceable at point of declaration, and that the reference
>> itself is non-null. Of particular note, an object pointed to through
>> a reference can be freed without introducing UB.
>>
>> class A {int f; };
>>
>> void ugly_delete(A &a) {delete &a; }
>> ugly_delete(*new A());
>>
>> void ugly_delete2(A &a, A *a2) {
>> if (unknown)
>> // a.f can be *proven* deref here as it's deref on entry,
>> // and no free on path from entry to here.
>> x = a.f;
>> delete a2;
>> }
>> auto *a =new A();
>> ugly_delete2(*a, a);
>>
>> A &foo() {...}
>> A &a = foo();
>> if (unknown)
>> delete b;
>> // If a and b point to the same object, a.f may not be deref here
>> if (unknown2)
>> a.f;
>>
>> *Garbage Collected Objects (Java)* -- LLVM supports two models of
>> GCed objects, the abstract machine and the physical machine model.
>> The later is essentially the same as that for c++ as deallocation
>> points (at safepoints) are explicit. The former has objects
>> conceptually live forever (i.e. reclaimation is handled outside the
>> model).
>>
>> class A {int f; }
>>
>> void foo(A a) {
>> ...
>> // a.f is trivially deref anywhere in foo
>> x= a.f;
>> }
>>
>> A a= new A();
>> ...
>> // a.f is trivially deref following it's definition
>> x= a.f;
>>
>> A foo();
>> a= foo();
>> ...
>> // a.f is (still) trivially deref
>> x= a.f;
>>
>> *Rust Borrows* -- A rust reference argument (e.g. "borrow") points to
>> an object whose lifetime is guaranteed to be longer than the
>> reference's defining scope. As such, the object is dereferenceable
>> through the scope of the function. Today, rustc does emit a
>> dereferenceable attribute using the current globally dereferenceable
>> semantic.
>>
>> pub fn square(num:&i32) ->i32 {
>> num* num
>> }
>> square(&5);
>>
>> // a could be noalias, but isn't today
>> pub fn bar(a:&mut i32, b:&i32) {
>> *a= a* b
>> }
>>
>> bar(&mut 5,&2);
>>
>> // At first appearance, rust does not allow returning references. So
>> return
>> // attributes are not relevant. This seems like a major language
>> hole, so this
>> // should probably be checked with a language expert.
>>
>>
>> Migration
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6>
>>
>> Existing bytecode will be upgraded to the weaker non-global
>> semantics. This provides forward compatibility, but does lose
>> optimization potential for previously compiled bytecode.
>>
>> C++ and GC'd language frontends don't change.
>>
>> Rustc should emit noalias where possible. In particular, 'a' in the
>> case 'bar' above is currently not marked noalias and results in lost
>> optimization potential as a result of this change. According to the
>> rustc code, this is legal, but currently blocked on a noalias related
>> miscompile. See https://github.com/rust-lang/rust/issues/54462
>> <https://github.com/rust-lang/rust/issues/54462> and
>> https://github.com/rust-lang/rust/issues/54878
>> <https://github.com/rust-lang/rust/issues/54878> for further details.
>> (My current belief is that all llvm side blockers have been resolved.)
>>
>> Frontends which want the global semantics should emit noalias,
>> nofree, and nosync where appropriate. If this is not enough to
>> recover optimizations in common cases, please explain why not. It's
>> possible we've failed to account for something.
>>
>>
>> Alternative Designs
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7>
>>
>> All of the alternate designs listed focus on recovering the full
>> global deref semantics. Our hope is that any common case we've missed
>> can be resolved with additional inference rules instead.
>>
>>
>> Extend nofree to object semantics
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8>
>>
>> The nofree argument attribute current describes whether an object can
>> freed through some particular copy of the pointer. We could strength
>> the semantics to imply that the object is not freed through any copy
>> of the pointer in the specified scope.
>>
>> Doing so greatly weakens our ability to infer the nofree property.
>> The current nofree property when combined with capture tracking in
>> the caller is enough to prove interest deref facts over calls. We
>> don't want to loose the ability to infer that since it enables
>> interesting transforms (such as code reordering over calls).
>>
>>
>> Add a separate nofreeobj attribute
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9>
>>
>> Rather than change nofree, we could add a parallel attribute with the
>> stronger object property. This - combined with deref(N) as a point in
>> time fact - would be enough to recover the current globally
>> deferenceable semantics.
>>
>> The downside of this alternative is a) possible overkill, and b) the
>> "ugly" factor of having two similar but not quite identical attributes.
>>
>>
>> Add an orthogonal attribute to promote pointer facts to object
>> ones
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10>
>>
>> To address the weakness of the former alternative, we could specify
>> an attribute which strengthens arbitrary pointer facts to object
>> facts. Examples of current pointer facts are attributes such as
>> readonly, and writeonly.
>>
>> This has not been well explored; there's a huge possible design space
>> here.
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210929/fb3b2f5f/attachment-0001.html>
More information about the llvm-dev
mailing list