[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

Wed Sep 29 11:28:21 PDT 2021

The consensus in the responses to my previous email was that we should 
go ahead and redefine dereferenceable to mean dereferenceable at the 
point-in-time the instruction executes (i.e. my option 2 below).  I 
finally got the patch which does this posted for review.  Interested 
readers should see https://reviews.llvm.org/D110745.

Philip

On 7/12/21 9:04 AM, Philip Reames wrote:
>
> At this point, I find myself needing to declare that the proposal 
> below is a failure, and ask the community what next steps we'd prefer.
>
> This effort stumbled into the fact that we don't seem to have any 
> actual agreement on what the semantics of various attributes are.  In 
> particular, the semantics of nofree don't appear to be in a usable 
> state, and my attempts at driving consensus have failed.  I am not 
> willing to continue investing effort in that direction.
>
> Given that, I see three options, and need input from the community as 
> to which we should choose.
>
> Option 1 - Back out the couple of changes which have landed, update 
> LangRef to be explicit about the scoped dereferenceability we had 
> historically, and consider this effort a failure.
>
> Option 2 - Change the semantics of the attributes to the point in time
> semantics *without* attempting any further inference of the scoped
> semantics.  At the current moment, the Java use case is covered (via
> the GC rule), no one seems to care about the lost optimization power
> for C/C++, and I am unclear on the practical impact (if any) on Rust.
>
> Option 3 - Introduce a new 'nofreeobj' attribute whose semantics would 
> be specifically that an object is not freed in the dynamic scope of 
> the function through any mechanism (including concurrency).  This 
> attribute would be basically uninferrable, and would exist only to 
> support language guarantees being encoded by frontends.
>
> My recommendation would be for option 2, then 3, then 1.  It's worth
> noting that we could also choose option 2, then implement option 3
> lazily if anyone reports a practical performance regression.
>
> Philip
>
> On 3/17/21 2:22 PM, Philip Reames via llvm-dev wrote:
>>
>> TLDR: We should change the existing dereferenceability related 
>> attributes to imply point in time facts only, and re-infer stronger 
>> global dereferenceability facts where needed.
>>
>>
>>     Meta
>>
>> If you prefer to read proposals in a browser, you can read this email 
>> here 
>> <https://github.com/preames/public-notes/blob/master/deref+nofree.rst>.
>>
>> This proposal greatly benefited from multiple rounds of feedback from 
>> Johannes, Artur, and Nick. All remaining mistakes are my own.
>>
>> Johannes deserves a lot of credit for driving previous iterations on 
>> this design. In particular, I want to note that we've basically 
>> returned to something Johannes first proposed several years ago, 
>> before we had specified the nofree attribute family.
>>
>>
>>     The Basic Problem
>>
>> We have a long standing semantic problem with the way we define 
>> dereferenceability facts which makes it difficult to express C++ 
>> references, or more generally, dereferenceability on objects which 
>> may be freed at some point in the program. The current structure does 
>> lend itself well to memory which can't be freed. As discussed in 
>> detail a bit later, we want to seamlessly support both use cases.
>>
>> The basic statement of the problem is that a piece of memory marked 
>> with deref(N) is assumed to remain dereferenceable indefinitely. For 
>> an object which can be freed, marking it as deref can enable unsound 
>> transformations in cases like the following:
>>
>> o = deref(N) alloc();
>> if (c) free(o);
>> while(true) {
>>    if (c) break;
>>    // With the current semantics, we will hoist o.f above the loop
>>    v = o.f;
>> }
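To make the hazard concrete, here is a small C++ model of the example above. It is only a sketch under stated assumptions: `Obj`, `Trace`, and `checked_load` are hypothetical stand-ins (not LLVM or Clang APIs), freeing is modeled by clearing a `live` flag so the program never touches real freed memory, and the loop is bounded so the model terminates.

```cpp
#include <cassert>

// Hypothetical stand-in for a heap object; `live` models whether it has
// been freed, so the sketch never touches real freed memory.
struct Obj {
    bool live = true;
    int f = 42;
};

// Records whether any load observed a freed object.
struct Trace {
    bool touched_freed = false;
};

int checked_load(const Obj &o, Trace &t) {
    if (!o.live)
        t.touched_freed = true;
    return o.f;
}

// Shape of the original program: the load stays behind `if (c) break`.
Trace run_original(bool c) {
    Obj o;
    Trace t;
    if (c)
        o.live = false; // models free(o)
    for (int i = 0; i < 3; ++i) { // bounded stand-in for while(true)
        if (c)
            break;
        int v = checked_load(o, t);
        (void)v;
    }
    return t;
}

// Shape after hoisting the load above the loop, which is only justified
// if deref(N) means "dereferenceable indefinitely".
Trace run_hoisted(bool c) {
    Obj o;
    Trace t;
    if (c)
        o.live = false; // models free(o)
    int v = checked_load(o, t); // hoisted load now runs unconditionally
    for (int i = 0; i < 3; ++i) {
        if (c)
            break;
        (void)v;
    }
    return t;
}

// Self-check of the three interesting cases.
void check_hoisting_model() {
    assert(!run_original(true).touched_freed); // original never loads after free
    assert(run_hoisted(true).touched_freed);   // hoisted version does
    assert(!run_hoisted(false).touched_freed); // no free, no problem
}
```

In this model the two versions only differ when `c` is true, which is exactly the path on which the object has been freed before the loop.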
>>
>> Despite this, Clang does emit the existing dereferenceable attribute
>> in some problematic cases. We have observed miscompiles as a result,
>> and the optimizer has an assortment of hacks to try not to be too
>> aggressive and miscompile too widely.
>>
>>
>>     Haven't we already solved this?
>>
>> This has been discussed relatively extensively in the past, including
>> an accepted review (https://reviews.llvm.org/D61652) which proposed
>> splitting the dereferenceable attribute into two to address this.
>> However, this change never landed, and recent findings reveal both that
>> we need a broader solution and that we have an interesting opportunity
>> to take advantage of other recent work.
>>
>> The need for a broader solution comes from the observation that 
>> deref(N) is not the only attribute with this problem. 
>> deref_or_null(N) is a fairly obvious case we'd known about with the 
>> previous proposal, but it was recently realized that other allocation 
>> related facts have this problem as well. We now have specific 
>> examples with allocsize(N,M) - and the baked in variants in 
>> MemoryBuiltins - and suspect there are other attributes, either 
>> current or future, with the same challenge.
>>
>> The opportunity comes from the addition of the "nofree" attribute. Up
>> until recently, we really didn't have a good notion of "free"ing an
>> allocation in the abstract machine model. We used to commingle this
>> with our notion of capture. (i.e. We'd assume that functions which
>> could free must also capture.) With the explicit notion of "nofree",
>> we have an approach available to us we didn't before.
>>
>>
>>     The Proposal Itself
>>
>> The basic idea is that we're going to redefine the currently globally 
>> scoped attributes (deref, deref_or_null, and allocsize) such that 
>> they imply a point in time fact only and then combine that with 
>> nofree to recover the previous global semantics.
>>
>> More specifically:
>>
>>   * A deref attribute on a function parameter will imply that the
>>     memory is dereferenceable for a specified number of bytes at the
>>     instant the function call occurs.
>>   * A deref attribute on a function return will imply that the memory
>>     is dereferenceable at the moment of return.
>>
>> We will then use the point in time fact combined with other
>> information to drive inference of the global facts. While in
>> principle we may lose optimization potential, we believe this is
>> sufficient to infer the global facts in all practical cases we care
>> about.
>>
>> Sample inference cases:
>>
>>   * A deref(N) argument to a function with the nofree and nosync
>>     function attributes is known to be globally dereferenceable within
>>     the scope of the function call. We need the nosync to ensure that
>>     no other thread is freeing the memory on behalf of the callee in
>>     a coordinated manner.
>>   * An argument with the attributes deref(N), noalias, and nofree is
>>     known to be globally dereferenceable within the scope of the
>>     function call. This relies on the fact that free is modeled as
>>     writing to the memory freed, and thus noalias ensures there is no
>>     other argument which can be freed. (See discussion below.)
>>   * A memory allocation in a function with a garbage collector which
>>     guarantees collection occurs only at explicit safepoints and uses
>>     the gc.statepoint infrastructure, is known to be globally
>>     dereferenceable if there are no calls to gc.statepoint anywhere
>>     in the module. This effectively refines the abstract machine
>>     model used for garbage collection before lowering by RS4GC to
>>     disallow explicit deallocation (for collectors which opt in).
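The sample inference cases above can be sketched as a single predicate. This is a toy model, not LLVM code: `ArgFacts`, `FnFacts`, and `derefForCallScope` are hypothetical names, the structs are simplified summaries of IR attributes, and the GC case is folded in loosely as one boolean even though the rule in the list is stated for allocations rather than arguments.

```cpp
#include <cassert>

// Hypothetical summary of the attributes on one pointer argument.
struct ArgFacts {
    bool deref;   // deref(N), read as a point in time fact at the call
    bool noalias; // noalias argument attribute
    bool nofree;  // nofree argument attribute
};

// Hypothetical summary of the attributes on the callee.
struct FnFacts {
    bool nofree; // nofree function attribute
    bool nosync; // nosync function attribute
    // Assumed flag: an opted-in GC strategy and no gc.statepoint calls
    // anywhere in the module.
    bool gc_without_statepoints;
};

// True if the pointer can be treated as dereferenceable for the whole
// dynamic scope of the call, under the rules listed above.
bool derefForCallScope(const ArgFacts &arg, const FnFacts &fn) {
    if (!arg.deref)
        return false; // nothing to promote
    if (fn.nofree && fn.nosync)
        return true; // callee neither frees nor coordinates a free
    if (arg.noalias && arg.nofree)
        return true; // pointer fact promoted to an object fact
    if (fn.gc_without_statepoints)
        return true; // abstract machine GC model: no explicit deallocation
    return false;
}

// Self-check covering each rule and two negative cases.
void check_inference_rules() {
    FnFacts plain{false, false, false};
    FnFacts nofreeOnly{true, false, false};
    FnFacts nofreeNosync{true, true, false};
    ArgFacts derefOnly{true, false, false};
    ArgFacts derefNoaliasNofree{true, true, true};
    ArgFacts noDeref{false, true, true};

    assert(!derefForCallScope(derefOnly, plain));
    assert(!derefForCallScope(derefOnly, nofreeOnly)); // nosync is required
    assert(derefForCallScope(derefOnly, nofreeNosync));
    assert(derefForCallScope(derefNoaliasNofree, plain));
    assert(!derefForCallScope(noDeref, nofreeNosync));
}
```

Note that the function-level rule needs both `nofree` and `nosync`, while the argument-level rule needs no function attributes at all.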
>>
>> The items above are described in terms of deref(N) for ease of
>> description. The other attributes are handled analogously.
>>
>> *Explanation*
>>
>> The "deref(N), noalias, + nofree" argument case requires a bit of 
>> explanation as it involves a bunch of subtleties.
>>
>> First, the current wording of the nofree argument attribute implies
>> that the callee cannot arrange for another thread to free the object
>> on its behalf. This is different from the specification of the nofree
>> function attribute. There is no "nosync" equivalent for function
>> attributes.
>>
>> Second, the noalias argument attribute is subtle. There are a couple
>> of sub-cases worth discussing:
>>
>>   * If the noalias argument is written to (reminder: free is modeled
>>     as a write), then it must be the only copy of the pointer passed
>>     to the function and there can be no copies passed through memory
>>     used in the scope of function.
>>   * If the noalias argument is only read from, then there may be
>>     other copies of the pointer. However, all of those copies must
>>     also be read only. If the object was freed through one of those
>>     other copies, then we must have at least one writeable copy and
>>     having the noalias on the read copy was undefined behavior to
>>     begin with.
>>
>> Essentially, what we're doing with noalias is using it to promote a 
>> fact about the pointer to a fact about the object being pointed to. 
>> Code structure wise, we should probably write it exactly that way.
>>
>> *Result*
>>
>> It's important to acknowledge that with this change, we will lose the 
>> ability to specify global dereferenceability of arguments and return 
>> values in the general case. We believe the current proposal allows us 
>> to recover that fact for all interesting cases, but if we've missed 
>> an important use case we may need to iterate a bit.
>>
>> We've discussed a few alternatives (below) which could be revisited 
>> if it turns out we are missing an important use case.
>>
>>
>>     Use Cases
>>
>> *C++ References* -- A C++ reference implies that the value pointed to 
>> is dereferenceable at point of declaration, and that the reference 
>> itself is non-null. Of particular note, an object pointed to through 
>> a reference can be freed without introducing UB.
>>
>> class A { int f; };
>>
>> void ugly_delete(A &a) { delete &a; }
>> ugly_delete(*new A());
>>
>> void ugly_delete2(A &a, A *a2) {
>>    if (unknown)
>>      // a.f can be *proven* deref here as it's deref on entry,
>>      // and no free on path from entry to here.
>>      x = a.f;
>>    delete a2;
>> }
>> auto *a = new A();
>> ugly_delete2(*a, a);
>>
>> A &foo() {...}
>> A &a = foo();
>> if (unknown)
>>    delete b;
>> // If a and b point to the same object, a.f may not be deref here
>> if (unknown2)
>>    a.f;
>>
>> *Garbage Collected Objects (Java)* -- LLVM supports two models of
>> GCed objects, the abstract machine and the physical machine model.
>> The latter is essentially the same as that for C++ as deallocation
>> points (at safepoints) are explicit. The former has objects
>> conceptually live forever (i.e. reclamation is handled outside the
>> model).
>>
>> class A { int f; }
>>
>> void foo(A a) {
>>    ...
>>    // a.f is trivially deref anywhere in foo
>>    x = a.f;
>> }
>>
>> A a = new A();
>> ...
>> // a.f is trivially deref following its definition
>> x = a.f;
>>
>> A foo();
>> a = foo();
>> ...
>> // a.f is (still) trivially deref
>> x = a.f;
>>
>> *Rust Borrows* -- A Rust reference argument (e.g. "borrow") points to
>> an object whose lifetime is guaranteed to be longer than the
>> reference's defining scope. As such, the object is dereferenceable
>> throughout the scope of the function. Today, rustc does emit a
>> dereferenceable attribute using the current globally dereferenceable
>> semantics.
>>
>> pub fn square(num: &i32) -> i32 {
>>    num * num
>> }
>> square(&5);
>>
>> // a could be noalias, but isn't today
>> pub fn bar(a: &mut i32, b: &i32) {
>>    *a = *a * b
>> }
>>
>> bar(&mut 5, &2);
>>
>> // At first appearance, Rust does not allow returning references. So
>> // return attributes are not relevant. This seems like a major language
>> // hole, so this should probably be checked with a language expert.
>>
>>
>>     Migration
>>
>> Existing bitcode will be upgraded to the weaker non-global
>> semantics. This provides forward compatibility, but does lose
>> optimization potential for previously compiled bitcode.
>>
>> C++ and GC'd language frontends don't change.
>>
>> Rustc should emit noalias where possible. In particular, 'a' in the 
>> case 'bar' above is currently not marked noalias and results in lost 
>> optimization potential as a result of this change. According to the 
>> rustc code, this is legal, but currently blocked on a noalias related 
>> miscompile. See https://github.com/rust-lang/rust/issues/54462 and
>> https://github.com/rust-lang/rust/issues/54878 for further details.
>> (My current belief is that all LLVM-side blockers have been resolved.)
>>
>> Frontends which want the global semantics should emit noalias, 
>> nofree, and nosync where appropriate. If this is not enough to 
>> recover optimizations in common cases, please explain why not. It's 
>> possible we've failed to account for something.
>>
>>
>>     Alternative Designs
>>
>> All of the alternate designs listed focus on recovering the full 
>> global deref semantics. Our hope is that any common case we've missed 
>> can be resolved with additional inference rules instead.
>>
>>
>>       Extend nofree to object semantics
>>
>> The nofree argument attribute currently describes whether an object
>> can be freed through some particular copy of the pointer. We could
>> strengthen the semantics to imply that the object is not freed through
>> any copy of the pointer in the specified scope.
>>
>> Doing so greatly weakens our ability to infer the nofree property.
>> The current nofree property when combined with capture tracking in
>> the caller is enough to prove interesting deref facts over calls. We
>> don't want to lose the ability to infer that since it enables
>> interesting transforms (such as code reordering over calls).
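One possible reading of "nofree combined with capture tracking in the caller" is sketched below. This is an assumption made for illustration, not a statement of what LLVM implements: `CallSite` and `derefAfterCall` are hypothetical names, and the model ignores complications such as the same pointer also being passed through another argument.

```cpp
#include <cassert>

// Hypothetical caller-side summary for one pointer passed to one call.
struct CallSite {
    bool derefBeforeCall;    // point in time deref fact just before the call
    bool argNoFree;          // callee parameter carries nofree
    bool capturedBeforeCall; // caller's capture tracking found an escape
};

// If no other copy of the pointer has escaped, the only party that could
// free the object during the call is the callee through this argument,
// and nofree rules that out. The deref fact then survives the call.
bool derefAfterCall(const CallSite &cs) {
    return cs.derefBeforeCall && cs.argNoFree && !cs.capturedBeforeCall;
}

// Self-check of the positive case and each way it can fail.
void check_deref_over_call() {
    CallSite ok{true, true, false};
    CallSite escaped{true, true, true};
    CallSite mayFree{true, false, false};
    CallSite unknownBefore{false, true, false};
    assert(derefAfterCall(ok));
    assert(!derefAfterCall(escaped));
    assert(!derefAfterCall(mayFree));
    assert(!derefAfterCall(unknownBefore));
}
```

Under the stronger object-level semantics, `argNoFree` would be much harder to infer, which is the cost this alternative is weighing.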
>>
>>
>>       Add a separate nofreeobj attribute
>>
>> Rather than change nofree, we could add a parallel attribute with the
>> stronger object property. This - combined with deref(N) as a point in
>> time fact - would be enough to recover the current globally
>> dereferenceable semantics.
>>
>> The downside of this alternative is a) possible overkill, and b) the 
>> "ugly" factor of having two similar but not quite identical attributes.
>>
>>
>>       Add an orthogonal attribute to promote pointer facts to object ones
>>
>> To address the weakness of the former alternative, we could specify 
>> an attribute which strengthens arbitrary pointer facts to object 
>> facts. Examples of current pointer facts are attributes such as 
>> readonly, and writeonly.
>>
>> This has not been well explored; there's a huge possible design space 
>> here.
>>
>>