<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 3/20/21 7:26 PM, Juneyoung Lee
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAGwnbJRKs7Sjb-9X4HEpDjWZ=nkVaGsvD8DoTKAXaGYWYew8yg@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">+1<br>
<div><br>
</div>
<div><br>
</div>
<div>I have one minor question: according to <a
href="https://llvm.org/docs/Atomics.html#optimization-outside-atomic"
moz-do-not-send="true">https://llvm.org/docs/Atomics.html#optimization-outside-atomic</a>
, introducing a store is problematic even if the pointer is
known to be dereferenceable. Does this also apply to
dereferenceable+nosync+nofree?</div>
</div>
</blockquote>
<p>This proposal is *only* discussing dereferenceability. Your
question involves concurrency and the memory model, neither of which
is changing at all. So, no, there's no change to when it's safe to
insert a store to a potentially shared location. <br>
</p>
<blockquote type="cite"
cite="mid:CAGwnbJRKs7Sjb-9X4HEpDjWZ=nkVaGsvD8DoTKAXaGYWYew8yg@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, Mar 18, 2021 at 6:22
AM Philip Reames via llvm-dev <<a
href="mailto:llvm-dev@lists.llvm.org" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>TLDR: We should change the existing dereferenceability
related attributes to imply point in time facts only, and
re-infer stronger global dereferenceability facts where
needed.</p>
<h2><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1"
target="_blank" moz-do-not-send="true">Meta</a></h2>
<p>If you prefer to read proposals in a browser, you can
read this email <a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst"
target="_blank" moz-do-not-send="true">here</a>.</p>
<p>This proposal greatly benefited from multiple rounds of
feedback from Johannes, Artur, and Nick. All remaining
mistakes are my own.</p>
<p>Johannes deserves a lot of credit for driving previous
iterations on this design. In particular, I want to note
that we've basically returned to something Johannes first
proposed several years ago, before we had specified the
nofree attribute family.</p>
<h2><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2"
target="_blank" moz-do-not-send="true">The Basic Problem</a></h2>
<p>We have a long standing semantic problem with the way we
define dereferenceability facts which makes it difficult
to express C++ references, or more generally,
dereferenceability on objects which may be freed at some
point in the program. The current structure does lend
itself well to memory which can't be freed. As discussed
in detail a bit later, we want to seamlessly support both
use cases.</p>
<p>The basic statement of the problem is that a piece of
memory marked with deref(N) is assumed to remain
dereferenceable indefinitely. For an object which can be
freed, marking it as deref can enable unsound
transformations in cases like the following:</p>
<pre>o = deref(N) alloc();
if (c) free(o);
while(true) {
if (c) break;
// With the current semantics, we will hoist o.f above the loop
v = o.f;
}
</pre>
<p>Despite this, Clang does emit the existing
dereferenceable attribute in some problematic cases. We
have observed miscompiles as a result, and the optimizer has
an assortment of hacks to try not to be too aggressive and
miscompile too widely.</p>
<h2><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3"
target="_blank" moz-do-not-send="true">Haven't we
already solved this?</a></h2>
<p>This has been discussed relatively extensively in the
past, including an accepted review (<a
href="https://reviews.llvm.org/D61652" rel="nofollow"
target="_blank" moz-do-not-send="true">https://reviews.llvm.org/D61652</a>)
which proposed splitting the dereferenceable attribute
into two to address this. However, this change never landed,
and recent findings reveal both that we need a broader
solution and that we have an interesting opportunity to take
advantage of other recent work.</p>
<p>The need for a broader solution comes from the
observation that deref(N) is not the only attribute with
this problem. deref_or_null(N) is a fairly obvious case
we'd known about with the previous proposal, but it was
recently realized that other allocation related facts have
this problem as well. We now have specific examples with
allocsize(N,M) - and the baked in variants in
MemoryBuiltins - and suspect there are other attributes,
either current or future, with the same challenge.</p>
<p>The opportunity comes from the addition of the "nofree"
attribute. Up until recently, we really didn't have a good
notion of "free"ing an allocation in the abstract machine
model. We used to commingle this with our notion of
capture (i.e. we'd assume that functions which could free
must also capture). With the explicit notion of "nofree",
we have an approach available to us that we didn't before.</p>
<h2><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4"
target="_blank" moz-do-not-send="true">The Proposal
Itself</a></h2>
<p>The basic idea is that we're going to redefine the
currently globally scoped attributes (deref,
deref_or_null, and allocsize) such that they imply a point
in time fact only and then combine that with nofree to
recover the previous global semantics.</p>
<p>More specifically:</p>
<ul>
<li>A deref attribute on a function parameter will imply
that the memory is dereferenceable for a specified
number of bytes at the instant the function call occurs.</li>
<li>A deref attribute on a function return will imply that
the memory is dereferenceable at the moment of return.</li>
</ul>
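<p>One way to picture the point in time reading is the rough
sketch below. It is not LLVM code: the Event type and the
loadsCoveredByEntryFact helper are hypothetical names made up for
illustration. It walks a single straight-line path through a callee
and counts the loads that the entry-time fact alone covers, i.e.
those reached before anything that may free the object.</p>

```cpp
// Hypothetical event kinds on one straight-line path through a callee.
enum class Event { Load, MayFree, Other };

// Counts the loads that the entry-time deref fact alone covers: every
// load reached before the first event that may free the object.
int loadsCoveredByEntryFact(const Event *Trace, int Count) {
  int Covered = 0;
  for (int I = 0; I != Count; ++I) {
    if (Trace[I] == Event::MayFree)
      break; // the point in time fact may no longer hold past here
    if (Trace[I] == Event::Load)
      ++Covered;
  }
  return Covered;
}
```

<p>Loads after the first possible free are not necessarily invalid;
they simply need some additional fact before they can be treated as
dereferenceable.</p>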
<p>We will then use the point in time fact combined with
other information to drive inference of the global facts.
While in principle we may lose optimization potential, we
believe this is sufficient to infer the global facts in
all practical cases we care about.</p>
<p>Sample inference cases:</p>
<ul>
<li>A deref(N) argument to a function with the nofree and
nosync function attribute is known to be globally
dereferenceable within the scope of the function call.
We need the nosync to ensure that no other thread is
freeing the memory on behalf of the callee in a
coordinated manner.</li>
<li>An argument with the attributes deref(N), noalias, and
nofree is known to be globally dereferenceable within
the scope of the function call. This relies on the fact
that free is modeled as writing to the memory freed, and
thus noalias ensures there is no other argument which
can be freed. (See discussion below.)</li>
<li>A memory allocation in a function with a garbage
collector which guarantees collection occurs only at
explicit safepoints and uses the gc.statepoint
infrastructure, is known to be globally dereferenceable
if there are no calls to gc.statepoint anywhere in the
module. This effectively refines the abstract machine
model used for garbage collection before lowering by
RS4GC to disallow explicit deallocation (for collectors
which opt in).</li>
</ul>
<p>The items above are described in terms of deref(N) for
ease of description. The other attributes are handled
analogously.</p>
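<p>The first two inference cases can be summarized as a small
predicate. This is only a sketch under stated assumptions: ArgFacts
and derefThroughoutCall are hypothetical names, not part of LLVM's
API, and the garbage collection case is left out because it depends
on module-wide information.</p>

```cpp
// Hypothetical summary of the attributes visible at a call site.
// These fields are illustrative only and do not mirror LLVM's classes.
struct ArgFacts {
  bool DerefAtEntry; // deref(N) on the argument, as a point in time fact
  bool NoAliasArg;   // noalias on the argument
  bool NoFreeArg;    // nofree on the argument
  bool NoFreeFn;     // nofree on the callee
  bool NoSyncFn;     // nosync on the callee
};

// Returns true when the argument can be treated as dereferenceable for
// the whole scope of the call under the first two sample rules.
bool derefThroughoutCall(ArgFacts F) {
  if (!F.DerefAtEntry)
    return false;
  // Case 1: the callee neither frees nor synchronizes with a thread
  // that could free on its behalf.
  bool CalleeCannotFree = F.NoFreeFn && F.NoSyncFn;
  // Case 2: noalias promotes the per-pointer nofree fact to the object.
  bool ObjectCannotBeFreed = F.NoAliasArg && F.NoFreeArg;
  return CalleeCannotFree || ObjectCannotBeFreed;
}
```

<p>For example, an argument with only deref(N) and a nofree callee
(no nosync, no noalias) does not qualify under either case, so only
the point in time fact is available for it.</p>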
<p><strong>Explanation</strong></p>
<p>The "deref(N), noalias, + nofree" argument case requires
a bit of explanation as it involves a bunch of subtleties.</p>
<p>First, the current wording of the nofree argument attribute
implies that the callee cannot arrange for another thread
to free the object on its behalf. This is different from
the specification of the nofree function attribute. There
is no "nosync" equivalent for function attributes.</p>
<p>Second, the noalias argument attribute is subtle. There
are a couple of sub-cases worth discussing:</p>
<ul>
<li>If the noalias argument is written to (reminder: free
is modeled as a write), then it must be the only copy of
the pointer passed to the function and there can be no
copies passed through memory used in the scope of the
function.</li>
<li>If the noalias argument is only read from, then there
may be other copies of the pointer. However, all of
those copies must also be read only. If the object was
freed through one of those other copies, then we must
have at least one writeable copy and having the noalias
on the read copy was undefined behavior to begin with.</li>
</ul>
<p>Essentially, what we're doing with noalias is using it to
promote a fact about the pointer to a fact about the
object being pointed to. Code structure wise, we should
probably write it exactly that way.</p>
<p><strong>Result</strong></p>
<p>It's important to acknowledge that with this change, we
will lose the ability to specify global dereferenceability
of arguments and return values in the general case. We
believe the current proposal allows us to recover that
fact for all interesting cases, but if we've missed an
important use case we may need to iterate a bit.</p>
<p>We've discussed a few alternatives (below) which could be
revisited if it turns out we are missing an important use
case.</p>
<h2><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5"
target="_blank" moz-do-not-send="true">Use Cases</a></h2>
<p><strong>C++ References</strong> -- A C++ reference
implies that the value pointed to is dereferenceable at
point of declaration, and that the reference itself is
non-null. Of particular note, an object pointed to through
a reference can be freed without introducing UB.</p>
<div>
<pre><span>class</span> <span>A</span> { <span>int</span> f; };
<span>void</span> <span>ugly_delete</span>(A &a) { <span>delete</span> &a; }
<span>ugly_delete</span>(*<span>new</span> A());
<span>void</span> <span>ugly_delete2</span>(A &a, A *a2) {
<span>if</span> (unknown)
<span><span>//</span> a.f can be *proven* deref here as it's deref on entry,</span>
<span><span>//</span> and no free on path from entry to here.</span>
x = a.<span>f</span>;
<span>delete</span> a2;
}
<span>auto</span> *a = <span>new</span> A();
<span>ugly_delete2</span>(*a, a);
A &<span>foo</span>() {...}
A &a = foo();
<span>if</span> (unknown)
<span>delete</span> b;
<span><span>//</span> If a and b point to the same object, a.f may not be deref here</span>
<span>if</span> (unknown2)
a.f;</pre>
</div>
<p><strong>Garbage Collected Objects (Java)</strong> -- LLVM
supports two models of GCed objects, the abstract machine
and the physical machine model. The latter is essentially
the same as that for C++ as deallocation points (at
safepoints) are explicit. The former has objects
conceptually live forever (i.e. reclamation is handled
outside the model).</p>
<div>
<pre><span>class</span> <span>A</span> { <span>int</span> f; }
<span>void</span> foo(<span>A</span> a) {
<span>...</span>
<span><span>//</span> a.f is trivially deref anywhere in foo</span>
x <span>=</span> a<span>.</span>f;
}
<span>A</span> a <span>=</span> <span>new</span> <span>A</span>();
<span>...</span>
<span><span>//</span> a.f is trivially deref following its definition</span>
x <span>=</span> a<span>.</span>f;
<span>A</span> foo();
a <span>=</span> foo();
<span>...</span>
<span><span>//</span> a.f is (still) trivially deref</span>
x <span>=</span> a<span>.</span>f;</pre>
</div>
<p><strong>Rust Borrows</strong> -- A Rust reference
argument (e.g. "borrow") points to an object whose
lifetime is guaranteed to be longer than the reference's
defining scope. As such, the object is dereferenceable
through the scope of the function. Today, rustc does emit
a dereferenceable attribute using the current globally
dereferenceable semantic.</p>
<div>
<pre><span>pub</span> <span>fn</span> <span>square</span>(num: <span>&</span><span>i32</span>) -> <span>i32</span> {
num <span>*</span> num
}
<span>square</span>(<span>&</span><span>5</span>);
<span>// a could be noalias, but isn't today</span>
<span>pub</span> <span>fn</span> <span>bar</span>(a: <span>&</span><span>mut</span> <span>i32</span>, b: <span>&</span><span>i32</span>) {
<span>*</span>a <span>=</span> a <span>*</span> b
}
<span>bar</span>(<span>&</span><span>mut</span> <span>5</span>, <span>&</span><span>2</span>);
<span>// At first appearance, rust does not allow returning references. So return</span>
<span>// attributes are not relevant. This seems like a major language hole, so this</span>
<span>// should probably be checked with a language expert.</span></pre>
</div>
<h2><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6"
target="_blank" moz-do-not-send="true">Migration</a></h2>
<p>Existing bitcode will be upgraded to the weaker
non-global semantics. This provides forward compatibility,
but does lose optimization potential for previously
compiled bitcode.</p>
<p>C++ and GC'd language frontends don't change.</p>
<p>Rustc should emit noalias where possible. In particular,
'a' in the case 'bar' above is currently not marked
noalias and results in lost optimization potential as a
result of this change. According to the rustc code, this
is legal, but currently blocked on a noalias related
miscompile. See <a
href="https://github.com/rust-lang/rust/issues/54462"
target="_blank" moz-do-not-send="true">https://github.com/rust-lang/rust/issues/54462</a>
and <a
href="https://github.com/rust-lang/rust/issues/54878"
target="_blank" moz-do-not-send="true">https://github.com/rust-lang/rust/issues/54878</a>
for further details. (My current belief is that all llvm
side blockers have been resolved.)</p>
<p>Frontends which want the global semantics should emit
noalias, nofree, and nosync where appropriate. If this is
not enough to recover optimizations in common cases,
please explain why not. It's possible we've failed to
account for something.</p>
<h2><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7"
target="_blank" moz-do-not-send="true">Alternative
Designs</a></h2>
<p>All of the alternate designs listed focus on recovering
the full global deref semantics. Our hope is that any
common case we've missed can be resolved with additional
inference rules instead.</p>
<h3><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8"
target="_blank" moz-do-not-send="true">Extend nofree to
object semantics</a></h3>
<p>The nofree argument attribute currently describes whether
an object can be freed through some particular copy of the
pointer. We could strengthen the semantics to imply that the
object is not freed through any copy of the pointer in the
specified scope.</p>
<p>Doing so greatly weakens our ability to infer the nofree
property. The current nofree property when combined with
capture tracking in the caller is enough to prove interesting
deref facts over calls. We don't want to lose the ability
to infer that since it enables interesting transforms
(such as code reordering over calls).</p>
<h3><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9"
target="_blank" moz-do-not-send="true">Add a separate
nofreeobj attribute</a></h3>
<p>Rather than change nofree, we could add a parallel
attribute with the stronger object property. This -
combined with deref(N) as a point in time fact - would be
enough to recover the current globally dereferenceable
semantics.</p>
<p>The downsides of this alternative are a) possible overkill,
and b) the "ugly" factor of having two similar but not
quite identical attributes.</p>
<h3><a
href="https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10"
target="_blank" moz-do-not-send="true">Add an orthogonal
attribute to promote pointer facts to object ones</a></h3>
<p>To address the weakness of the former alternative, we
could specify an attribute which strengthens arbitrary
pointer facts to object facts. Examples of current pointer
facts are attributes such as readonly, and writeonly.</p>
<p>This has not been well explored; there's a huge possible
design space here.</p>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div><br>
</div>
<font size="1">Juneyoung Lee</font>
<div><font size="1">Software Foundation Lab, Seoul National
University</font></div>
</div>
</div>
</blockquote>
</body>
</html>