<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Replying to self to summarize takeaways.</p>
<p>The consensus in responses seems to be strongly in favor of
option #2. My main hesitation with option #2 has always been the
lost ability for a frontend to provide the stronger scoped fact,
but in the process of writing a detailed response to Johannes
downthread, I realized we don't have a motivating example frontend
which actually benefits from said ability. Given that, it seems
like option #2 wins over option #1. <br>
</p>
<p>It seems stopping to ask for a sanity check was well warranted in
this case. :)</p>
<p>I will go ahead and cleanup <a class="moz-txt-link-freetext"
href="https://reviews.llvm.org/D100141">https://reviews.llvm.org/D100141</a>
into a real patch. I will give it a couple of days (concurrent
with review) to give anyone who might disagree with this decision
a chance to see this thread and chime in.</p>
<p>Philip<br>
</p>
<div class="moz-cite-prefix">On 4/9/21 12:05 PM, Philip Reames
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:4ea19e75-8655-a8ce-285b-5e30106c39b9@philipreames.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<p>I've stumbled across a case related to the nofree attribute
where we seem to have inconsistent interpretations of the
attribute semantic in tree. I'd like some input from others as
to what the "right" semantic should be.</p>
<p>The basic question is does the presence of nofree prevent the
callee from allocating and freeing memory entirely within it's
dynamic scope? At first, it seems obvious that it does, but
that turns out to be a bit inconsistent with other attributes
and leads to some surprising results. <br>
</p>
<p>For reference in the following discussion, here is the current
wording for the nofree function attribute in LangRef:</p>
<blockquote>
<p>"This function attribute indicates that the function does
not, directly or indirectly, call a memory-deallocation
function (free, for example). As a result, uncaptured pointers
that are known to be dereferenceable prior to a call to a
function with the <code class="docutils literal notranslate"><span
class="pre">nofree</span></code> attribute are still known
to be dereferenceable after the call (the capturing condition
is necessary in environments where the function might
communicate the pointer to another thread which then
deallocates the memory)."</p>
</blockquote>
<p>For discussion purposes, please assume the concurrency case has
been separately proven. That's not the point I'm getting at
here.<br>
</p>
<p>The two possible semantics as I see them are:</p>
<p><b>Option 1</b> - nofree implies no call to free, period</p>
<p>This is the one that to me seems most consistent with the
current wording, but it prevents the callee from allocating
storage and freeing it entirely within it's scope. This is, for
instance, a reasonable thing a target might want to do when
lowering large allocs. This requires transforms to be careful
in stripping the attribute, but isn't entirely horrible.<br>
</p>
<p>The more surprising bit is that it means we can not infer
nofree from readonly or readnone. Why? Because both are
specified only in terms of memory effects visible to the
caller. As a result, a readnone function can allocate storage,
write to it, and still be readonly. Our current inference rules
for readnone and readonly do exploit this flexibility.</p>
<p>The optimizer does currently assume that readonly implies
nofree. (See the accessor on Function) Removing this
substantially weakens our ability to infer nofree when faced
with a function declaration which hasn't been explicitly
annotated for nofree. We can get most of this back by adding
appropriate annotations to intrinsics, but not all. <br>
</p>
<p><b>Option 2</b> - nofree applies to memory visible to the
caller</p>
<p>In this case, we'd add wording to the nofree definition
analogous to that in the readonly/readnone specification.
(There's a subtlety about the precise definition of visible
here, but for the moment, let's hand wave in the same way we do
for the other attributes.)</p>
<p>This allows us to infer nofree from readonly, but essentially
cripples our ability to drive transformations within an
annotated function. We'd have to restrict all transforms and
inference to cases where we can prove that the object being
affected is visible to the caller. <br>
</p>
<p>The benefit is that this makes it slightly easier to infer
nofree in some cases. The main impact of this is improving
ability to reason about dereferenceability for uncaptured
objects over calls to functions for which we inferred nofree. <br>
</p>
<p>The downside of this is that we essentially loose all ability
to reason about nofree in a context free manner. For a specific
example of the impact of this, it means we can't infer
dereferenceability for an object allocated in F, and returned
(e.g. not freed), in the scope of F.</p>
<p>This breaks hoisting and vectorization improvements (e.g.
unconditional loads instead of predicated ones) I've checked in
over the last few months, and makes the ongoing deref
redefinition work substantially harder. <a
class="moz-txt-link-freetext"
href="https://reviews.llvm.org/D100141" moz-do-not-send="true">https://reviews.llvm.org/D100141</a>
shows what this looks like code wise.<br>
</p>
<p><b>My Take</b><br>
</p>
<p>At first, I was strongly convinced that option 1 was the right
choice. So much so in fact that I nearly didn't bother to post
this question. However, after giving it more thought, I've come
to distrust my own response a bit. I definitely have a conflict
of interest here. Option 2 requires me to effectively cripple
several recent optimizer enhancements, and maybe even revert
some code which becomes effectively useless. It also makes a
project I'm currently working on (deref redef) substantially
harder. <br>
</p>
<p>On the other hand, the inconsistency with readonly and readnone
is surprising. I can see an argument for that being the right
overall approach long term.</p>
<p>So essentially, this email is me asking for a sanity check. Do
folks think option 1 is the right option? Or am I forcing it to
be the right option because it makes things easier for me?</p>
<p>Philip<br>
</p>
</blockquote>
</body>
</html>