<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 4/12/21 10:58 AM, Nuno Lopes wrote:<br>
</div>
<blockquote type="cite"
cite="mid:003901d72fc5$7631d8c0$62958a40$@sapo.pt">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="Generator" content="Microsoft Word 15 (filtered
medium)">
<style>@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}span.pre
{mso-style-name:pre;}span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}div.WordSection1
{page:WordSection1;}</style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
<div class="WordSection1">
<p class="MsoNormal">I like option 2. I agree that allowing
functions to allocate & deallocate memory is useful.<o:p></o:p></p>
<p class="MsoNormal">Option 1 is super hard to infer. Plus it
necessarily hits fewer cases, as the whole call-graph would
need to consist of nofree calls. Option 2 doesn<span
style="font-family:"Times New Roman",serif">’</span>t
have such requirement.<o:p></o:p></p>
</div>
</blockquote>
I think the consensus seems to be in favor of option #1.<br>
<blockquote type="cite"
cite="mid:003901d72fc5$7631d8c0$62958a40$@sapo.pt">
<div class="WordSection1">
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Nofree is most useful for callers to know
that the dereferenceability of any pointer they have is kept
across the call. In general, function attributes are there to
help callers. Otherwise they would be at most a cache for
analyses that you can do locally (ok, plus information that
frontends can give, but clang can<span
style="font-family:"Times New Roman",serif">’</span>t
give you nofree I guess).</p>
</div>
</blockquote>
It's that last clause that's the entire difference between the two
approaches. :)<br>
<blockquote type="cite"
cite="mid:003901d72fc5$7631d8c0$62958a40$@sapo.pt">
<div class="WordSection1">
<p class="MsoNormal"><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I quickly scanned the test cases affected
by your patch, and those seem to be <span
style="font-family:"Times New Roman",serif">“</span>easily<span
style="font-family:"Times New Roman",serif">”</span>
recoverable. Those functions don<span
style="font-family:"Times New Roman",serif">’</span>t
have any call to free nor are those pointers passed to other
non-nofree calls, so you can assume that any dereferenceable
argument remains so throughout the whole function. Requires a
bit more work, but doable.</p>
</div>
</blockquote>
<p>Two points in response to your last paragraph:<br>
</p>
<p>1) I think you're over analyzing deliberately reduced minimal
test case for the current logic and assuming from that the
original examples look exactly like the reduced cases.</p>
2) I do think the need for context sensitive logic is a major
increase in difficulty over context insensitive. Your point about
being able to special case a context insensitive analysis when all
callees are nofree is a good one, and probably worth implementing.<br>
<blockquote type="cite"
cite="mid:003901d72fc5$7631d8c0$62958a40$@sapo.pt">
<div class="WordSection1">
<p class="MsoNormal"><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Nuno<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b>From:</b> Philip Reames
<a class="moz-txt-link-rfc2396E" href="mailto:listmail@philipreames.com"><listmail@philipreames.com></a> <br>
<b>Sent:</b> 09 April 2021 20:05<br>
<b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>
<b>Cc:</b> Johannes Doerfert
<a class="moz-txt-link-rfc2396E" href="mailto:johannesdoerfert@gmail.com"><johannesdoerfert@gmail.com></a>; Artur Pilipenko
<a class="moz-txt-link-rfc2396E" href="mailto:llvmlistbot@llvm.org"><llvmlistbot@llvm.org></a>; Nuno Lopes
<a class="moz-txt-link-rfc2396E" href="mailto:nunoplopes@sapo.pt"><nunoplopes@sapo.pt></a><br>
<b>Subject:</b> Ambiguity in the nofree function attribute<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p>I've stumbled across a case related to the nofree attribute
where we seem to have inconsistent interpretations of the
attribute semantic in tree. I'd like some input from others
as to what the "right" semantic should be.<o:p></o:p></p>
<p>The basic question is does the presence of nofree prevent the
callee from allocating and freeing memory entirely within it's
dynamic scope? At first, it seems obvious that it does, but
that turns out to be a bit inconsistent with other attributes
and leads to some surprising results. <o:p></o:p></p>
<p>For reference in the following discussion, here is the
current wording for the nofree function attribute in LangRef:<o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p>"This function attribute indicates that the function does
not, directly or indirectly, call a memory-deallocation
function (free, for example). As a result, uncaptured
pointers that are known to be dereferenceable prior to a
call to a function with the <span class="pre"><span
style="font-size:10.0pt;font-family:"Courier
New"">nofree</span></span> attribute are still
known to be dereferenceable after the call (the capturing
condition is necessary in environments where the function
might communicate the pointer to another thread which then
deallocates the memory)."<o:p></o:p></p>
</blockquote>
<p>For discussion purposes, please assume the concurrency case
has been separately proven. That's not the point I'm getting
at here.<o:p></o:p></p>
<p>The two possible semantics as I see them are:<o:p></o:p></p>
<p><b>Option 1</b> - nofree implies no call to free, period<o:p></o:p></p>
<p>This is the one that to me seems most consistent with the
current wording, but it prevents the callee from allocating
storage and freeing it entirely within it's scope. This is,
for instance, a reasonable thing a target might want to do
when lowering large allocs. This requires transforms to be
careful in stripping the attribute, but isn't entirely
horrible.<o:p></o:p></p>
<p>The more surprising bit is that it means we can not infer
nofree from readonly or readnone. Why? Because both are
specified only in terms of memory effects visible to the
caller. As a result, a readnone function can allocate
storage, write to it, and still be readonly. Our current
inference rules for readnone and readonly do exploit this
flexibility.<o:p></o:p></p>
<p>The optimizer does currently assume that readonly implies
nofree. (See the accessor on Function) Removing this
substantially weakens our ability to infer nofree when faced
with a function declaration which hasn't been explicitly
annotated for nofree. We can get most of this back by adding
appropriate annotations to intrinsics, but not all. <o:p></o:p></p>
<p><b>Option 2</b> - nofree applies to memory visible to the
caller<o:p></o:p></p>
<p>In this case, we'd add wording to the nofree definition
analogous to that in the readonly/readnone specification.
(There's a subtlety about the precise definition of visible
here, but for the moment, let's hand wave in the same way we
do for the other attributes.)<o:p></o:p></p>
<p>This allows us to infer nofree from readonly, but essentially
cripples our ability to drive transformations within an
annotated function. We'd have to restrict all transforms and
inference to cases where we can prove that the object being
affected is visible to the caller. <o:p></o:p></p>
<p>The benefit is that this makes it slightly easier to infer
nofree in some cases. The main impact of this is improving
ability to reason about dereferenceability for uncaptured
objects over calls to functions for which we inferred nofree.
<o:p></o:p></p>
<p>The downside of this is that we essentially loose all ability
to reason about nofree in a context free manner. For a
specific example of the impact of this, it means we can't
infer dereferenceability for an object allocated in F, and
returned (e.g. not freed), in the scope of F.<o:p></o:p></p>
<p>This breaks hoisting and vectorization improvements (e.g.
unconditional loads instead of predicated ones) I've checked
in over the last few months, and makes the ongoing deref
redefinition work substantially harder. <a
href="https://reviews.llvm.org/D100141"
moz-do-not-send="true">https://reviews.llvm.org/D100141</a>
shows what this looks like code wise.<o:p></o:p></p>
<p><b>My Take</b><o:p></o:p></p>
<p>At first, I was strongly convinced that option 1 was the
right choice. So much so in fact that I nearly didn't bother
to post this question. However, after giving it more thought,
I've come to distrust my own response a bit. I definitely
have a conflict of interest here. Option 2 requires me to
effectively cripple several recent optimizer enhancements, and
maybe even revert some code which becomes effectively
useless. It also makes a project I'm currently working on
(deref redef) substantially harder. <o:p></o:p></p>
<p>On the other hand, the inconsistency with readonly and
readnone is surprising. I can see an argument for that being
the right overall approach long term.<o:p></o:p></p>
<p>So essentially, this email is me asking for a sanity check.
Do folks think option 1 is the right option? Or am I forcing
it to be the right option because it makes things easier for
me?<o:p></o:p></p>
<p>Philip<o:p></o:p></p>
</div>
</blockquote>
</body>
</html>