<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
span.pre
{mso-style-name:pre;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple style='word-wrap:break-word'><div class=WordSection1><p class=MsoNormal>I like option 2. I agree that allowing functions to allocate and deallocate memory is useful.<o:p></o:p></p><p class=MsoNormal>Option 1 is super hard to infer. Plus it necessarily hits fewer cases, as the whole call graph would need to consist of nofree calls. Option 2 doesn<span style='font-family:"Times New Roman",serif'>’</span>t have such a requirement.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Nofree is most useful for letting callers know that the dereferenceability of any pointer they hold is preserved across the call. In general, function attributes are there to help callers. Otherwise they would be at most a cache for analyses that you can do locally (OK, plus information that frontends can provide, but clang can<span style='font-family:"Times New Roman",serif'>’</span>t give you nofree, I guess).<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I quickly scanned the test cases affected by your patch, and those seem to be <span style='font-family:"Times New Roman",serif'>“</span>easily<span style='font-family:"Times New Roman",serif'>”</span> recoverable. Those functions don<span style='font-family:"Times New Roman",serif'>’</span>t contain any call to free, nor are those pointers passed to other non-nofree calls, so you can assume that any dereferenceable argument remains so throughout the whole function. 
This requires a bit more work, but it is doable.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Nuno<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b>From:</b> Philip Reames &lt;listmail@philipreames.com&gt; <br><b>Sent:</b> 09 April 2021 20:05<br><b>To:</b> llvm-dev@lists.llvm.org<br><b>Cc:</b> Johannes Doerfert &lt;johannesdoerfert@gmail.com&gt;; Artur Pilipenko &lt;llvmlistbot@llvm.org&gt;; Nuno Lopes &lt;nunoplopes@sapo.pt&gt;<br><b>Subject:</b> Ambiguity in the nofree function attribute<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><p>I've stumbled across a case related to the nofree attribute where we seem to have inconsistent interpretations of the attribute's semantics in tree. I'd like some input from others as to what the "right" semantics should be.<o:p></o:p></p><p>The basic question is: does the presence of nofree prevent the callee from allocating and freeing memory entirely within its dynamic scope? At first, it seems obvious that it does, but that turns out to be a bit inconsistent with other attributes and leads to some surprising results. <o:p></o:p></p><p>For reference in the following discussion, here is the current wording for the nofree function attribute in LangRef:<o:p></o:p></p><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><p>"This function attribute indicates that the function does not, directly or indirectly, call a memory-deallocation function (free, for example). 
As a result, uncaptured pointers that are known to be dereferenceable prior to a call to a function with the <span class=pre><span style='font-size:10.0pt;font-family:"Courier New"'>nofree</span></span> attribute are still known to be dereferenceable after the call (the capturing condition is necessary in environments where the function might communicate the pointer to another thread which then deallocates the memory)."<o:p></o:p></p></blockquote><p>For discussion purposes, please assume the concurrency case has been separately proven. That's not the point I'm getting at here.<o:p></o:p></p><p>The two possible semantics as I see them are:<o:p></o:p></p><p><b>Option 1</b> - nofree implies no call to free, period<o:p></o:p></p><p>This is the one that to me seems most consistent with the current wording, but it prevents the callee from allocating storage and freeing it entirely within its scope. Allocating and freeing locally is, for instance, a reasonable thing a target might want to do when lowering large allocs. Option 1 requires transforms to be careful about stripping the attribute, but that isn't entirely horrible.<o:p></o:p></p><p>The more surprising bit is that it means we cannot infer nofree from readonly or readnone. Why? Because both are specified only in terms of memory effects visible to the caller. As a result, a readnone function can allocate storage, write to it, and still be readnone. Our current inference rules for readnone and readonly do exploit this flexibility.<o:p></o:p></p><p>The optimizer does currently assume that readonly implies nofree. (See the accessor on Function.) Removing this assumption substantially weakens our ability to infer nofree when faced with a function declaration which hasn't been explicitly annotated for nofree. We can get most of this back by adding appropriate annotations to intrinsics, but not all. 
<o:p></o:p></p><p><b>Option 2</b> - nofree applies to memory visible to the caller<o:p></o:p></p><p>In this case, we'd add wording to the nofree definition analogous to that in the readonly/readnone specification. (There's a subtlety about the precise definition of visible here, but for the moment, let's hand-wave in the same way we do for the other attributes.)<o:p></o:p></p><p>This allows us to infer nofree from readonly, but essentially cripples our ability to drive transformations within an annotated function. We'd have to restrict all transforms and inference to cases where we can prove that the object being affected is visible to the caller. <o:p></o:p></p><p>The benefit is that this makes it slightly easier to infer nofree in some cases. The main impact of this is improving our ability to reason about dereferenceability for uncaptured objects over calls to functions for which we inferred nofree. <o:p></o:p></p><p>The downside of this is that we essentially lose all ability to reason about nofree in a context-free manner. For a specific example of the impact of this, it means we can't infer dereferenceability for an object allocated in F, and returned (i.e. not freed), in the scope of F.<o:p></o:p></p><p>This breaks hoisting and vectorization improvements (e.g. unconditional loads instead of predicated ones) I've checked in over the last few months, and makes the ongoing deref redefinition work substantially harder. <a href="https://reviews.llvm.org/D100141">https://reviews.llvm.org/D100141</a> shows what this looks like code-wise.<o:p></o:p></p><p><b>My Take</b><o:p></o:p></p><p>At first, I was strongly convinced that option 1 was the right choice. So much so, in fact, that I nearly didn't bother to post this question. However, after giving it more thought, I've come to distrust my own response a bit. I definitely have a conflict of interest here. 
Option 2 requires me to effectively cripple several recent optimizer enhancements, and maybe even revert some code which becomes effectively useless. It also makes a project I'm currently working on (deref redef) substantially harder. <o:p></o:p></p><p>On the other hand, the inconsistency with readonly and readnone is surprising. I can see an argument for that being the right overall approach long term.<o:p></o:p></p><p>So essentially, this email is me asking for a sanity check. Do folks think option 1 is the right option? Or am I forcing it to be the right option because it makes things easier for me?<o:p></o:p></p><p>Philip<o:p></o:p></p></div></body></html>