[llvm-dev] Ambiguity in the nofree function attribute

Tue Apr 13 14:43:47 PDT 2021

Replying to self to summarize takeaways.

The consensus in responses seems to be strongly in favor of option #2.  
My main hesitation with option #2 has always been the lost ability for a 
frontend to provide the stronger scoped fact, but in the process of 
writing a detailed response to Johannes downthread, I realized we don't 
have a motivating example frontend which actually benefits from said 
ability.  Given that, it seems like option #2 wins over option #1.

It seems stopping to ask for a sanity check was well warranted in this 
case.  :)

I will go ahead and cleanup https://reviews.llvm.org/D100141 into a real 
patch.  I will give it a couple of days (concurrent with review) to give 
anyone who might disagree with this decision a chance to see this thread 
and chime in.

Philip

On 4/9/21 12:05 PM, Philip Reames wrote:
>
> I've stumbled across a case related to the nofree attribute where we 
> seem to have inconsistent interpretations of the attribute semantic in 
> tree.  I'd like some input from others as to what the "right" semantic 
> should be.
>
> The basic question is does the presence of nofree prevent the callee 
> from allocating and freeing memory entirely within it's dynamic 
> scope?  At first, it seems obvious that it does, but that turns out to 
> be a bit inconsistent with other attributes and leads to some 
> surprising results.
>
> For reference in the following discussion, here is the current wording 
> for the nofree function attribute in LangRef:
>
>     "This function attribute indicates that the function does not,
>     directly or indirectly, call a memory-deallocation function (free,
>     for example). As a result, uncaptured pointers that are known to
>     be dereferenceable prior to a call to a function with the |nofree|
>     attribute are still known to be dereferenceable after the call
>     (the capturing condition is necessary in environments where the
>     function might communicate the pointer to another thread which
>     then deallocates the memory)."
>
> For discussion purposes, please assume the concurrency case has been 
> separately proven.  That's not the point I'm getting at here.
>
> The two possible semantics as I see them are:
>
> *Option 1* - nofree implies no call to free, period
>
> This is the one that to me seems most consistent with the current 
> wording, but it prevents the callee from allocating storage and 
> freeing it entirely within it's scope.  This is, for instance, a 
> reasonable thing a target might want to do when lowering large 
> allocs.  This requires transforms to be careful in stripping the 
> attribute, but isn't entirely horrible.
>
> The more surprising bit is that it means we can not infer nofree from 
> readonly or readnone.  Why?  Because both are specified only in terms 
> of memory effects visible to the caller.  As a result, a readnone 
> function can allocate storage, write to it, and still be readonly.  
> Our current inference rules for readnone and readonly do exploit this 
> flexibility.
>
> The optimizer does currently assume that readonly implies nofree.  
> (See the accessor on Function)  Removing this substantially weakens 
> our ability to infer nofree when faced with a function declaration 
> which hasn't been explicitly annotated for nofree.  We can get most of 
> this back by adding appropriate annotations to intrinsics, but not all.
>
> *Option 2* - nofree applies to memory visible to the caller
>
> In this case, we'd add wording to the nofree definition analogous to 
> that in the readonly/readnone specification. (There's a subtlety about 
> the precise definition of visible here, but for the moment, let's hand 
> wave in the same way we do for the other attributes.)
>
> This allows us to infer nofree from readonly, but essentially cripples 
> our ability to drive transformations within an annotated function.  
> We'd have to restrict all transforms and inference to cases where we 
> can prove that the object being affected is visible to the caller.
>
> The benefit is that this makes it slightly easier to infer nofree in 
> some cases.  The main impact of this is improving ability to reason 
> about dereferenceability for uncaptured objects over calls to 
> functions for which we inferred nofree.
>
> The downside of this is that we essentially loose all ability to 
> reason about nofree in a context free manner.  For a specific example 
> of the impact of this, it means we can't infer dereferenceability for 
> an object allocated in F, and returned (e.g. not freed), in the scope 
> of F.
>
> This breaks hoisting and vectorization improvements (e.g. 
> unconditional loads instead of predicated ones) I've checked in over 
> the last few months, and makes the ongoing deref redefinition work 
> substantially harder. https://reviews.llvm.org/D100141 shows what this 
> looks like code wise.
>
> *My Take*
>
> At first, I was strongly convinced that option 1 was the right 
> choice.  So much so in fact that I nearly didn't bother to post this 
> question.  However, after giving it more thought, I've come to 
> distrust my own response a bit.  I definitely have a conflict of 
> interest here.  Option 2 requires me to effectively cripple several 
> recent optimizer enhancements, and maybe even revert some code which 
> becomes effectively useless.  It also makes a project I'm currently 
> working on (deref redef) substantially harder.
>
> On the other hand, the inconsistency with readonly and readnone is 
> surprising.  I can see an argument for that being the right overall 
> approach long term.
>
> So essentially, this email is me asking for a sanity check.  Do folks 
> think option 1 is the right option?  Or am I forcing it to be the 
> right option because it makes things easier for me?
>
> Philip
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210413/7aab0d3e/attachment-0001.html>