[llvm-dev] Ambiguity in the nofree function attribute
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Fri Apr 9 12:05:02 PDT 2021
I've stumbled across a case related to the nofree attribute where we
seem to have inconsistent interpretations of the attribute semantic in
tree. I'd like some input from others as to what the "right" semantic
should be.
The basic question is does the presence of nofree prevent the callee
from allocating and freeing memory entirely within it's dynamic scope?
At first, it seems obvious that it does, but that turns out to be a bit
inconsistent with other attributes and leads to some surprising results.
For reference in the following discussion, here is the current wording
for the nofree function attribute in LangRef:
"This function attribute indicates that the function does not,
directly or indirectly, call a memory-deallocation function (free,
for example). As a result, uncaptured pointers that are known to be
dereferenceable prior to a call to a function with the |nofree|
attribute are still known to be dereferenceable after the call (the
capturing condition is necessary in environments where the function
might communicate the pointer to another thread which then
deallocates the memory)."
For discussion purposes, please assume the concurrency case has been
separately proven. That's not the point I'm getting at here.
The two possible semantics as I see them are:
*Option 1* - nofree implies no call to free, period
This is the one that to me seems most consistent with the current
wording, but it prevents the callee from allocating storage and freeing
it entirely within it's scope. This is, for instance, a reasonable
thing a target might want to do when lowering large allocs. This
requires transforms to be careful in stripping the attribute, but isn't
entirely horrible.
The more surprising bit is that it means we can not infer nofree from
readonly or readnone. Why? Because both are specified only in terms of
memory effects visible to the caller. As a result, a readnone function
can allocate storage, write to it, and still be readonly. Our current
inference rules for readnone and readonly do exploit this flexibility.
The optimizer does currently assume that readonly implies nofree. (See
the accessor on Function) Removing this substantially weakens our
ability to infer nofree when faced with a function declaration which
hasn't been explicitly annotated for nofree. We can get most of this
back by adding appropriate annotations to intrinsics, but not all.
*Option 2* - nofree applies to memory visible to the caller
In this case, we'd add wording to the nofree definition analogous to
that in the readonly/readnone specification. (There's a subtlety about
the precise definition of visible here, but for the moment, let's hand
wave in the same way we do for the other attributes.)
This allows us to infer nofree from readonly, but essentially cripples
our ability to drive transformations within an annotated function. We'd
have to restrict all transforms and inference to cases where we can
prove that the object being affected is visible to the caller.
The benefit is that this makes it slightly easier to infer nofree in
some cases. The main impact of this is improving ability to reason
about dereferenceability for uncaptured objects over calls to functions
for which we inferred nofree.
The downside of this is that we essentially loose all ability to reason
about nofree in a context free manner. For a specific example of the
impact of this, it means we can't infer dereferenceability for an object
allocated in F, and returned (e.g. not freed), in the scope of F.
This breaks hoisting and vectorization improvements (e.g. unconditional
loads instead of predicated ones) I've checked in over the last few
months, and makes the ongoing deref redefinition work substantially
harder. https://reviews.llvm.org/D100141 shows what this looks like
code wise.
*My Take*
At first, I was strongly convinced that option 1 was the right choice.
So much so in fact that I nearly didn't bother to post this question.
However, after giving it more thought, I've come to distrust my own
response a bit. I definitely have a conflict of interest here. Option
2 requires me to effectively cripple several recent optimizer
enhancements, and maybe even revert some code which becomes effectively
useless. It also makes a project I'm currently working on (deref redef)
substantially harder.
On the other hand, the inconsistency with readonly and readnone is
surprising. I can see an argument for that being the right overall
approach long term.
So essentially, this email is me asking for a sanity check. Do folks
think option 1 is the right option? Or am I forcing it to be the right
option because it makes things easier for me?
Philip
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210409/ed24fc2f/attachment.html>
More information about the llvm-dev
mailing list