<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>I've stumbled across a case related to the nofree attribute where

      we seem to have inconsistent interpretations of the attribute

      semantic in tree.  I'd like some input from others as to what the

      "right" semantic should be.</p>

    <p>The basic question is does the presence of nofree prevent the

      callee from allocating and freeing memory entirely within it's

      dynamic scope?  At first, it seems obvious that it does, but that

      turns out to be a bit inconsistent with other attributes and leads

      to some surprising results.  <br>

    </p>

    <p>For reference in the following discussion, here is the current

      wording for the nofree function attribute in LangRef:</p>

    <blockquote>

      <p>"This function attribute indicates that the function does not,

        directly or

        indirectly, call a memory-deallocation function (free, for

        example). As a

        result, uncaptured pointers that are known to be dereferenceable

        prior to a

        call to a function with the <code class="docutils literal

          notranslate"><span class="pre">nofree</span></code> attribute

        are still known to be

        dereferenceable after the call (the capturing condition is

        necessary in

        environments where the function might communicate the pointer to

        another thread

        which then deallocates the memory)."</p>

    </blockquote>

    <p>For discussion purposes, please assume the concurrency case has

      been separately proven.  That's not the point I'm getting at here.<br>

    </p>

    <p>The two possible semantics as I see them are:</p>

    <p><b>Option 1</b> - nofree implies no call to free, period</p>

    <p>This is the one that to me seems most consistent with the current

      wording, but it prevents the callee from allocating storage and

      freeing it entirely within it's scope.  This is, for instance, a

      reasonable thing a target might want to do when lowering large

      allocs.  This requires transforms to be careful in stripping the

      attribute, but isn't entirely horrible.<br>

    </p>

    <p>The more surprising bit is that it means we can not infer nofree

      from readonly or readnone.  Why?  Because both are specified only

      in terms of memory effects visible to the caller.  As a result, a

      readnone function can allocate storage, write to it, and still be

      readonly.  Our current inference rules for readnone and readonly

      do exploit this flexibility.</p>

    <p>The optimizer does currently assume that readonly implies

      nofree.  (See the accessor on Function)  Removing this

      substantially weakens our ability to infer nofree when faced with

      a function declaration which hasn't been explicitly annotated for

      nofree.  We can get most of this back by adding appropriate

      annotations to intrinsics, but not all.  <br>

    </p>

    <p><b>Option 2</b> - nofree applies to memory visible to the caller</p>

    <p>In this case, we'd add wording to the nofree definition analogous

      to that in the readonly/readnone specification.  (There's a

      subtlety about the precise definition of visible here, but for the

      moment, let's hand wave in the same way we do for the other

      attributes.)</p>

    <p>This allows us to infer nofree from readonly, but essentially

      cripples our ability to drive transformations within an annotated

      function.  We'd have to restrict all transforms and inference to

      cases where we can prove that the object being affected is visible

      to the caller.  <br>

    </p>

    <p>The benefit is that this makes it slightly easier to infer nofree

      in some cases.  The main impact of this is improving ability to

      reason about dereferenceability for uncaptured objects over calls

      to functions for which we inferred nofree. <br>

    </p>

    <p>The downside of this is that we essentially loose all ability to

      reason about nofree in a context free manner.  For a specific

      example of the impact of this, it means we can't infer

      dereferenceability for an object allocated in F, and returned

      (e.g. not freed), in the scope of F.</p>

    <p>This breaks hoisting and vectorization improvements (e.g.

      unconditional loads instead of predicated ones) I've checked in

      over the last few months, and makes the ongoing deref redefinition

      work substantially harder.  <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D100141">https://reviews.llvm.org/D100141</a> shows

      what this looks like code wise.<br>

    </p>

    <p><b>My Take</b><br>

    </p>

    <p>At first, I was strongly convinced that option 1 was the right

      choice.  So much so in fact that I nearly didn't bother to post

      this question.  However, after giving it more thought, I've come

      to distrust my own response a bit.  I definitely have a conflict

      of interest here.  Option 2 requires me to effectively cripple

      several recent optimizer enhancements, and maybe even revert some

      code which becomes effectively useless.  It also makes a project

      I'm currently working on (deref redef) substantially harder.  <br>

    </p>

    <p>On the other hand, the inconsistency with readonly and readnone

      is surprising.  I can see an argument for that being the right

      overall approach long term.</p>

    <p>So essentially, this email is me asking for a sanity check.  Do

      folks think option 1 is the right option?  Or am I forcing it to

      be the right option because it makes things easier for me?</p>

    <p>Philip<br>

    </p>

  </body>

</html>