<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Replying to self to summarize takeaways.</p>

    <p>The consensus in responses seems to be strongly in favor of

      option #2.  My main hesitation with option #2 has always been the

      lost ability for a frontend to provide the stronger scoped fact,

      but in the process of writing a detailed response to Johannes

      downthread, I realized we don't have a motivating example frontend

      which actually benefits from said ability.  Given that, it seems

      like option #2 wins over option #1.  <br>

    </p>

    <p>It seems stopping to ask for a sanity check was well warranted in

      this case.  :)</p>

    <p>I will go ahead and cleanup <a class="moz-txt-link-freetext"

        href="https://reviews.llvm.org/D100141">https://reviews.llvm.org/D100141</a>

      into a real patch.  I will give it a couple of days (concurrent

      with review) to give anyone who might disagree with this decision

      a chance to see this thread and chime in.</p>

    <p>Philip<br>

    </p>

    <div class="moz-cite-prefix">On 4/9/21 12:05 PM, Philip Reames

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:4ea19e75-8655-a8ce-285b-5e30106c39b9@philipreames.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <p>I've stumbled across a case related to the nofree attribute

        where we seem to have inconsistent interpretations of the

        attribute semantic in tree.  I'd like some input from others as

        to what the "right" semantic should be.</p>

      <p>The basic question is does the presence of nofree prevent the

        callee from allocating and freeing memory entirely within it's

        dynamic scope?  At first, it seems obvious that it does, but

        that turns out to be a bit inconsistent with other attributes

        and leads to some surprising results.  <br>

      </p>

      <p>For reference in the following discussion, here is the current

        wording for the nofree function attribute in LangRef:</p>

      <blockquote>

        <p>"This function attribute indicates that the function does

          not, directly or indirectly, call a memory-deallocation

          function (free, for example). As a result, uncaptured pointers

          that are known to be dereferenceable prior to a call to a

          function with the <code class="docutils literal notranslate"><span

              class="pre">nofree</span></code> attribute are still known

          to be dereferenceable after the call (the capturing condition

          is necessary in environments where the function might

          communicate the pointer to another thread which then

          deallocates the memory)."</p>

      </blockquote>

      <p>For discussion purposes, please assume the concurrency case has

        been separately proven.  That's not the point I'm getting at

        here.<br>

      </p>

      <p>The two possible semantics as I see them are:</p>

      <p><b>Option 1</b> - nofree implies no call to free, period</p>

      <p>This is the one that to me seems most consistent with the

        current wording, but it prevents the callee from allocating

        storage and freeing it entirely within it's scope.  This is, for

        instance, a reasonable thing a target might want to do when

        lowering large allocs.  This requires transforms to be careful

        in stripping the attribute, but isn't entirely horrible.<br>

      </p>

      <p>The more surprising bit is that it means we can not infer

        nofree from readonly or readnone.  Why?  Because both are

        specified only in terms of memory effects visible to the

        caller.  As a result, a readnone function can allocate storage,

        write to it, and still be readonly.  Our current inference rules

        for readnone and readonly do exploit this flexibility.</p>

      <p>The optimizer does currently assume that readonly implies

        nofree.  (See the accessor on Function)  Removing this

        substantially weakens our ability to infer nofree when faced

        with a function declaration which hasn't been explicitly

        annotated for nofree.  We can get most of this back by adding

        appropriate annotations to intrinsics, but not all.  <br>

      </p>

      <p><b>Option 2</b> - nofree applies to memory visible to the

        caller</p>

      <p>In this case, we'd add wording to the nofree definition

        analogous to that in the readonly/readnone specification. 

        (There's a subtlety about the precise definition of visible

        here, but for the moment, let's hand wave in the same way we do

        for the other attributes.)</p>

      <p>This allows us to infer nofree from readonly, but essentially

        cripples our ability to drive transformations within an

        annotated function.  We'd have to restrict all transforms and

        inference to cases where we can prove that the object being

        affected is visible to the caller.  <br>

      </p>

      <p>The benefit is that this makes it slightly easier to infer

        nofree in some cases.  The main impact of this is improving

        ability to reason about dereferenceability for uncaptured

        objects over calls to functions for which we inferred nofree. <br>

      </p>

      <p>The downside of this is that we essentially loose all ability

        to reason about nofree in a context free manner.  For a specific

        example of the impact of this, it means we can't infer

        dereferenceability for an object allocated in F, and returned

        (e.g. not freed), in the scope of F.</p>

      <p>This breaks hoisting and vectorization improvements (e.g.

        unconditional loads instead of predicated ones) I've checked in

        over the last few months, and makes the ongoing deref

        redefinition work substantially harder.  <a

          class="moz-txt-link-freetext"

          href="https://reviews.llvm.org/D100141" moz-do-not-send="true">https://reviews.llvm.org/D100141</a>

        shows what this looks like code wise.<br>

      </p>

      <p><b>My Take</b><br>

      </p>

      <p>At first, I was strongly convinced that option 1 was the right

        choice.  So much so in fact that I nearly didn't bother to post

        this question.  However, after giving it more thought, I've come

        to distrust my own response a bit.  I definitely have a conflict

        of interest here.  Option 2 requires me to effectively cripple

        several recent optimizer enhancements, and maybe even revert

        some code which becomes effectively useless.  It also makes a

        project I'm currently working on (deref redef) substantially

        harder.  <br>

      </p>

      <p>On the other hand, the inconsistency with readonly and readnone

        is surprising.  I can see an argument for that being the right

        overall approach long term.</p>

      <p>So essentially, this email is me asking for a sanity check.  Do

        folks think option 1 is the right option?  Or am I forcing it to

        be the right option because it makes things easier for me?</p>

      <p>Philip<br>

      </p>

    </blockquote>

  </body>

</html>