<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 4/12/21 10:58 AM, Nuno Lopes wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:003901d72fc5$7631d8c0$62958a40$@sapo.pt">

      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

      <meta name="Generator" content="Microsoft Word 15 (filtered

        medium)">

      <style>@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0cm;

        font-size:11.0pt;

        font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}span.pre

        {mso-style-name:pre;}span.EmailStyle21

        {mso-style-type:personal-reply;

        font-family:"Calibri",sans-serif;

        color:windowtext;}.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}div.WordSection1

        {page:WordSection1;}</style><!--[if gte mso 9]><xml>

<o:shapedefaults v:ext="edit" spidmax="1026" />

</xml><![endif]--><!--[if gte mso 9]><xml>

<o:shapelayout v:ext="edit">

<o:idmap v:ext="edit" data="1" />

</o:shapelayout></xml><![endif]-->

      <div class="WordSection1">

        <p class="MsoNormal">I like option 2. I agree that allowing

          functions to allocate & deallocate memory is useful.<o:p></o:p></p>

        <p class="MsoNormal">Option 1 is super hard to infer. Plus it

          necessarily hits fewer cases, as the whole call-graph would

          need to consist of nofree calls. Option 2 doesn<span

            style="font-family:"Times New Roman",serif">’</span>t

          have such requirement.<o:p></o:p></p>

      </div>

    </blockquote>

    I think the consensus seems to be in favor of option #1.<br>

    <blockquote type="cite"

      cite="mid:003901d72fc5$7631d8c0$62958a40$@sapo.pt">

      <div class="WordSection1">

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">Nofree is most useful for callers to know

          that the dereferenceability of any pointer they have is kept

          across the call. In general, function attributes are there to

          help callers. Otherwise they would be at most a cache for

          analyses that you can do locally (ok, plus information that

          frontends can give, but clang can<span

            style="font-family:"Times New Roman",serif">’</span>t

          give you nofree I guess).</p>

      </div>

    </blockquote>

    It's that last clause that's the entire difference between the two

    approaches.  :)<br>

    <blockquote type="cite"

      cite="mid:003901d72fc5$7631d8c0$62958a40$@sapo.pt">

      <div class="WordSection1">

        <p class="MsoNormal"><o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">I quickly scanned the test cases affected

          by your patch, and those seem to be <span

            style="font-family:"Times New Roman",serif">“</span>easily<span

            style="font-family:"Times New Roman",serif">”</span>

          recoverable. Those functions don<span

            style="font-family:"Times New Roman",serif">’</span>t

          have any call to free nor are those pointers passed to other

          non-nofree calls, so you can assume that any dereferenceable

          argument remains so throughout the whole function. Requires a

          bit more work, but doable.</p>

      </div>

    </blockquote>

    <p>Two points in response to your last paragraph:<br>

    </p>

    <p>1) I think you're over analyzing deliberately reduced minimal

      test case for the current logic and assuming from that the

      original examples look exactly like the reduced cases.</p>

    2) I do think the need for context sensitive logic is a major

    increase in difficulty over context insensitive.  Your point about

    being able to special case a context insensitive analysis when all

    callees are nofree is a good one, and probably worth implementing.<br>

    <blockquote type="cite"

      cite="mid:003901d72fc5$7631d8c0$62958a40$@sapo.pt">

      <div class="WordSection1">

        <p class="MsoNormal"><o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">Nuno<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <div>

          <div style="border:none;border-top:solid #E1E1E1

            1.0pt;padding:3.0pt 0cm 0cm 0cm">

            <p class="MsoNormal"><b>From:</b> Philip Reames

              <a class="moz-txt-link-rfc2396E" href="mailto:listmail@philipreames.com"><listmail@philipreames.com></a> <br>

              <b>Sent:</b> 09 April 2021 20:05<br>

              <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

              <b>Cc:</b> Johannes Doerfert

              <a class="moz-txt-link-rfc2396E" href="mailto:johannesdoerfert@gmail.com"><johannesdoerfert@gmail.com></a>; Artur Pilipenko

              <a class="moz-txt-link-rfc2396E" href="mailto:llvmlistbot@llvm.org"><llvmlistbot@llvm.org></a>; Nuno Lopes

              <a class="moz-txt-link-rfc2396E" href="mailto:nunoplopes@sapo.pt"><nunoplopes@sapo.pt></a><br>

              <b>Subject:</b> Ambiguity in the nofree function attribute<o:p></o:p></p>

          </div>

        </div>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p>I've stumbled across a case related to the nofree attribute

          where we seem to have inconsistent interpretations of the

          attribute semantic in tree.  I'd like some input from others

          as to what the "right" semantic should be.<o:p></o:p></p>

        <p>The basic question is does the presence of nofree prevent the

          callee from allocating and freeing memory entirely within it's

          dynamic scope?  At first, it seems obvious that it does, but

          that turns out to be a bit inconsistent with other attributes

          and leads to some surprising results.  <o:p></o:p></p>

        <p>For reference in the following discussion, here is the

          current wording for the nofree function attribute in LangRef:<o:p></o:p></p>

        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">

          <p>"This function attribute indicates that the function does

            not, directly or indirectly, call a memory-deallocation

            function (free, for example). As a result, uncaptured

            pointers that are known to be dereferenceable prior to a

            call to a function with the <span class="pre"><span

                style="font-size:10.0pt;font-family:"Courier

                New"">nofree</span></span> attribute are still

            known to be dereferenceable after the call (the capturing

            condition is necessary in environments where the function

            might communicate the pointer to another thread which then

            deallocates the memory)."<o:p></o:p></p>

        </blockquote>

        <p>For discussion purposes, please assume the concurrency case

          has been separately proven.  That's not the point I'm getting

          at here.<o:p></o:p></p>

        <p>The two possible semantics as I see them are:<o:p></o:p></p>

        <p><b>Option 1</b> - nofree implies no call to free, period<o:p></o:p></p>

        <p>This is the one that to me seems most consistent with the

          current wording, but it prevents the callee from allocating

          storage and freeing it entirely within it's scope.  This is,

          for instance, a reasonable thing a target might want to do

          when lowering large allocs.  This requires transforms to be

          careful in stripping the attribute, but isn't entirely

          horrible.<o:p></o:p></p>

        <p>The more surprising bit is that it means we can not infer

          nofree from readonly or readnone.  Why?  Because both are

          specified only in terms of memory effects visible to the

          caller.  As a result, a readnone function can allocate

          storage, write to it, and still be readonly.  Our current

          inference rules for readnone and readonly do exploit this

          flexibility.<o:p></o:p></p>

        <p>The optimizer does currently assume that readonly implies

          nofree.  (See the accessor on Function)  Removing this

          substantially weakens our ability to infer nofree when faced

          with a function declaration which hasn't been explicitly

          annotated for nofree.  We can get most of this back by adding

          appropriate annotations to intrinsics, but not all.  <o:p></o:p></p>

        <p><b>Option 2</b> - nofree applies to memory visible to the

          caller<o:p></o:p></p>

        <p>In this case, we'd add wording to the nofree definition

          analogous to that in the readonly/readnone specification. 

          (There's a subtlety about the precise definition of visible

          here, but for the moment, let's hand wave in the same way we

          do for the other attributes.)<o:p></o:p></p>

        <p>This allows us to infer nofree from readonly, but essentially

          cripples our ability to drive transformations within an

          annotated function.  We'd have to restrict all transforms and

          inference to cases where we can prove that the object being

          affected is visible to the caller.  <o:p></o:p></p>

        <p>The benefit is that this makes it slightly easier to infer

          nofree in some cases.  The main impact of this is improving

          ability to reason about dereferenceability for uncaptured

          objects over calls to functions for which we inferred nofree.

          <o:p></o:p></p>

        <p>The downside of this is that we essentially loose all ability

          to reason about nofree in a context free manner.  For a

          specific example of the impact of this, it means we can't

          infer dereferenceability for an object allocated in F, and

          returned (e.g. not freed), in the scope of F.<o:p></o:p></p>

        <p>This breaks hoisting and vectorization improvements (e.g.

          unconditional loads instead of predicated ones) I've checked

          in over the last few months, and makes the ongoing deref

          redefinition work substantially harder.  <a

            href="https://reviews.llvm.org/D100141"

            moz-do-not-send="true">https://reviews.llvm.org/D100141</a>

          shows what this looks like code wise.<o:p></o:p></p>

        <p><b>My Take</b><o:p></o:p></p>

        <p>At first, I was strongly convinced that option 1 was the

          right choice.  So much so in fact that I nearly didn't bother

          to post this question.  However, after giving it more thought,

          I've come to distrust my own response a bit.  I definitely

          have a conflict of interest here.  Option 2 requires me to

          effectively cripple several recent optimizer enhancements, and

          maybe even revert some code which becomes effectively

          useless.  It also makes a project I'm currently working on

          (deref redef) substantially harder.  <o:p></o:p></p>

        <p>On the other hand, the inconsistency with readonly and

          readnone is surprising.  I can see an argument for that being

          the right overall approach long term.<o:p></o:p></p>

        <p>So essentially, this email is me asking for a sanity check. 

          Do folks think option 1 is the right option?  Or am I forcing

          it to be the right option because it makes things easier for

          me?<o:p></o:p></p>

        <p>Philip<o:p></o:p></p>

      </div>

    </blockquote>

  </body>

</html>