<div dir="ltr"><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Dec 13, 2021 at 7:39 PM Philip Reames <<a href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div>

    <p><br>

    </p>

    <div>On 12/7/21 12:29 AM, Nikita Popov

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <div dir="ltr" class="gmail_attr">On Mon, Dec 6, 2021 at 10:51

            PM Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>>

            wrote:<br>

          </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div>

              <p>I think this is an interesting problem.</p>

              <p>I'd probably lean towards the use of a separate

                attribute, but not strongly so.</p>

              <p>The example which makes me prefer the separate

                attribute would be a function with an out-param.  It's

                very unlikely that an out-param will be read on the

                exception path.  Being able to perform DSE for such out

                params seems quite interesting.</p>

            </div>

          </blockquote>

          <div>Right. I think it's mainly a question of whether we'd be

            able to infer the attribute in practice, in cases where it's

            not annotated by the frontend (which it should do in the

            sret case). I think this is possible at least for the case

            where all calls to the function pass in an alloca to this

            parameter (or another argument with nounwindread I guess)

            and don't use a landingpad for the call. However, I believe

            we do inference in this direction (RPO rather than PO) only

            in the module optimization pipeline, which means that

            DSE/MemCpyOpt might not be able to make use of the inferred

            information.<br>

          </div>

        </div>

      </div>

    </blockquote>

    <p>A couple of points here:</p>

    <ol>

      <li>I often see attributes for which inference isn't a goal as

        being under specified.  It's too easy not to think about all the

        corner cases up front, and that bites us later.  I think it's

        important to have specified the valid inference rules as part of

        the initial definition discussion, even if we don't implement

        them.  It forces us to think through the subtleties.  <br>

      </li>

      <li>I think you're alloc rule can be extended to any unescaped

        allocation for which we can indentify all accesses and that none

        are reachable on the exceptional path.  The trivial call rule

        (there is no exceptional path) is one sub-case of that.  This

        backwards walk may seem expensive, but I think we already do it

        in DSE, and could leave converting callsite attributes to

        functions attributes to a later RPO phase.</li>

      <li>You're right that we don't really do RPO today.  See point

        (1).  I wouldn't want to add such just for this.<br></li></ol></div></blockquote><div>In terms of detailed semantics, I think the main interesting question is what exactly "no read on unwind" means. I see two general approaches: The first is that reading (or possibly accessing) the argument memory after an unwind is immediate undefined behavior. The other is that the behavior is "as if" the argument memory is overwritten with poison on unwind. This means that the memory can be read without UB, but it cannot depend on any value written into it during the call. For example, if the argument memory is fully overwritten after the call and read again afterwards, that would still be nounwindread. I'd personally lean towards the latter interpretation, in that it is more generally applicable without giving up any useful optimization power that I see.<br></div><div><br></div><div>The other question would be what "argument memory" is. This could either be the whole underlying "allocated object" associated with the argument, or the size of the memory region would have to be specified as an attribute argument. So something like "i32* noalias nounwindread(4) %out" to say that the four bytes starting at the passed pointer are not read on unwind. I'd lean towards the former here, because it is simpler in terms of analysis/reasoning, and works even if we don't know the exact access location, just the underlying object. <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>

    <p>Aside: sret has the under-specified problem today.  I have no

      idea when it would be legal to infer sret.  <br></p></div></blockquote><div>I think the answer to this is "never", because sret is considered an ABI attribute -- though to be honest I'm not really clear in which way it actually affects the call ABI.</div><div><br></div><div>Regards,</div><div>Nikita<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>

    </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div>

              <p>However, I'll note that the same problem can be framed

                as an escape problem.  That is, we have an annotation

                not that a value is dead on the exception path, but that

                it hasn't been captured on entry to the routine.  Then,

                we can apply local reasoning to show that the first

                store can't be visible to may_unwind, and eliminate it.<br>

              </p>

            </div>

          </blockquote>

          <div>I don't think this would solve the same problem. In the

            given examples the pointer is already not visible to

            @may_unwind because it is noalias. "noalias" here is a

            weaker version of "not captured before": The pointer may be

            captured, but it's illegal to write through the captured

            pointer, which is sufficient for alias analysis. The problem

            with unwinding is that after the unwind, the calling

            function may read the stored value in a landingpad, which

            does not require any capture of the pointer.</div>

        </div>

      </div>

    </blockquote>

    You are completely correct, particularly on that last point.  Don't

    know what I was thinking when I first responded.  <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_quote">

          <div><br>

          </div>

          <div>Regards,</div>

          <div>Nikita<br>

          </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div>

              <p> </p>

              <p>I'd want to give the escape framing more thought as

                that seems potentially more general.  Does knowing that

                an argument does not point to escaped memory on entry

                help on all of your motivating examples?</p>

              <p>Philip<br>

              </p>

              <div>On 12/4/21 2:39 AM, Nikita Popov via llvm-dev wrote:<br>

              </div>

              <blockquote type="cite">

                <div dir="ltr">

                  <div>Hi,</div>

                  <div><br>

                  </div>

                  <div>Consider the following IR:</div>

                  <div><br>

                  </div>

                  <div>declare void <a class="gmail_plusreply" id="gmail-m_-5752136611135978018gmail-m_7533167157008207913plusReplyChip-4">@may_unwind()</a><br>

                  </div>

                  <div>define void @test(i32* noalias sret(i32) %out) {</div>

                  <div>    store i32 0, i32* %out</div>

                  <div>    call void <a class="gmail_plusreply" id="gmail-m_-5752136611135978018gmail-m_7533167157008207913plusReplyChip-3">@may_unwind()<br>

                    </a>

                    <div>    store i32 1, i32* %out</div>

                  </div>

                  <div>    ret void<br>

                  </div>

                  <div>}</div>

                  <div><br>

                  </div>

                  <div>Currently, we can't remove the first store as

                    dead, because the @may_unwind() call may unwind, and

                    the caller might read %out at that point, making the

                    first store visible.</div>

                  <div><br>

                  </div>

                  <div>Similarly, it prevents call slot optimization in

                    the following example, because the call may unwind

                    and make an early write to the sret argument

                    visible:<br>

                  </div>

                  <br>

                  declare void @may_unwind(i32*)<br>

                  declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64,

                  i1)<br>

                  define void @test(i32* noalias sret(i32) %arg) {<br>

                      %tmp = alloca i32<br>

                      call void @may_unwind(i32* nocapture %tmp)<br>

                      %tmp.8 = bitcast i32* %tmp to i8*<br>

                      %arg.8 = bitcast i32* %arg to i8*<br>

                      call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4

                  %arg.8, i8* align 4 %tmp.8, i64 4, i1 false)<br>

                      ret void<br>

                  <div>}</div>

                  <div><br>

                  </div>

                  <div>I would like to address this in some form. The

                    easiest way would be to change LangRef to specify

                    that sret arguments cannot be read on unwind paths.

                    I think that matches how sret arguments are

                    generally used.</div>

                  <div><br>

                  </div>

                  <div>Alternatively, this could be handled using a

                    separate attribute that can be applied to any

                    argument, something along the lines of "i32*

                    nounwindread sret(i32) %arg". The benefit would be

                    that this is decoupled from sret ABI semantics and

                    could potentially be inferred (e.g. if the function

                    is only ever used with call and not invoke, this

                    should be a given).</div>

                  <div><br>

                  </div>

                  <div>Any thoughts on this? Is this a problem worth

                    solving, and if yes, would a new attribute be

                    preferred over restricting sret semantics?<br>

                  </div>

                  <div><br>

                  </div>

                  <div>Regards,</div>

                  <div>Nikita<br>

                  </div>

                </div>

                <br>

                <fieldset></fieldset>

                <pre>_______________________________________________

LLVM Developers mailing list

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>

</pre>

              </blockquote>

            </div>

          </blockquote>

        </div>

      </div>

    </blockquote>

  </div>

</blockquote></div></div>