<div dir="ltr"><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Dec 13, 2021 at 7:39 PM Philip Reames &lt;<a href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 12/7/21 12:29 AM, Nikita Popov
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Dec 6, 2021 at 10:51
PM Philip Reames &lt;<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>I think this is an interesting problem.</p>
<p>I'd probably lean towards the use of a separate
attribute, but not strongly so.</p>
<p>The example which makes me prefer the separate
attribute would be a function with an out-param. It's
very unlikely that an out-param will be read on the
exception path. Being able to perform DSE for such out
params seems quite interesting.</p>
</div>
</blockquote>
<div>Right. I think it's mainly a question of whether we'd be
able to infer the attribute in practice, in cases where it's
not annotated by the frontend (which it should do in the
sret case). I think this is possible at least for the case
where all calls to the function pass in an alloca to this
parameter (or another argument with nounwindread I guess)
and don't use a landingpad for the call. However, I believe
we do inference in this direction (RPO rather than PO) only
in the module optimization pipeline, which means that
DSE/MemCpyOpt might not be able to make use of the inferred
information.<br>
</div>
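<div>A minimal sketch of the inferable case described above. The names @callee, @caller and %slot are illustrative only, and the sketch assumes @callee has local linkage so that all of its call sites are visible:</div>
<pre>declare void @may_unwind()

; Assumption: internal linkage, so every call site of @callee is known.
define internal void @callee(i32* noalias %out) {
  store i32 0, i32* %out
  call void @may_unwind()
  store i32 1, i32* %out
  ret void
}

define i32 @caller() {
  %slot = alloca i32
  ; Plain call, no landingpad: if @callee unwinds, the frame of @caller
  ; is torn down and %slot dies without ever being read.
  call void @callee(i32* %slot)
  ; Only reached on the normal return path.
  %v = load i32, i32* %slot
  ret i32 %v
}</pre>
<div>Here the only caller passes an unescaped alloca and uses call rather than invoke, so %out in @callee would be a candidate for the proposed nounwindread attribute.</div>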
</div>
</div>
</blockquote>
<p>A couple of points here:</p>
<ol>
<li>I often see attributes for which inference isn't a goal end up
underspecified. It's too easy not to think about all the
corner cases up front, and that bites us later. I think it's
important to have specified the valid inference rules as part of
the initial definition discussion, even if we don't implement
them. It forces us to think through the subtleties. <br>
</li>
<li>I think your alloca rule can be extended to any unescaped
allocation for which we can identify all accesses and show that none
are reachable on the exceptional path. The trivial call rule
(there is no exceptional path) is one sub-case of that. This
backwards walk may seem expensive, but I think we already do it
in DSE, and we could leave converting call-site attributes to
function attributes to a later RPO phase.</li>
<li>You're right that we don't really do RPO today. See point
(1). I wouldn't want to add such just for this.<br></li></ol></div></blockquote><div>In terms of detailed semantics, I think the main interesting question is what exactly "no read on unwind" means. I see two general approaches: The first is that reading (or possibly accessing) the argument memory after an unwind is immediate undefined behavior. The other is that the behavior is "as if" the argument memory is overwritten with poison on unwind. This means that the memory can be read without UB, but it cannot depend on any value written into it during the call. For example, if the argument memory is fully overwritten after the call and read again afterwards, that would still be nounwindread. I'd personally lean towards the latter interpretation, in that it is more generally applicable without giving up any useful optimization power that I see.<br></div><div><br></div><div>The other question would be what "argument memory" is. This could either be the whole underlying "allocated object" associated with the argument, or the size of the memory region would have to be specified as an attribute argument. So something like "i32* noalias nounwindread(4) %out" to say that the four bytes starting at the passed pointer are not read on unwind. I'd lean towards the former here, because it is simpler in terms of analysis/reasoning, and works even if we don't know the exact access location, just the underlying object. <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
<p>Aside: sret has this underspecification problem today. I have no
idea when it would be legal to infer sret. <br></p></div></blockquote><div>I think the answer to this is "never", because sret is considered an ABI attribute -- though to be honest I'm not really clear in which way it actually affects the call ABI.</div><div><br></div><div>Regards,</div><div>Nikita<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>
</p>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>However, I'll note that the same problem can be framed
as an escape problem. That is, we have an annotation
not that a value is dead on the exception path, but that
it hasn't been captured on entry to the routine. Then,
we can apply local reasoning to show that the first
store can't be visible to may_unwind, and eliminate it.<br>
</p>
</div>
</blockquote>
<div>I don't think this would solve the same problem. In the
given examples the pointer is already not visible to
@may_unwind because it is noalias. "noalias" here is a
weaker version of "not captured before": The pointer may be
captured, but it's illegal to write through the captured
pointer, which is sufficient for alias analysis. The problem
with unwinding is that after the unwind, the calling
function may read the stored value in a landingpad, which
does not require any capture of the pointer.</div>
</div>
</div>
</blockquote>
You are completely correct, particularly on that last point. Don't
know what I was thinking when I first responded. <br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div><br>
</div>
<div>Regards,</div>
<div>Nikita<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p> </p>
<p>I'd want to give the escape framing more thought as
that seems potentially more general. Does knowing that
an argument does not point to escaped memory on entry
help on all of your motivating examples?</p>
<p>Philip<br>
</p>
<div>On 12/4/21 2:39 AM, Nikita Popov via llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi,</div>
<div><br>
</div>
<div>Consider the following IR:</div>
<div><br>
</div>
<div>declare void @may_unwind()<br>
</div>
<div>define void @test(i32* noalias sret(i32) %out) {</div>
<div> store i32 0, i32* %out</div>
<div> call void @may_unwind()</div>
<div> store i32 1, i32* %out</div>
<div> ret void</div>
<div>}</div>
<div><br>
</div>
<div>Currently, we can't remove the first store as
dead, because the @may_unwind() call may unwind, and
the caller might read %out at that point, making the
first store visible.</div>
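<div><br>
</div>
<div>To make the problem concrete, here is a sketch of a caller that would observe the first store. The function @caller, the helper @use and the choice of @__gxx_personality_v0 as personality are assumptions for illustration, not part of the example above:</div>
<pre>declare i32 @__gxx_personality_v0(...)
declare void @test(i32* noalias sret(i32))
declare void @use(i32)

define void @caller() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
entry:
  %out = alloca i32
  invoke void @test(i32* sret(i32) %out)
          to label %cont unwind label %lpad

cont:
  ret void

lpad:
  %lp = landingpad { i8*, i32 }
          cleanup
  ; If @may_unwind unwound inside @test, this load sees the value
  ; written by the first store (0), so that store is not dead.
  %v = load i32, i32* %out
  call void @use(i32 %v)
  resume { i8*, i32 } %lp
}</pre>
<div>Note that the load in the landingpad does not need the pointer to be captured anywhere; the caller simply still owns %out.</div>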
<div><br>
</div>
<div>Similarly, it prevents call slot optimization in
the following example, because the call may unwind
and make an early write to the sret argument
visible:<br>
</div>
<br>
declare void @may_unwind(i32*)<br>
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64,
i1)<br>
define void @test(i32* noalias sret(i32) %arg) {<br>
%tmp = alloca i32<br>
call void @may_unwind(i32* nocapture %tmp)<br>
%tmp.8 = bitcast i32* %tmp to i8*<br>
%arg.8 = bitcast i32* %arg to i8*<br>
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4
%arg.8, i8* align 4 %tmp.8, i64 4, i1 false)<br>
ret void<br>
<div>}</div>
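<div><br>
</div>
<div>For reference, this is roughly the form that call slot optimization would aim to produce here, writing into the sret argument directly instead of going through the temporary. It is only a sketch of the intended result:</div>
<pre>declare void @may_unwind(i32*)
define void @test(i32* noalias sret(i32) %arg) {
  ; The callee now writes straight into %arg. If it writes and then
  ; unwinds, a caller reading %arg on the unwind path would see the
  ; partial result, which the original code never exposed.
  call void @may_unwind(i32* nocapture %arg)
  ret void
}</pre>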
<div><br>
</div>
<div>I would like to address this in some form. The
easiest way would be to change LangRef to specify
that sret arguments cannot be read on unwind paths.
I think that matches how sret arguments are
generally used.</div>
<div><br>
</div>
<div>Alternatively, this could be handled using a
separate attribute that can be applied to any
argument, something along the lines of "i32*
nounwindread sret(i32) %arg". The benefit would be
that this is decoupled from sret ABI semantics and
could potentially be inferred (e.g. if the function
is only ever used with call and not invoke, this
should be a given).</div>
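<div><br>
</div>
<div>As a sketch of what this would permit, assuming the attribute is spelled nounwindread as suggested above (it does not exist in LLVM at this point), DSE could reduce the first example to:</div>
<pre>declare void @may_unwind()
define void @test(i32* noalias nounwindread sret(i32) %out) {
  ; "store i32 0, i32* %out" has been removed: @may_unwind cannot
  ; observe it (noalias), and the caller may not read %out if the
  ; call unwinds (nounwindread, hypothetical).
  call void @may_unwind()
  store i32 1, i32* %out
  ret void
}</pre>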
<div><br>
</div>
<div>Any thoughts on this? Is this a problem worth
solving, and if yes, would a new attribute be
preferred over restricting sret semantics?<br>
</div>
<div><br>
</div>
<div>Regards,</div>
<div>Nikita<br>
</div>
</div>
<br>
<fieldset></fieldset>
<pre>_______________________________________________
LLVM Developers mailing list
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote></div></div>