[llvm-dev] sret read after unwind
Nikita Popov via llvm-dev
llvm-dev at lists.llvm.org
Mon Dec 13 12:12:47 PST 2021
On Mon, Dec 13, 2021 at 7:39 PM Philip Reames <listmail at philipreames.com>
wrote:
>
> On 12/7/21 12:29 AM, Nikita Popov wrote:
>
> On Mon, Dec 6, 2021 at 10:51 PM Philip Reames <listmail at philipreames.com>
> wrote:
>
>> I think this is an interesting problem.
>>
>> I'd probably lean towards the use of a separate attribute, but not
>> strongly so.
>>
>> The example which makes me prefer the separate attribute would be a
>> function with an out-param. It's very unlikely that an out-param will be
>> read on the exception path. Being able to perform DSE for such out params
>> seems quite interesting.
>>
> Right. I think it's mainly a question of whether we'd be able to infer the
> attribute in practice, in cases where it's not annotated by the frontend
> (which it should do in the sret case). I think this is possible at least
> for the case where all calls to the function pass in an alloca to this
> parameter (or another argument with nounwindread I guess) and don't use a
> landingpad for the call. However, I believe we do inference in this
> direction (RPO rather than PO) only in the module optimization pipeline,
> which means that DSE/MemCpyOpt might not be able to make use of the
> inferred information.
>
> A couple of points here:
>
> 1. I often see attributes for which inference isn't a goal as being
> under specified. It's too easy not to think about all the corner cases up
> front, and that bites us later. I think it's important to have specified
> the valid inference rules as part of the initial definition discussion,
> even if we don't implement them. It forces us to think through the
> subtleties.
> 2. I think you're alloc rule can be extended to any unescaped
> allocation for which we can indentify all accesses and that none are
> reachable on the exceptional path. The trivial call rule (there is no
> exceptional path) is one sub-case of that. This backwards walk may seem
> expensive, but I think we already do it in DSE, and could leave converting
> callsite attributes to functions attributes to a later RPO phase.
> 3. You're right that we don't really do RPO today. See point (1). I
> wouldn't want to add such just for this.
>
> In terms of detailed semantics, I think the main interesting question is
what exactly "no read on unwind" means. I see two general approaches: The
first is that reading (or possibly accessing) the argument memory after an
unwind is immediate undefined behavior. The other is that the behavior is
"as if" the argument memory is overwritten with poison on unwind. This
means that the memory can be read without UB, but it cannot depend on any
value written into it during the call. For example, if the argument memory
is fully overwritten after the call and read again afterwards, that would
still be nounwindread. I'd personally lean towards the latter
interpretation, in that it is more generally applicable without giving up
any useful optimization power that I see.
The other question would be what "argument memory" is. This could either be
the whole underlying "allocated object" associated with the argument, or
the size of the memory region would have to be specified as an attribute
argument. So something like "i32* noalias nounwindread(4) %out" to say that
the four bytes starting at the passed pointer are not read on unwind. I'd
lean towards the former here, because it is simpler in terms of
analysis/reasoning, and works even if we don't know the exact access
location, just the underlying object.
> Aside: sret has the under-specified problem today. I have no idea when it
> would be legal to infer sret.
>
I think the answer to this is "never", because sret is considered an ABI
attribute -- though to be honest I'm not really clear in which way it
actually affects the call ABI.
Regards,
Nikita
> However, I'll note that the same problem can be framed as an escape
>> problem. That is, we have an annotation not that a value is dead on the
>> exception path, but that it hasn't been captured on entry to the routine.
>> Then, we can apply local reasoning to show that the first store can't be
>> visible to may_unwind, and eliminate it.
>>
> I don't think this would solve the same problem. In the given examples the
> pointer is already not visible to @may_unwind because it is noalias.
> "noalias" here is a weaker version of "not captured before": The pointer
> may be captured, but it's illegal to write through the captured pointer,
> which is sufficient for alias analysis. The problem with unwinding is that
> after the unwind, the calling function may read the stored value in a
> landingpad, which does not require any capture of the pointer.
>
> You are completely correct, particularly on that last point. Don't know
> what I was thinking when I first responded.
>
>
> Regards,
> Nikita
>
>> I'd want to give the escape framing more thought as that seems
>> potentially more general. Does knowing that an argument does not point to
>> escaped memory on entry help on all of your motivating examples?
>>
>> Philip
>> On 12/4/21 2:39 AM, Nikita Popov via llvm-dev wrote:
>>
>> Hi,
>>
>> Consider the following IR:
>>
>> declare void @may_unwind()
>> define void @test(i32* noalias sret(i32) %out) {
>> store i32 0, i32* %out
>> call void @may_unwind()
>> store i32 1, i32* %out
>> ret void
>> }
>>
>> Currently, we can't remove the first store as dead, because the
>> @may_unwind() call may unwind, and the caller might read %out at that
>> point, making the first store visible.
>>
>> Similarly, it prevents call slot optimization in the following example,
>> because the call may unwind and make an early write to the sret argument
>> visible:
>>
>> declare void @may_unwind(i32*)
>> declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)
>> define void @test(i32* noalias sret(i32) %arg) {
>> %tmp = alloca i32
>> call void @may_unwind(i32* nocapture %tmp)
>> %tmp.8 = bitcast i32* %tmp to i8*
>> %arg.8 = bitcast i32* %arg to i8*
>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %arg.8, i8* align 4
>> %tmp.8, i64 4, i1 false)
>> ret void
>> }
>>
>> I would like to address this in some form. The easiest way would be to
>> change LangRef to specify that sret arguments cannot be read on unwind
>> paths. I think that matches how sret arguments are generally used.
>>
>> Alternatively, this could be handled using a separate attribute that can
>> be applied to any argument, something along the lines of "i32* nounwindread
>> sret(i32) %arg". The benefit would be that this is decoupled from sret ABI
>> semantics and could potentially be inferred (e.g. if the function is only
>> ever used with call and not invoke, this should be a given).
>>
>> Any thoughts on this? Is this a problem worth solving, and if yes, would
>> a new attribute be preferred over restricting sret semantics?
>>
>> Regards,
>> Nikita
>>
>> _______________________________________________
>> LLVM Developers mailing listllvm-dev at lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211213/640e762f/attachment.html>
More information about the llvm-dev
mailing list