<div dir="ltr"><div dir="ltr"><div>>> Stepwisely defining the semantics of instructions is a desirable direction<br>>> IMO.<br>

><br>> I'm confused. What in the proposal would prevent us from defining<br>> the semantics of instructions, or force us to do it in an "undesirable way"?<br></div><div><br></div><div>I meant it would be great if the output state after executing an instruction can be described using its input state.</div><div>(that was the meaning of 'stepwise semantics', I should have been more clear about this)</div><div><br></div><div>For example, the semantics of 'z = add x y' can be defined as follows:</div><div>Given an input state s, next state s' = s[z -> s(x) + s(y)]</div><div>where s(x) is the value of x in the previous state, and s[z -> v] is a state with z updated to v.</div></div><div><br></div><div>Another example that involves memory: the semantics of 'i = ptrtoint p' can be defined as follows:</div><div>Given an input state s, next state s' = s[i -> s(p).obj.address + s(p).offset]</div><div>where obj.address is the begin address of a memory object obj pointed by p & offset is p's byte offset. (Imagine a pointer to the offset of some char array).</div><div>Note that ptrtoint & add can be nicely defined w.r.t the input state.</div><div><br></div><div><br></div><div>Now, the instruction that we're facing is 'p = alloca'.</div><div>To describe the output state after executing 'p = alloca', the address of new alloca should be determined.</div><div>If observedness is involved, we need to know the future state again. :/ We don't know whether the alloca is going to be observed or not without seeing the future.</div><div>This is the problem of the current lifetime intrinsics as well.</div><div><br></div><div><br></div><div>One possible approach to resolve this is adding an 'unobserved' flag to alloca instruction (similar to what was suggested by Nicolai).</div><div>And, we can say that if alloca with 'unobserved' is used by ptrtoint/icmp, it is UB.</div><div>But this makes ptrtoint/icmp make UB-raising instructions, which contradicts with what LLVM does.</div><div>Also, existing optimizations like loop versioning can introduce ptrtoint/pointer comparisons too.</div><div><br></div><div><br></div><div>I see that there are other questions that I didn't answer yet, but let me answer this first to limit the length of the text :)</div><div><br></div><div>Thanks,</div><div>Juneyoung</div><div><br></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jan 7, 2021 at 3:36 AM Johannes Doerfert <<a href="mailto:johannesdoerfert@gmail.com">johannesdoerfert@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 1/5/21 8:00 PM, Juneyoung Lee wrote:<br>

> Hi Johannes,<br>

><br>

> I read your proposal and thought about the model.<br>

<br>

Cool, thanks!<br>

<br>

<br>

><br>

> As you concerned in A3, certain programs may be valid only when memory<br>

> blocks with overlapping lifetimes have disjoint addresses.<br>

> Look at this example (I'm using malloc, but alloca also works):<br>

><br>

> p1 = malloc(4)<br>

> p2 = malloc(4) // for brevity, assume that there as enough space & p1 and<br>

> p2 != null<br>

> set<char*> s;<br>

> s.insert(p1); s.insert(p2); // If the second insert did nothing, it would<br>

> be surprise to programmers<br>

> work(s);<br>

> free(data1)<br>

> free(data2)<br>

><br>

> Clearly, IR semantics should guarantee that escaped blocks are disjoint. It<br>

> would be great for verification tools on LLVM IR to be able to answer that<br>

> the second insert will succeed.<br>

<br>

I agree, the second insert should succeed, assuming `p1 && p2`.<br>

I don't think my proposal would in any way impact the program above,<br>

if anything the A3 reasoning makes sure such a program with allocas<br>

is not miscompiled.<br>

<br>

I'm also not sure I understand what you try to argue for. Maybe<br>

elaborate a bit what it is you think is bad or needs to be changed.<br>

<br>

<br>

> The problem is that definition of escapedness is not clear at the semantic<br>

> level. Describing the IR semantics w.r.t. LLVM's escaped analysis function<br>

> isn't something we want.<br>

><br>

> Semantic definition of escapedness of a pointer seems hard, I mean in a<br>

> stepwise manner.<br>

> It isn't a single instruction such as 'escape i8* ptr', and we need to look<br>

> over all instructions in the function. For example, '(int)(p+1) - (int)p'<br>

> isn't semantically escaping the pointer p because the result is 1<br>

> regardless of the value of p.<br>

> Stepwisely defining the semantics of instructions is a desirable direction<br>

> IMO.<br>

<br>

I'm confused. What in the proposal would prevent us from defining<br>

the semantics of instructions, or force us to do it in an "undesirable way"?<br>

<br>

<br>

><br>

> In practice, syntactically checking escapedness + nice engineering might<br>

> not break optimizations in most cases (as undef/poison did); but it would<br>

> be great to move to another level, since LLVM IR is used in so many places<br>

> :)<br>

<br>

The property under which you can coalesce objects is simple:<br>

   It is not observable.<br>

<br>

Now, if the address of one of the two objects you coalesce is not<br>

observed, coalescing is not observable. That is a sufficient condition<br>

not a necessary one. Pointer "escaping" is one step further. If the<br>

address doesn't escape it is not observed. This does not mean the<br>

"semantic conditions for coalescing", e.g., for the purpose of translation<br>

validation, is supposed to be build on top of our "definition of escaping<br>

pointers". That said, we use "does not escape" as a precondition for<br>

various transformation and I'm unsure what is any different now. The<br>

entire escape argument is only used in the validity of the pointer folding.<br>

Similarly, we can fold a comparison of a noalias pointer with another one<br>

if the former does not escape (and both are dereferenced and one is<br>

written for sure).<br>

<br>

<br>

> The pointer comparison is another beast. It indeed has a few issues, and<br>

> solving it might require nontrivial solution.<br>

<br>

I think the main problem of the inconsistencies and such we've seen is<br>

rooted in the erroneous folding of pointer comparisons. Cleaning up the<br>

lifetime marker semantics is actually unrelated and simply not folding<br>

as described in A3 should solve the issue that has been reported. <br>

Nevertheless,<br>

we should take a crack at lifetime markers while we are here.<br>

<br>

<br>

~ Johannes<br>

<br>

<br>

<br>

><br>

> Thanks,<br>

> Juneyoung<br>

><br>

> On Tue, Jan 5, 2021 at 9:37 AM Johannes Doerfert <<a href="mailto:johannesdoerfert@gmail.com" target="_blank">johannesdoerfert@gmail.com</a>><br>

> wrote:<br>

><br>

>> Hi Juneyoung,<br>

>><br>

>> Happy new year :)<br>

>><br>

>> After we had a lengthy discussion on phab last year, I've tried to<br>

>> summarize my thoughts,<br>

>> especially given that I had some time to think about things over the break.<br>

>> Still, no promises on the quality ;)<br>

>><br>

>> I start with general questions I asked myself and then go on rambling<br>

>> about a potential design.<br>

>><br>

>><br>

>> Q1: Is lifetime a given property or a derived one, thus is it fixed or<br>

>> modifiable?<br>

>><br>

>> This is a question I asked myself a lot recently. I think it is derived<br>

>> and modifiable,<br>

>> at least I hope it is. Only that would allow transformations I would want<br>

>> us to do. Here are some examples:<br>

>>     <a href="https://godbolt.org/z/G8obj3" rel="noreferrer" target="_blank">https://godbolt.org/z/G8obj3</a><br>

>>     <a href="https://godbolt.org/z/obaTc" rel="noreferrer" target="_blank">https://godbolt.org/z/obaTc</a><br>

>><br>

>><br>

>> Q2: Is a pointer comparison, or similar use, extending the lifetime?<br>

>><br>

>> Asked differently, can we hoist a pointer comparison into a region where<br>

>> the pointer is dead?<br>

>> This is an important question which we haven't discussed much as we<br>

>> assumed LICM has to work.<br>

>><br>

>> The current behavior is that non-dereferencing uses are not extending<br>

>> the lifetime and are<br>

>> allowed outside of "lifetime regions" (as indicated by markers). They<br>

>> will always produce valid<br>

>> results. Though, we might want to think about a lifetime marker that<br>

>> spits out a new pointer<br>

>> value instead of reusing the old one.<br>

>><br>

>><br>

>> Q3: Can we use lifetime to argue about addresses?<br>

>><br>

>> The question here is, can we fold address comparisons based on<br>

>> lifetimes, or not.<br>

>><br>

>> So far, we fold comparisons based on "address information". For example,<br>

>> we "know" globals,<br>

>> allocas, and mallocs cannot be equal to one another. Also, two distinct<br>

>> allocations, for globals<br>

>> and allocas, are considered unequal. Now, the crux is that we have to be<br>

>> consistent if we do two<br>

>> comparisons, and, as of now, we are not (bug number missing). Since the<br>

>> backend (or any other place<br>

>> for that matter) might coalesce allocas, their addresses might not be<br>

>> different after all. If we<br>

>> already folded a comparison to "unequal" we are doomed if we later have<br>

>> a comparison that results<br>

>> in "equal". (Note, this is different from aliasing rules as they can be<br>

>> stricter.)<br>

>><br>

>><br>

>> Design:<br>

>><br>

>> I would hope we can come up with a model that treats memory "the same",<br>

>> regardless if it is global,<br>

>> stack, or heap. I want to avoid special casing one of them wrt. lifetime<br>

>> as I believe most optimizations<br>

>> would apply to any of them, potentially for different reasons and with<br>

>> different gains, but nevertheless.<br>

>><br>

>><br>

>> Proposal (largely based on the conversation in phab):<br>

>><br>

>> A1: Lifetime is a concept that talks about memory content *only*.<br>

>> Basically, the memory content is set to<br>

>>       undefined by lifetime markers. It is derived/modifiable as it is<br>

>> just an "as-is" property of the memory<br>

>>       content. The lifetimes of an object, as described by markers, might<br>

>> change during the compilation. They<br>

>>       might become smaller if we deduce the object is not accessed and<br>

>> the memory content is not used, they<br>

>>       might become larger if objects with non-overlapping lifetimes are<br>

>> coalesced. (One could see the latter as<br>

>>       introducing a new object though.)<br>

>><br>

>> A2: If we define lifetime as above, it has nothing to do with the<br>

>> address of an object. Consequently, pointer<br>

>>       comparisons and similar operations are valid outside the lifetime.<br>

>> Loads and stores are as well, they can<br>

>>       even not be removed "easily". A store followed by a lifetime marker<br>

>> or a load following a lifetime marker<br>

>>       is dead or results in undef respectively.<br>

>><br>

>> A3: We could not use lifetime to argue about addresses. This means we<br>

>> could/should also not argue that overlapping<br>

>>       lifetimes result in different addresses. Thus, a comparison between<br>

>> the address of two allocas could not<br>

>>       immediately be folded. That said, there would be special cases<br>

>> though. Basically, if one of the allocas does<br>

>>       not escape, other than the comparisons to be folded, we can fold<br>

>> them. Afterwards, coalescing or splitting<br>

>>       would still be consistent because it is unobservable. The problem<br>

>> we have in-tree is that we fold even though<br>

>>       the address is still observed (after the fold). It is still unclear<br>

>> to me what the impact of this would be<br>

>>       on real code. I suggested before that we run some experiments first<br>

>> before we make any decision whatsoever.<br>

>><br>

>> This is pretty much saying that lifetime markers are `memset(undef)`, as<br>

>> you suggested before (I think).<br>

>> There are some implementation level differences but at the end of the<br>

>> day they are basically the same.<br>

>><br>

>> Happy to hear some thoughts on this, especially if it fixes the problems<br>

>> that lead to D93376 in the first place.<br>

>><br>

>> ~ Johannes<br>

>><br>

>><br>

>> On 12/18/20 2:42 AM, Juneyoung Lee via llvm-dev wrote:<br>

>>> Hello all,<br>

>>><br>

>>> We're discussing the well-formedness of lifetime.start/end intrinsic<br>

>> here (<br>

>>> <a href="https://reviews.llvm.org/D93376" rel="noreferrer" target="_blank">https://reviews.llvm.org/D93376</a>), deciding what is a (syntactically &<br>

>>> semantically) valid use of these intrinsics.<br>

>>><br>

>>> I'd like to gather more context about the intrinsics.<br>

>>><br>

>>> 1. Is there a frontend or framework that introduces lifetime call with<br>

>>> non-stack allocated objects?<br>

>>> If exists, which behavior do they expect from it?<br>

>>><br>

>>> 2. Can a block's lifetime be started bytewise or elementwise?<br>

>>> I imagine an optimization that allow using stack very compactly, but<br>

>> wonder<br>

>>> there is a real-world use case.<br>

>>><br>

>>> Thanks,<br>

>>> Juneyoung<br>

>>><br>

>>><br>

>>> _______________________________________________<br>

>>> LLVM Developers mailing list<br>

>>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

>>> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

><br>

<br>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><br></div><font size="1">Juneyoung Lee</font><div><font size="1">Software Foundation Lab, Seoul National University</font></div></div></div></div>