<div dir="ltr"><div dir="ltr"><div dir="ltr">Me neither ATM, but it should be me if one has to do the experiment because I was pushing the patch.</div><div dir="ltr"><br></div><div>Thanks,</div><div>Juneyoung</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Jan 9, 2021 at 1:16 AM Johannes Doerfert <<a href="mailto:johannesdoerfert@gmail.com">johannesdoerfert@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">I think a sensible next step would be to determine the impact of A3 (below)<br>

on real code. Basically, "fixing" test3 here: <br>

<a href="https://opt.godbolt.org/z/Wesn5r" rel="noreferrer" target="_blank">https://opt.godbolt.org/z/Wesn5r</a><br>

<br>

I doubt I will get to that any time soon. Juneyoung, Nikita, would you <br>

be able to<br>

run some experiments?<br>

<br>

<br>

A3: We could not use lifetime to argue about addresses. This means we <br>

could/should also not argue that overlapping<br>

     lifetimes result in different addresses. Thus, a comparison between <br>

the address of two allocas could not<br>

     immediately be folded. That said, there would be special cases <br>

though. Basically, if one of the allocas does<br>

     not escape, other than the comparisons to be folded, we can fold <br>

them. Afterwards, coalescing or splitting<br>

     would still be consistent because it is unobservable. The problem <br>

we have in-tree is that we fold even though<br>

     the address is still observed (after the fold). It is still unclear <br>

to me what the impact of this would be<br>

     on real code. I suggested before that we run some experiments first <br>

before we make any decision whatsoever.<br>

<br>

<br>

On 1/7/21 10:12 PM, Juneyoung Lee wrote:<br>

> On Fri, Jan 8, 2021 at 8:54 AM Johannes Doerfert <<a href="mailto:johannesdoerfert@gmail.com" target="_blank">johannesdoerfert@gmail.com</a>><br>

> wrote:<br>

><br>

>> On 1/7/21 4:55 PM, Juneyoung Lee wrote:<br>

>>> It is, alloca must have its integral address decided at its *allocation*<br>

>>> time as well, like 0xFF00... .<br>

>>><br>

>>> For example, if 0xFFFF0000 ~ 0xFF000000 is already allocated, the new<br>

>>> alloca cannot be placed there unless it is not observed in the future,<br>

>>> according to your proposal.<br>

>>><br>

>>> But how do we know from the current state whether the stack variable is<br>

>>> going to be observed or not?<br>

>> With that argument you could allocate all but 4 bytes, do a malloc(4)<br>

>> and know the address of the returned pointer (assuming it is not null).<br>

>><br>

>> What I try to say is, either your scenario is part of the model and<br>

>> everything we do is broken as you could "observe" addresses passively,<br>

>> *or*, the abstract machine we use for semantic reasoning doesn't permit<br>

>> the above reasoning. I really hope for the latter.<br>

>><br>

> That's a very valid point..!<br>

><br>

> Actually, I have a memory model that addresses this problem.<br>

> The gist is that we can interpret each allocation instruction as *creating<br>

> 2 blocks* and nondeterministically returning one of them.<br>

> The other one is marked as invalid but not freed immediately. Deallocation<br>

> frees both blocks.<br>

> If there is not enough space for having two blocks, it is treated as<br>

> out-of-memory.<br>

><br>

> It makes guessing the address of an allocation according to memory layout<br>

> invalid UB.<br>

><br>

> p = malloc(4)<br>

> // If p != null, it is guaranteed there was at least two 4 byte slots<br>

> available, say 0x100~0x104 and 0x110~0x114.<br>

> // Two blocks are allocated at 0x110 and 0x114, and one of the addresses is<br>

> nondeterministically returned.<br>

> // By the nature of nondeterminism, all executions below should be<br>

> well-defined regardless of the address of p.<br>

> *(int*)0x100 = 10; // In an execution where p is 0x110, this raises UB.<br>

><br>

> The paper link is here <<a href="https://dl.acm.org/doi/10.1145/3276495" rel="noreferrer" target="_blank">https://dl.acm.org/doi/10.1145/3276495</a>> FYI.<br>

<br>

Nice! Thanks for the link :)<br>

<br>

<br>

> To argue differently: Who is to say there is a stack, or only one,<br>

>> or that alloca allocates memory "on the one stack"? That is not part of<br>

>> the IR, IMHO. I can write a machine on which alloca lowers to malloc,<br>

>> I don't even need the free during stack unwind but that I could do as<br>

>> well if I wanted to.<br>

><br>

> This is right, the example was for illustrative purposes. IR does not<br>

> enforce alloca to be placed at 'stack'.<br>

><br>

><br>

>>> Making icmp/ptrtoint yield poison will still make loop versioning or<br>

>>> pointer rewriting transformations unsound because these operations now<br>

>> can<br>

>>> create poison (even if pointers are noundef).<br>

>> I did not say they yield poison, at least I did not try to say that.<br>

>> What are you referring to exactly?<br>

>><br>

> I was referring to this:<br>

><br>

>>> But this makes ptrtoint/icmp make UB-raising instructions, which<br>

>>> contradicts with what LLVM does.<br>

>> As with other violation of attributes I would, on first though, suggest<br>

>> to produce poison, not UB.<br>

> But it is more about the (imaginary) attribute, so maybe I was slightly out<br>

> of topic.<br>

><br>

>> 2) is fine, I think the suggestion semantically makes sense perfectly. 1)<br>

>>> is something I'm concerned about now.<br>

>>><br>

>>> There are more than pointer foldings, such as rewriting such expression,<br>

>>> code motion ptr cmp, introduce ptr cmp, etc. There is also analysis<br>

>> relying<br>

>>> on ptr cmp.<br>

>>> Defining the correctness of each of them is something we want to avoid,<br>

>> and<br>

>>> maybe that's why we want to define precise semantics for things.<br>

>> I don't get the point. My proposal does not change the semantics of<br>

>> pointer comparisons, at all. I explicitly mentioned that in the last<br>

>> email.<br>

>><br>

> Oh okay, I thought it was a part of the lifetime proposal, but it seems<br>

> more like a separate thing.<br>

> I agree that this requires performance impact.<br>

> Also investigation of existing transformations would be needed; Alive2's<br>

> pointer comparison is doing approximation yet, but if it becomes fully<br>

> precise, it will show something from running LLVM unit tests I believe..! :)<br>

><br>

>> I think this will be less aggressive and may give nice feedback to<br>

>>> potential projects that are using lifetime with non-alloca.<br>

>> The lifetime marker debate, basically 2) above, is orthogonal to the<br>

>> problem<br>

>> you try to solve. It got mixed in as lifetime markers were used by<br>

>> StackColoring<br>

>> to perform coalescing but that is coincidental. You can (arguably)<br>

>> coalesce stack<br>

>> allocations regardless of lifetime markers and with 1) such a<br>

>> transformation<br>

>> (w/ and w/o lifetime markers) would actually be sound.<br>

>><br>

>><br>

>>> At the end, we can implement IR writer that lowers lifetime with<br>

>> non-alloca<br>

>>> into memset(undef). WDYT?<br>

>> Yeah, 2) is orthogonal and we can lower it that way. Unsure if it is<br>

>> helpful<br>

>> but we can certainly define it that way in the LangRef.<br>

>><br>

> Okay, thanks!<br>

><br>

><br>

>> ~ Johannes<br>

>><br>

>><br>

>><br>

>>> Thanks,<br>

>>> Juneyoung<br>

>>><br>

>>> p.s. The reply was late, sorry. I think I can spend more time on this<br>

>> today.<br>

>>> On Thu, Jan 7, 2021 at 9:02 AM Johannes Doerfert <<br>

>> <a href="mailto:johannesdoerfert@gmail.com" target="_blank">johannesdoerfert@gmail.com</a>><br>

>>> wrote:<br>

>>><br>

>>>> On 1/6/21 4:33 PM, Juneyoung Lee wrote:<br>

>>>>>>> Stepwisely defining the semantics of instructions is a desirable<br>

>>>>> direction<br>

>>>>>>> IMO.<br>

>>>>>> I'm confused. What in the proposal would prevent us from defining<br>

>>>>>> the semantics of instructions, or force us to do it in an "undesirable<br>

>>>>> way"?<br>

>>>>><br>

>>>>> I meant it would be great if the output state after executing an<br>

>>>>> instruction can be described using its input state.<br>

>>>>> (that was the meaning of 'stepwise semantics', I should have been more<br>

>>>>> clear about this)<br>

>>>>><br>

>>>>> For example, the semantics of 'z = add x y' can be defined as follows:<br>

>>>>> Given an input state s, next state s' = s[z -> s(x) + s(y)]<br>

>>>>> where s(x) is the value of x in the previous state, and s[z -> v] is a<br>

>>>>> state with z updated to v.<br>

>>>>><br>

>>>>> Another example that involves memory: the semantics of 'i = ptrtoint p'<br>

>>>> can<br>

>>>>> be defined as follows:<br>

>>>>> Given an input state s, next state s' = s[i -> s(p).obj.address +<br>

>>>>> s(p).offset]<br>

>>>>> where obj.address is the begin address of a memory object obj pointed<br>

>> by<br>

>>>> p<br>

>>>>> & offset is p's byte offset. (Imagine a pointer to the offset of some<br>

>>>> char<br>

>>>>> array).<br>

>>>>> Note that ptrtoint & add can be nicely defined w.r.t the input state.<br>

>>>>><br>

>>>>><br>

>>>>> Now, the instruction that we're facing is 'p = alloca'.<br>

>>>>> To describe the output state after executing 'p = alloca', the address<br>

>> of<br>

>>>>> new alloca should be determined.<br>

>>>>> If observedness is involved, we need to know the future state again. :/<br>

>>>> We<br>

>>>>> don't know whether the alloca is going to be observed or not without<br>

>>>> seeing<br>

>>>>> the future.<br>

>>>>> This is the problem of the current lifetime intrinsics as well.<br>

>>>> No, you mix things up here.<br>

>>>><br>

>>>> Nobody proposed to modify the semantics of `alloca`.<br>

>>>> `alloca` provides you with a fresh, unobserved block of<br>

>>>> dereferenceable memory that is implicitly freed as the stack<br>

>>>> unwinds. That is it. No context necessary.<br>

>>>><br>

>>>> If you want to modify the IR, you need to argue the observable<br>

>>>> semantics which is nothing new. That this might require more than<br>

>>>> a peephole view of the program is also not new.<br>

>>>><br>

>>>><br>

>>>>> One possible approach to resolve this is adding an 'unobserved' flag to<br>

>>>>> alloca instruction (similar to what was suggested by Nicolai).<br>

>>>>> And, we can say that if alloca with 'unobserved' is used by<br>

>>>> ptrtoint/icmp,<br>

>>>>> it is UB.<br>

>>>> The flag can be added, like we add other attributes. It should not<br>

>>>> be required for any optimization we talked about though. It basically<br>

>>>> is a way to manifest derived or given information into the IR.<br>

>>>><br>

>>>> Attribute deduction, as well as frontends with domain knowledge,<br>

>>>> can add such information. The flag we discussed in phab was not even<br>

>>>> sufficient for all the transformation examples I presented in my mail,<br>

>>>> that is why I extended my  argument. We could still have a "noescape"<br>

>>>> flag for allocas, but I'm not sure how useful that really is. We can<br>

>>>> certainly deduce it and manifest it, unsure if we have domain knowledge<br>

>>>> we can use for non-trivial cases though.<br>

>>>><br>

>>>><br>

>>>>> But this makes ptrtoint/icmp make UB-raising instructions, which<br>

>>>>> contradicts with what LLVM does.<br>

>>>> As with other violation of attributes I would, on first though, suggest<br>

>>>> to produce poison, not UB.<br>

>>>><br>

>>>><br>

>>>>> Also, existing optimizations like loop versioning can introduce<br>

>>>>> ptrtoint/pointer comparisons too.<br>

>>>> Sure. I am not certain why that is a problem. I get the feeling things<br>

>>>> are still mixed up here.<br>

>>>><br>

>>>> What I proposed is twofold:<br>

>>>><br>

>>>> 1) We stop folding comparisons between different allocas if changing the<br>

>>>> address<br>

>>>>       of both might be observable. Thus, if both might have their address<br>

>>>> "taken"/escaped,<br>

>>>>       other than the comparisons we want to fold, we cannot proceed.<br>

>>>><br>

>>>> 2) We define lifetime markers to mean `memset(undef)`.<br>

>>>><br>

>>>> The first should be sufficient for the problem you were trying to solve<br>

>>>> in the<br>

>>>> first place. The second makes lifetime markers less weird. Note that 1)<br>

>>>> is not changing<br>

>>>> the semantics of the IR. We basically just argue there is a bug in our<br>

>>>> instcombine right<br>

>>>> now as we do not check all necessary preconditions.<br>

>>>><br>

>>>><br>

>>>>> I see that there are other questions that I didn't answer yet, but let<br>

>> me<br>

>>>>> answer this first to limit the length of the text :)<br>

>>>> Sure, we can split the discussion :)<br>

>>>><br>

>>>> ~ Johannes<br>

>>>><br>

>>>><br>

>>>>> Thanks,<br>

>>>>> Juneyoung<br>

>>>>><br>

>>>>> On Thu, Jan 7, 2021 at 3:36 AM Johannes Doerfert <<br>

>>>> <a href="mailto:johannesdoerfert@gmail.com" target="_blank">johannesdoerfert@gmail.com</a>><br>

>>>>> wrote:<br>

>>>>><br>

>>>>>> On 1/5/21 8:00 PM, Juneyoung Lee wrote:<br>

>>>>>>> Hi Johannes,<br>

>>>>>>><br>

>>>>>>> I read your proposal and thought about the model.<br>

>>>>>> Cool, thanks!<br>

>>>>>><br>

>>>>>><br>

>>>>>>> As you concerned in A3, certain programs may be valid only when<br>

>> memory<br>

>>>>>>> blocks with overlapping lifetimes have disjoint addresses.<br>

>>>>>>> Look at this example (I'm using malloc, but alloca also works):<br>

>>>>>>><br>

>>>>>>> p1 = malloc(4)<br>

>>>>>>> p2 = malloc(4) // for brevity, assume that there as enough space & p1<br>

>>>> and<br>

>>>>>>> p2 != null<br>

>>>>>>> set<char*> s;<br>

>>>>>>> s.insert(p1); s.insert(p2); // If the second insert did nothing, it<br>

>>>> would<br>

>>>>>>> be surprise to programmers<br>

>>>>>>> work(s);<br>

>>>>>>> free(data1)<br>

>>>>>>> free(data2)<br>

>>>>>>><br>

>>>>>>> Clearly, IR semantics should guarantee that escaped blocks are<br>

>>>> disjoint.<br>

>>>>>> It<br>

>>>>>>> would be great for verification tools on LLVM IR to be able to answer<br>

>>>>>> that<br>

>>>>>>> the second insert will succeed.<br>

>>>>>> I agree, the second insert should succeed, assuming `p1 && p2`.<br>

>>>>>> I don't think my proposal would in any way impact the program above,<br>

>>>>>> if anything the A3 reasoning makes sure such a program with allocas<br>

>>>>>> is not miscompiled.<br>

>>>>>><br>

>>>>>> I'm also not sure I understand what you try to argue for. Maybe<br>

>>>>>> elaborate a bit what it is you think is bad or needs to be changed.<br>

>>>>>><br>

>>>>>><br>

>>>>>>> The problem is that definition of escapedness is not clear at the<br>

>>>>>> semantic<br>

>>>>>>> level. Describing the IR semantics w.r.t. LLVM's escaped analysis<br>

>>>>>> function<br>

>>>>>>> isn't something we want.<br>

>>>>>>><br>

>>>>>>> Semantic definition of escapedness of a pointer seems hard, I mean<br>

>> in a<br>

>>>>>>> stepwise manner.<br>

>>>>>>> It isn't a single instruction such as 'escape i8* ptr', and we need<br>

>> to<br>

>>>>>> look<br>

>>>>>>> over all instructions in the function. For example, '(int)(p+1) -<br>

>>>> (int)p'<br>

>>>>>>> isn't semantically escaping the pointer p because the result is 1<br>

>>>>>>> regardless of the value of p.<br>

>>>>>>> Stepwisely defining the semantics of instructions is a desirable<br>

>>>>>> direction<br>

>>>>>>> IMO.<br>

>>>>>> I'm confused. What in the proposal would prevent us from defining<br>

>>>>>> the semantics of instructions, or force us to do it in an "undesirable<br>

>>>>>> way"?<br>

>>>>>><br>

>>>>>><br>

>>>>>>> In practice, syntactically checking escapedness + nice engineering<br>

>>>> might<br>

>>>>>>> not break optimizations in most cases (as undef/poison did); but it<br>

>>>> would<br>

>>>>>>> be great to move to another level, since LLVM IR is used in so many<br>

>>>>>> places<br>

>>>>>>> :)<br>

>>>>>> The property under which you can coalesce objects is simple:<br>

>>>>>>       It is not observable.<br>

>>>>>><br>

>>>>>> Now, if the address of one of the two objects you coalesce is not<br>

>>>>>> observed, coalescing is not observable. That is a sufficient condition<br>

>>>>>> not a necessary one. Pointer "escaping" is one step further. If the<br>

>>>>>> address doesn't escape it is not observed. This does not mean the<br>

>>>>>> "semantic conditions for coalescing", e.g., for the purpose of<br>

>>>> translation<br>

>>>>>> validation, is supposed to be build on top of our "definition of<br>

>>>> escaping<br>

>>>>>> pointers". That said, we use "does not escape" as a precondition for<br>

>>>>>> various transformation and I'm unsure what is any different now. The<br>

>>>>>> entire escape argument is only used in the validity of the pointer<br>

>>>> folding.<br>

>>>>>> Similarly, we can fold a comparison of a noalias pointer with another<br>

>>>> one<br>

>>>>>> if the former does not escape (and both are dereferenced and one is<br>

>>>>>> written for sure).<br>

>>>>>><br>

>>>>>><br>

>>>>>>> The pointer comparison is another beast. It indeed has a few issues,<br>

>>>> and<br>

>>>>>>> solving it might require nontrivial solution.<br>

>>>>>> I think the main problem of the inconsistencies and such we've seen is<br>

>>>>>> rooted in the erroneous folding of pointer comparisons. Cleaning up<br>

>> the<br>

>>>>>> lifetime marker semantics is actually unrelated and simply not folding<br>

>>>>>> as described in A3 should solve the issue that has been reported.<br>

>>>>>> Nevertheless,<br>

>>>>>> we should take a crack at lifetime markers while we are here.<br>

>>>>>><br>

>>>>>><br>

>>>>>> ~ Johannes<br>

>>>>>><br>

>>>>>><br>

>>>>>><br>

>>>>>>> Thanks,<br>

>>>>>>> Juneyoung<br>

>>>>>>><br>

>>>>>>> On Tue, Jan 5, 2021 at 9:37 AM Johannes Doerfert <<br>

>>>>>> <a href="mailto:johannesdoerfert@gmail.com" target="_blank">johannesdoerfert@gmail.com</a>><br>

>>>>>>> wrote:<br>

>>>>>>><br>

>>>>>>>> Hi Juneyoung,<br>

>>>>>>>><br>

>>>>>>>> Happy new year :)<br>

>>>>>>>><br>

>>>>>>>> After we had a lengthy discussion on phab last year, I've tried to<br>

>>>>>>>> summarize my thoughts,<br>

>>>>>>>> especially given that I had some time to think about things over the<br>

>>>>>> break.<br>

>>>>>>>> Still, no promises on the quality ;)<br>

>>>>>>>><br>

>>>>>>>> I start with general questions I asked myself and then go on<br>

>> rambling<br>

>>>>>>>> about a potential design.<br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> Q1: Is lifetime a given property or a derived one, thus is it fixed<br>

>> or<br>

>>>>>>>> modifiable?<br>

>>>>>>>><br>

>>>>>>>> This is a question I asked myself a lot recently. I think it is<br>

>>>> derived<br>

>>>>>>>> and modifiable,<br>

>>>>>>>> at least I hope it is. Only that would allow transformations I would<br>

>>>>>> want<br>

>>>>>>>> us to do. Here are some examples:<br>

>>>>>>>>        <a href="https://godbolt.org/z/G8obj3" rel="noreferrer" target="_blank">https://godbolt.org/z/G8obj3</a><br>

>>>>>>>>        <a href="https://godbolt.org/z/obaTc" rel="noreferrer" target="_blank">https://godbolt.org/z/obaTc</a><br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> Q2: Is a pointer comparison, or similar use, extending the lifetime?<br>

>>>>>>>><br>

>>>>>>>> Asked differently, can we hoist a pointer comparison into a region<br>

>>>> where<br>

>>>>>>>> the pointer is dead?<br>

>>>>>>>> This is an important question which we haven't discussed much as we<br>

>>>>>>>> assumed LICM has to work.<br>

>>>>>>>><br>

>>>>>>>> The current behavior is that non-dereferencing uses are not<br>

>> extending<br>

>>>>>>>> the lifetime and are<br>

>>>>>>>> allowed outside of "lifetime regions" (as indicated by markers).<br>

>> They<br>

>>>>>>>> will always produce valid<br>

>>>>>>>> results. Though, we might want to think about a lifetime marker that<br>

>>>>>>>> spits out a new pointer<br>

>>>>>>>> value instead of reusing the old one.<br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> Q3: Can we use lifetime to argue about addresses?<br>

>>>>>>>><br>

>>>>>>>> The question here is, can we fold address comparisons based on<br>

>>>>>>>> lifetimes, or not.<br>

>>>>>>>><br>

>>>>>>>> So far, we fold comparisons based on "address information". For<br>

>>>> example,<br>

>>>>>>>> we "know" globals,<br>

>>>>>>>> allocas, and mallocs cannot be equal to one another. Also, two<br>

>>>> distinct<br>

>>>>>>>> allocations, for globals<br>

>>>>>>>> and allocas, are considered unequal. Now, the crux is that we have<br>

>> to<br>

>>>> be<br>

>>>>>>>> consistent if we do two<br>

>>>>>>>> comparisons, and, as of now, we are not (bug number missing). Since<br>

>>>> the<br>

>>>>>>>> backend (or any other place<br>

>>>>>>>> for that matter) might coalesce allocas, their addresses might not<br>

>> be<br>

>>>>>>>> different after all. If we<br>

>>>>>>>> already folded a comparison to "unequal" we are doomed if we later<br>

>>>> have<br>

>>>>>>>> a comparison that results<br>

>>>>>>>> in "equal". (Note, this is different from aliasing rules as they can<br>

>>>> be<br>

>>>>>>>> stricter.)<br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> Design:<br>

>>>>>>>><br>

>>>>>>>> I would hope we can come up with a model that treats memory "the<br>

>>>> same",<br>

>>>>>>>> regardless if it is global,<br>

>>>>>>>> stack, or heap. I want to avoid special casing one of them wrt.<br>

>>>> lifetime<br>

>>>>>>>> as I believe most optimizations<br>

>>>>>>>> would apply to any of them, potentially for different reasons and<br>

>> with<br>

>>>>>>>> different gains, but nevertheless.<br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> Proposal (largely based on the conversation in phab):<br>

>>>>>>>><br>

>>>>>>>> A1: Lifetime is a concept that talks about memory content *only*.<br>

>>>>>>>> Basically, the memory content is set to<br>

>>>>>>>>          undefined by lifetime markers. It is derived/modifiable as<br>

>> it<br>

>>>> is<br>

>>>>>>>> just an "as-is" property of the memory<br>

>>>>>>>>          content. The lifetimes of an object, as described by<br>

>> markers,<br>

>>>>>> might<br>

>>>>>>>> change during the compilation. They<br>

>>>>>>>>          might become smaller if we deduce the object is not accessed<br>

>>>> and<br>

>>>>>>>> the memory content is not used, they<br>

>>>>>>>>          might become larger if objects with non-overlapping<br>

>> lifetimes<br>

>>>> are<br>

>>>>>>>> coalesced. (One could see the latter as<br>

>>>>>>>>          introducing a new object though.)<br>

>>>>>>>><br>

>>>>>>>> A2: If we define lifetime as above, it has nothing to do with the<br>

>>>>>>>> address of an object. Consequently, pointer<br>

>>>>>>>>          comparisons and similar operations are valid outside the<br>

>>>> lifetime.<br>

>>>>>>>> Loads and stores are as well, they can<br>

>>>>>>>>          even not be removed "easily". A store followed by a lifetime<br>

>>>>>> marker<br>

>>>>>>>> or a load following a lifetime marker<br>

>>>>>>>>          is dead or results in undef respectively.<br>

>>>>>>>><br>

>>>>>>>> A3: We could not use lifetime to argue about addresses. This means<br>

>> we<br>

>>>>>>>> could/should also not argue that overlapping<br>

>>>>>>>>          lifetimes result in different addresses. Thus, a comparison<br>

>>>>>> between<br>

>>>>>>>> the address of two allocas could not<br>

>>>>>>>>          immediately be folded. That said, there would be special<br>

>> cases<br>

>>>>>>>> though. Basically, if one of the allocas does<br>

>>>>>>>>          not escape, other than the comparisons to be folded, we can<br>

>>>> fold<br>

>>>>>>>> them. Afterwards, coalescing or splitting<br>

>>>>>>>>          would still be consistent because it is unobservable. The<br>

>>>> problem<br>

>>>>>>>> we have in-tree is that we fold even though<br>

>>>>>>>>          the address is still observed (after the fold). It is still<br>

>>>>>> unclear<br>

>>>>>>>> to me what the impact of this would be<br>

>>>>>>>>          on real code. I suggested before that we run some<br>

>> experiments<br>

>>>>>> first<br>

>>>>>>>> before we make any decision whatsoever.<br>

>>>>>>>><br>

>>>>>>>> This is pretty much saying that lifetime markers are<br>

>> `memset(undef)`,<br>

>>>> as<br>

>>>>>>>> you suggested before (I think).<br>

>>>>>>>> There are some implementation level differences but at the end of<br>

>> the<br>

>>>>>>>> day they are basically the same.<br>

>>>>>>>><br>

>>>>>>>> Happy to hear some thoughts on this, especially if it fixes the<br>

>>>> problems<br>

>>>>>>>> that lead to D93376 in the first place.<br>

>>>>>>>><br>

>>>>>>>> ~ Johannes<br>

>>>>>>>><br>

>>>>>>>><br>

>>>>>>>> On 12/18/20 2:42 AM, Juneyoung Lee via llvm-dev wrote:<br>

>>>>>>>>> Hello all,<br>

>>>>>>>>><br>

>>>>>>>>> We're discussing the well-formedness of lifetime.start/end<br>

>> intrinsic<br>

>>>>>>>> here (<br>

>>>>>>>>> <a href="https://reviews.llvm.org/D93376" rel="noreferrer" target="_blank">https://reviews.llvm.org/D93376</a>), deciding what is a<br>

>> (syntactically<br>

>>>> &<br>

>>>>>>>>> semantically) valid use of these intrinsics.<br>

>>>>>>>>><br>

>>>>>>>>> I'd like to gather more context about the intrinsics.<br>

>>>>>>>>><br>

>>>>>>>>> 1. Is there a frontend or framework that introduces lifetime call<br>

>>>> with<br>

>>>>>>>>> non-stack allocated objects?<br>

>>>>>>>>> If exists, which behavior do they expect from it?<br>

>>>>>>>>><br>

>>>>>>>>> 2. Can a block's lifetime be started bytewise or elementwise?<br>

>>>>>>>>> I imagine an optimization that allow using stack very compactly,<br>

>> but<br>

>>>>>>>> wonder<br>

>>>>>>>>> there is a real-world use case.<br>

>>>>>>>>><br>

>>>>>>>>> Thanks,<br>

>>>>>>>>> Juneyoung<br>

>>>>>>>>><br>

>>>>>>>>><br>

>>>>>>>>> _______________________________________________<br>

>>>>>>>>> LLVM Developers mailing list<br>

>>>>>>>>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

>>>>>>>>> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

><br>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><br></div><font size="1">Juneyoung Lee</font><div><font size="1">Software Foundation Lab, Seoul National University</font></div></div></div>