[llvm-dev] lifetime.start/end

Mon Feb 8 10:37:50 PST 2021

Hi Johannes,

>>> I still believe folding can be done regardless of their lifetime, as long as 
>>> it is done consistent. Fold all pointer observations and the actual
>>> placement is not observable, we went down that road.
>>
>> Sure, but that's the kind of exception that always applies when discussing 
>> optimizations.
>> But to make my statement more precise: you cannot fold comparison of *escaped* 
>> pointers to false unless the pointers are *definitely not equal*, e.g. if 
>> their lifetimes overlap.
>>
> The interesting part is, lifetimes change. I continue to argue you cannot fold 
> if both pointers have escaped. If you do, and other things happen that fiddle 
> with the lifetime, you might accidentally coalesce them (explicitly or implicitly).

Lifetimes can only change if that does not add new observable behavior. For 
example, of two lifetimes overlap in the original program, then shrinking then 
is not allowed since that would make previously-definitely-inequal pointers now 
be potentially equal.

To give an example (using C syntax but I am thinking of LLVM semantics here):

int *x = (int*)malloc(4);
int *y = (int*)malloc(4);
bool eq = (x == y);
free(x);
free(y);
print(eq);

Here, moving the "free(x)" up to before the "int *y" would be incorrect since 
after that transformation, "eq" might be "true", which previously it never was.

>> My understanding is that lifetime markers are *explicitly intended to support 
>> coalescing* by permitting later phases to put allocas with disjoint lifetimes 
>> into the same stack slot. D93376 intended to keep it that way; fundamentally 
>> changing what lifetime markers are about by making them only about the content 
>> and not the address of the alloca seemed like a much more invasive change.
>> Am I misunderstanding something?
> 
> I do think they were designed for that, yes. And I do agree my change looks more 
> invasive from that perspective.
> 
> Now does D93376 help with the problems we have is my question. Or, asked 
> differently, what is a way out of this which improves the big picture, and what 
> is practical. For the first, limiting it to allocas seems to be the wrong way. 
> So far, the argument seems to be that "we only use it for them", but I'm not 
> sure why a use for other pointers would be somehow bad 
> (https://godbolt.org/z/G8obj3). Second, limiting it to syntactic constructs is 
> the opposite of practical. It has been said before, syntactic constraints are 
> brittle. Beyond that, the syntactic def-use chain requirement has not even come 
> up in any of this mails, at least as far as I understand it. It still is unclear 
> to me how that would help.

The restriction to alloca arises (I think) entirely from the fact that some 
syntactic restriction is believed to be needed if the lifetime markers are 
supposed to be able to affect whether allocations (that have their addresses 
observed) may overlap in memory. For malloc, a syntactic restriction clearly 
makes no sense, so if a syntactic restriction is needed then only allowing 
alloca makes that much more feasible.

The "proper" way to do this would be to decorate the *alloca* with annotations 
saying for which parts of the function the allocation has to be live, and to use 
that to define which alloca may observably overlap in memory and which may not. 
This would have a rather clear operational semantics. But sadly that's not how 
the lifetime markers have been designed, so the syntactic restriction is a way 
to "recover" that kind of a design: if each lifetime marker can be clearly 
associated to some alloca, then it's "as if" that marker was just an annotation 
at the alloca, and we can proceed as above.

(I do understand that these annotations would probably be not very practical, 
and one can view the markers as a convenient way for the frontend to convey 
those annotations to LLVM. But if that's what they are then they should be 
treated like that -- possibly by having some pre-processing pass that translates 
them to annotations and removes the markers, so later optimizations don't have 
to worry about this any more. Maybe this would be a possible fallback plan if 
your suggestion does not pan out.)

> That all said, let's see what we need:
> 1) A way to ensure we don't create inconsistent situations wrt. allocation 
> addresses where there were none.
> 2) A way to ensure we can reasonably coalesce allocas to minimize stack usage.

First and foremost we need a precise specification for what the operational 
behavior of any given LLVM IR program is -- and D93376 helps with that by 
proposing a clear operational behavior for the lifetime markers.

That was the original goal, anyway. (Or so I think -- I am not an author of that 
change proposal.) Since then the scope broadened once you pointed out that 
comparison folding is still tricky even after we have a precise semantics for 
allocation.

> For one, I feel we converged pretty much on the fact that pointer comparison 
> folding is a problem. In the example below we have lifetime markers and we use 
> them to argue about lifetimes of the allocations, but we still agree the folding 
> should not happen. Once the folding is restricted to pointer pairs for which at 
> least one doesn't have it's address observed, I would like to revisit 1). So 
> far, I believe it would be resolved. For 2), we could stick with what we have 
> now, or adapt something new afterwards. I guess the existing wording needs to be 
> cleared up either way.

What we have now (in terms of specification for the lifetime markers) is, as far 
as I am concerned, a rather vague sketch that should be turned into something 
more precise and operational. D93376 helps fix that, but it turns out (I guess 
unsurprisingly) that there is disagreement about what the more precise semantics 
even should be.

>> Note that we do not need to talk about coalescing here, we are just exploiting 
>> the fact that allocation is non-deterministic and that allocations with 
>> non-overlapping lifetime can be in the same spot or in different spots.
> 
> Sure. When I refer to coalescing I was also including "implicit coalescing", 
> e.g., as happened in my outlining example.

Oh I see. The question of "is this an allowed program transformation" and "is 
this an allowed execution" are closely related of course, but I still usually 
think of them very differently, in particular when there is lots of 
non-determinism as in our case.

After all, there's a difference between "this comparison might yield false at 
runtime" and "the compiler can replace this comparison by false" -- as we just 
discussed.

> I am basically struggling to understand why we need to introduce syntactic 
> restrictions on lifetime markers and why they need to be limited to allocas. 
> Those are two concepts included in D93376, right?

I gave my view on this above.

>> I agree that making the markers just erase the contents of the alloca is the 
>> simplest proposal on the table. *If* we truly can make LLVM follow that model, 
>> that would be very nice.
>> I would just be very surprised if it was possible to get this proposal past 
>> the people that will carefully watch over the existing optimization potential 
>> LLVM has, and make sure it does not become smaller. ;)  Maybe I am too 
>> pessimistic, but I would have never dared even making such a proposal as I 
>> would expect it to be shot down. I'll be happy to be proven wrong about this. :)
> 
> I've made a few actually, `mustprogress` was the latest and you were involved 
> ;). It does prevent certain optimizations we did before unconditionally but it 
> also fixes a conceptual problem.

That's fair! And thanks for that btw. :)

> At least the pointer inconsistency is a 
> conceptual problem that needs fixing. What we do with lifetime markers is 
> probably orthogonal.

I feel like there's also a conceptual problem with lifetime markers, because 
their spec is so vague that it becomes hard to say in general what their 
observable effect on program behavior is. Fundamentally, that's what I am most 
interested in fixing.
I am not sure to what extend I can help with the pointer inconsistency -- the 
spec is fairly clear there it seems, so if that's a transformation LLVM is 
currently doing, then it probably just has to stop doing that?

Kind regards,
Ralf

> 
> ~ Johannes
> 
> 
>>
>> Kind regards,
>> Ralf
>>
>>>
>>> ~ Johannes
>>>
>>>
>>>
>>>>
>>>> Kind regards,
>>>> Ralf
>>>>
>>>>>
>>>>> ~ Johannes
>>>>>
>>>>>
>>>>>>
>>>>>> Kind regards,
>>>>>> Ralf
>>>>>>
>>>>>>> A lot of my arguments around alloca/malloc/... is that we can augment the 
>>>>>>> information from the frontend by determining if an address is observed or 
>>>>>>> not. If it is not, we can limit its allocation, coalesce it, etc. 
>>>>>>> Lifetime markers can help us do this but they are not the only way.
>>>>>>> So, basically, we do not need lifetime markers to coalesce 
>>>>>>> alloca/malloc/... but that they can help.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Let's try to get out of this rabbit hole. If you have a problem with the 
>>>>>>> proposal as outlined in 
>>>>>>> https://lists.llvm.org/pipermail/llvm-dev/2021-January/147594.html , 
>>>>>>> please let me know. I assume dealing with observed addresses different 
>>>>>>> than with non-observede ones (implied by noescape) is the way forward now.
>>>>>>>
>>>>>>> ~ Johannes
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Ralf
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Kind regards,
>>>>>>>>>> Ralf
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Does that make some sense?
>>>>>>>>>>>
>>>>>>>>>>> ~ Johannes
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>> Ralf
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 1/7/21 10:12 PM, Juneyoung Lee wrote:
>>>>>>>>>>>>>> On Fri, Jan 8, 2021 at 8:54 AM Johannes Doerfert 
>>>>>>>>>>>>>> <johannesdoerfert at gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 1/7/21 4:55 PM, Juneyoung Lee wrote:
>>>>>>>>>>>>>>>> It is, alloca must have its integral address decided at its 
>>>>>>>>>>>>>>>> *allocation*
>>>>>>>>>>>>>>>> time as well, like 0xFF00... .
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For example, if 0xFFFF0000 ~ 0xFF000000 is already allocated, 
>>>>>>>>>>>>>>>> the new
>>>>>>>>>>>>>>>> alloca cannot be placed there unless it is not observed in the 
>>>>>>>>>>>>>>>> future,
>>>>>>>>>>>>>>>> according to your proposal.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But how do we know from the current state whether the stack 
>>>>>>>>>>>>>>>> variable is
>>>>>>>>>>>>>>>> going to be observed or not?
>>>>>>>>>>>>>>> With that argument you could allocate all but 4 bytes, do a 
>>>>>>>>>>>>>>> malloc(4)
>>>>>>>>>>>>>>> and know the address of the returned pointer (assuming it is not 
>>>>>>>>>>>>>>> null).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What I try to say is, either your scenario is part of the model and
>>>>>>>>>>>>>>> everything we do is broken as you could "observe" addresses 
>>>>>>>>>>>>>>> passively,
>>>>>>>>>>>>>>> *or*, the abstract machine we use for semantic reasoning doesn't 
>>>>>>>>>>>>>>> permit
>>>>>>>>>>>>>>> the above reasoning. I really hope for the latter.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's a very valid point..!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually, I have a memory model that addresses this problem.
>>>>>>>>>>>>>> The gist is that we can interpret each allocation instruction as 
>>>>>>>>>>>>>> *creating
>>>>>>>>>>>>>> 2 blocks* and nondeterministically returning one of them.
>>>>>>>>>>>>>> The other one is marked as invalid but not freed immediately. 
>>>>>>>>>>>>>> Deallocation
>>>>>>>>>>>>>> frees both blocks.
>>>>>>>>>>>>>> If there is not enough space for having two blocks, it is treated as
>>>>>>>>>>>>>> out-of-memory.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It makes guessing the address of an allocation according to memory 
>>>>>>>>>>>>>> layout
>>>>>>>>>>>>>> invalid UB.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> p = malloc(4)
>>>>>>>>>>>>>> // If p != null, it is guaranteed there was at least two 4 byte slots
>>>>>>>>>>>>>> available, say 0x100~0x104 and 0x110~0x114.
>>>>>>>>>>>>>> // Two blocks are allocated at 0x110 and 0x114, and one of the 
>>>>>>>>>>>>>> addresses is
>>>>>>>>>>>>>> nondeterministically returned.
>>>>>>>>>>>>>> // By the nature of nondeterminism, all executions below should be
>>>>>>>>>>>>>> well-defined regardless of the address of p.
>>>>>>>>>>>>>> *(int*)0x100 = 10; // In an execution where p is 0x110, this 
>>>>>>>>>>>>>> raises UB.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The paper link is here <https://dl.acm.org/doi/10.1145/3276495> FYI.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Nice! Thanks for the link :)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> To argue differently: Who is to say there is a stack, or only one,
>>>>>>>>>>>>>>> or that alloca allocates memory "on the one stack"? That is not 
>>>>>>>>>>>>>>> part of
>>>>>>>>>>>>>>> the IR, IMHO. I can write a machine on which alloca lowers to 
>>>>>>>>>>>>>>> malloc,
>>>>>>>>>>>>>>> I don't even need the free during stack unwind but that I could 
>>>>>>>>>>>>>>> do as
>>>>>>>>>>>>>>> well if I wanted to.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is right, the example was for illustrative purposes. IR does not
>>>>>>>>>>>>>> enforce alloca to be placed at 'stack'.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Making icmp/ptrtoint yield poison will still make loop 
>>>>>>>>>>>>>>>> versioning or
>>>>>>>>>>>>>>>> pointer rewriting transformations unsound because these 
>>>>>>>>>>>>>>>> operations now
>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>> create poison (even if pointers are noundef).
>>>>>>>>>>>>>>> I did not say they yield poison, at least I did not try to say that.
>>>>>>>>>>>>>>> What are you referring to exactly?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I was referring to this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But this makes ptrtoint/icmp make UB-raising instructions, which
>>>>>>>>>>>>>>>> contradicts with what LLVM does.
>>>>>>>>>>>>>>> As with other violation of attributes I would, on first though, 
>>>>>>>>>>>>>>> suggest
>>>>>>>>>>>>>>> to produce poison, not UB.
>>>>>>>>>>>>>> But it is more about the (imaginary) attribute, so maybe I was 
>>>>>>>>>>>>>> slightly out
>>>>>>>>>>>>>> of topic.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2) is fine, I think the suggestion semantically makes sense 
>>>>>>>>>>>>>>> perfectly. 1)
>>>>>>>>>>>>>>>> is something I'm concerned about now.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There are more than pointer foldings, such as rewriting such 
>>>>>>>>>>>>>>>> expression,
>>>>>>>>>>>>>>>> code motion ptr cmp, introduce ptr cmp, etc. There is also analysis
>>>>>>>>>>>>>>> relying
>>>>>>>>>>>>>>>> on ptr cmp.
>>>>>>>>>>>>>>>> Defining the correctness of each of them is something we want to 
>>>>>>>>>>>>>>>> avoid,
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> maybe that's why we want to define precise semantics for things.
>>>>>>>>>>>>>>> I don't get the point. My proposal does not change the semantics of
>>>>>>>>>>>>>>> pointer comparisons, at all. I explicitly mentioned that in the last
>>>>>>>>>>>>>>> email.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Oh okay, I thought it was a part of the lifetime proposal, but it 
>>>>>>>>>>>>>> seems
>>>>>>>>>>>>>> more like a separate thing.
>>>>>>>>>>>>>> I agree that this requires performance impact.
>>>>>>>>>>>>>> Also investigation of existing transformations would be needed; 
>>>>>>>>>>>>>> Alive2's
>>>>>>>>>>>>>> pointer comparison is doing approximation yet, but if it becomes 
>>>>>>>>>>>>>> fully
>>>>>>>>>>>>>> precise, it will show something from running LLVM unit tests I 
>>>>>>>>>>>>>> believe..! :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think this will be less aggressive and may give nice feedback to
>>>>>>>>>>>>>>>> potential projects that are using lifetime with non-alloca.
>>>>>>>>>>>>>>> The lifetime marker debate, basically 2) above, is orthogonal to the
>>>>>>>>>>>>>>> problem
>>>>>>>>>>>>>>> you try to solve. It got mixed in as lifetime markers were used by
>>>>>>>>>>>>>>> StackColoring
>>>>>>>>>>>>>>> to perform coalescing but that is coincidental. You can (arguably)
>>>>>>>>>>>>>>> coalesce stack
>>>>>>>>>>>>>>> allocations regardless of lifetime markers and with 1) such a
>>>>>>>>>>>>>>> transformation
>>>>>>>>>>>>>>> (w/ and w/o lifetime markers) would actually be sound.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> At the end, we can implement IR writer that lowers lifetime with
>>>>>>>>>>>>>>> non-alloca
>>>>>>>>>>>>>>>> into memset(undef). WDYT?
>>>>>>>>>>>>>>> Yeah, 2) is orthogonal and we can lower it that way. Unsure if it is
>>>>>>>>>>>>>>> helpful
>>>>>>>>>>>>>>> but we can certainly define it that way in the LangRef.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Okay, thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ~ Johannes
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Juneyoung
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> p.s. The reply was late, sorry. I think I can spend more time on 
>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>> today.
>>>>>>>>>>>>>>>> On Thu, Jan 7, 2021 at 9:02 AM Johannes Doerfert <
>>>>>>>>>>>>>>> johannesdoerfert at gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 1/6/21 4:33 PM, Juneyoung Lee wrote:
>>>>>>>>>>>>>>>>>>>> Stepwisely defining the semantics of instructions is a 
>>>>>>>>>>>>>>>>>>>> desirable
>>>>>>>>>>>>>>>>>> direction
>>>>>>>>>>>>>>>>>>>> IMO.
>>>>>>>>>>>>>>>>>>> I'm confused. What in the proposal would prevent us from 
>>>>>>>>>>>>>>>>>>> defining
>>>>>>>>>>>>>>>>>>> the semantics of instructions, or force us to do it in an 
>>>>>>>>>>>>>>>>>>> "undesirable
>>>>>>>>>>>>>>>>>> way"?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I meant it would be great if the output state after executing an
>>>>>>>>>>>>>>>>>> instruction can be described using its input state.
>>>>>>>>>>>>>>>>>> (that was the meaning of 'stepwise semantics', I should have 
>>>>>>>>>>>>>>>>>> been more
>>>>>>>>>>>>>>>>>> clear about this)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For example, the semantics of 'z = add x y' can be defined as 
>>>>>>>>>>>>>>>>>> follows:
>>>>>>>>>>>>>>>>>> Given an input state s, next state s' = s[z -> s(x) + s(y)]
>>>>>>>>>>>>>>>>>> where s(x) is the value of x in the previous state, and s[z -> 
>>>>>>>>>>>>>>>>>> v] is a
>>>>>>>>>>>>>>>>>> state with z updated to v.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Another example that involves memory: the semantics of 'i = 
>>>>>>>>>>>>>>>>>> ptrtoint p'
>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>> be defined as follows:
>>>>>>>>>>>>>>>>>> Given an input state s, next state s' = s[i -> s(p).obj.address +
>>>>>>>>>>>>>>>>>> s(p).offset]
>>>>>>>>>>>>>>>>>> where obj.address is the begin address of a memory object obj 
>>>>>>>>>>>>>>>>>> pointed
>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>> p
>>>>>>>>>>>>>>>>>> & offset is p's byte offset. (Imagine a pointer to the offset 
>>>>>>>>>>>>>>>>>> of some
>>>>>>>>>>>>>>>>> char
>>>>>>>>>>>>>>>>>> array).
>>>>>>>>>>>>>>>>>> Note that ptrtoint & add can be nicely defined w.r.t the input 
>>>>>>>>>>>>>>>>>> state.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Now, the instruction that we're facing is 'p = alloca'.
>>>>>>>>>>>>>>>>>> To describe the output state after executing 'p = alloca', the 
>>>>>>>>>>>>>>>>>> address
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>> new alloca should be determined.
>>>>>>>>>>>>>>>>>> If observedness is involved, we need to know the future state 
>>>>>>>>>>>>>>>>>> again. :/
>>>>>>>>>>>>>>>>> We
>>>>>>>>>>>>>>>>>> don't know whether the alloca is going to be observed or not 
>>>>>>>>>>>>>>>>>> without
>>>>>>>>>>>>>>>>> seeing
>>>>>>>>>>>>>>>>>> the future.
>>>>>>>>>>>>>>>>>> This is the problem of the current lifetime intrinsics as well.
>>>>>>>>>>>>>>>>> No, you mix things up here.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Nobody proposed to modify the semantics of `alloca`.
>>>>>>>>>>>>>>>>> `alloca` provides you with a fresh, unobserved block of
>>>>>>>>>>>>>>>>> dereferenceable memory that is implicitly freed as the stack
>>>>>>>>>>>>>>>>> unwinds. That is it. No context necessary.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If you want to modify the IR, you need to argue the observable
>>>>>>>>>>>>>>>>> semantics which is nothing new. That this might require more than
>>>>>>>>>>>>>>>>> a peephole view of the program is also not new.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> One possible approach to resolve this is adding an 
>>>>>>>>>>>>>>>>>> 'unobserved' flag to
>>>>>>>>>>>>>>>>>> alloca instruction (similar to what was suggested by Nicolai).
>>>>>>>>>>>>>>>>>> And, we can say that if alloca with 'unobserved' is used by
>>>>>>>>>>>>>>>>> ptrtoint/icmp,
>>>>>>>>>>>>>>>>>> it is UB.
>>>>>>>>>>>>>>>>> The flag can be added, like we add other attributes. It should not
>>>>>>>>>>>>>>>>> be required for any optimization we talked about though. It 
>>>>>>>>>>>>>>>>> basically
>>>>>>>>>>>>>>>>> is a way to manifest derived or given information into the IR.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Attribute deduction, as well as frontends with domain knowledge,
>>>>>>>>>>>>>>>>> can add such information. The flag we discussed in phab was not 
>>>>>>>>>>>>>>>>> even
>>>>>>>>>>>>>>>>> sufficient for all the transformation examples I presented in 
>>>>>>>>>>>>>>>>> my mail,
>>>>>>>>>>>>>>>>> that is why I extended my  argument. We could still have a 
>>>>>>>>>>>>>>>>> "noescape"
>>>>>>>>>>>>>>>>> flag for allocas, but I'm not sure how useful that really is. 
>>>>>>>>>>>>>>>>> We can
>>>>>>>>>>>>>>>>> certainly deduce it and manifest it, unsure if we have domain 
>>>>>>>>>>>>>>>>> knowledge
>>>>>>>>>>>>>>>>> we can use for non-trivial cases though.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> But this makes ptrtoint/icmp make UB-raising instructions, which
>>>>>>>>>>>>>>>>>> contradicts with what LLVM does.
>>>>>>>>>>>>>>>>> As with other violation of attributes I would, on first though, 
>>>>>>>>>>>>>>>>> suggest
>>>>>>>>>>>>>>>>> to produce poison, not UB.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Also, existing optimizations like loop versioning can introduce
>>>>>>>>>>>>>>>>>> ptrtoint/pointer comparisons too.
>>>>>>>>>>>>>>>>> Sure. I am not certain why that is a problem. I get the feeling 
>>>>>>>>>>>>>>>>> things
>>>>>>>>>>>>>>>>> are still mixed up here.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What I proposed is twofold:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1) We stop folding comparisons between different allocas if 
>>>>>>>>>>>>>>>>> changing the
>>>>>>>>>>>>>>>>> address
>>>>>>>>>>>>>>>>>       of both might be observable. Thus, if both might have 
>>>>>>>>>>>>>>>>> their address
>>>>>>>>>>>>>>>>> "taken"/escaped,
>>>>>>>>>>>>>>>>>       other than the comparisons we want to fold, we cannot 
>>>>>>>>>>>>>>>>> proceed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2) We define lifetime markers to mean `memset(undef)`.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The first should be sufficient for the problem you were trying 
>>>>>>>>>>>>>>>>> to solve
>>>>>>>>>>>>>>>>> in the
>>>>>>>>>>>>>>>>> first place. The second makes lifetime markers less weird. Note 
>>>>>>>>>>>>>>>>> that 1)
>>>>>>>>>>>>>>>>> is not changing
>>>>>>>>>>>>>>>>> the semantics of the IR. We basically just argue there is a bug 
>>>>>>>>>>>>>>>>> in our
>>>>>>>>>>>>>>>>> instcombine right
>>>>>>>>>>>>>>>>> now as we do not check all necessary preconditions.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I see that there are other questions that I didn't answer yet, 
>>>>>>>>>>>>>>>>>> but let
>>>>>>>>>>>>>>> me
>>>>>>>>>>>>>>>>>> answer this first to limit the length of the text :)
>>>>>>>>>>>>>>>>> Sure, we can split the discussion :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ~ Johannes
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Juneyoung
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jan 7, 2021 at 3:36 AM Johannes Doerfert <
>>>>>>>>>>>>>>>>> johannesdoerfert at gmail.com>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 1/5/21 8:00 PM, Juneyoung Lee wrote:
>>>>>>>>>>>>>>>>>>>> Hi Johannes,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I read your proposal and thought about the model.
>>>>>>>>>>>>>>>>>>> Cool, thanks!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> As you concerned in A3, certain programs may be valid only when
>>>>>>>>>>>>>>> memory
>>>>>>>>>>>>>>>>>>>> blocks with overlapping lifetimes have disjoint addresses.
>>>>>>>>>>>>>>>>>>>> Look at this example (I'm using malloc, but alloca also works):
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> p1 = malloc(4)
>>>>>>>>>>>>>>>>>>>> p2 = malloc(4) // for brevity, assume that there as enough 
>>>>>>>>>>>>>>>>>>>> space & p1
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> p2 != null
>>>>>>>>>>>>>>>>>>>> set<char*> s;
>>>>>>>>>>>>>>>>>>>> s.insert(p1); s.insert(p2); // If the second insert did 
>>>>>>>>>>>>>>>>>>>> nothing, it
>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>> be surprise to programmers
>>>>>>>>>>>>>>>>>>>> work(s);
>>>>>>>>>>>>>>>>>>>> free(data1)
>>>>>>>>>>>>>>>>>>>> free(data2)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Clearly, IR semantics should guarantee that escaped blocks are
>>>>>>>>>>>>>>>>> disjoint.
>>>>>>>>>>>>>>>>>>> It
>>>>>>>>>>>>>>>>>>>> would be great for verification tools on LLVM IR to be able 
>>>>>>>>>>>>>>>>>>>> to answer
>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>> the second insert will succeed.
>>>>>>>>>>>>>>>>>>> I agree, the second insert should succeed, assuming `p1 && p2`.
>>>>>>>>>>>>>>>>>>> I don't think my proposal would in any way impact the program 
>>>>>>>>>>>>>>>>>>> above,
>>>>>>>>>>>>>>>>>>> if anything the A3 reasoning makes sure such a program with 
>>>>>>>>>>>>>>>>>>> allocas
>>>>>>>>>>>>>>>>>>> is not miscompiled.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm also not sure I understand what you try to argue for. Maybe
>>>>>>>>>>>>>>>>>>> elaborate a bit what it is you think is bad or needs to be 
>>>>>>>>>>>>>>>>>>> changed.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The problem is that definition of escapedness is not clear 
>>>>>>>>>>>>>>>>>>>> at the
>>>>>>>>>>>>>>>>>>> semantic
>>>>>>>>>>>>>>>>>>>> level. Describing the IR semantics w.r.t. LLVM's escaped 
>>>>>>>>>>>>>>>>>>>> analysis
>>>>>>>>>>>>>>>>>>> function
>>>>>>>>>>>>>>>>>>>> isn't something we want.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Semantic definition of escapedness of a pointer seems hard, 
>>>>>>>>>>>>>>>>>>>> I mean
>>>>>>>>>>>>>>> in a
>>>>>>>>>>>>>>>>>>>> stepwise manner.
>>>>>>>>>>>>>>>>>>>> It isn't a single instruction such as 'escape i8* ptr', and 
>>>>>>>>>>>>>>>>>>>> we need
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> look
>>>>>>>>>>>>>>>>>>>> over all instructions in the function. For example, 
>>>>>>>>>>>>>>>>>>>> '(int)(p+1) -
>>>>>>>>>>>>>>>>> (int)p'
>>>>>>>>>>>>>>>>>>>> isn't semantically escaping the pointer p because the result 
>>>>>>>>>>>>>>>>>>>> is 1
>>>>>>>>>>>>>>>>>>>> regardless of the value of p.
>>>>>>>>>>>>>>>>>>>> Stepwisely defining the semantics of instructions is a 
>>>>>>>>>>>>>>>>>>>> desirable
>>>>>>>>>>>>>>>>>>> direction
>>>>>>>>>>>>>>>>>>>> IMO.
>>>>>>>>>>>>>>>>>>> I'm confused. What in the proposal would prevent us from 
>>>>>>>>>>>>>>>>>>> defining
>>>>>>>>>>>>>>>>>>> the semantics of instructions, or force us to do it in an 
>>>>>>>>>>>>>>>>>>> "undesirable
>>>>>>>>>>>>>>>>>>> way"?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In practice, syntactically checking escapedness + nice 
>>>>>>>>>>>>>>>>>>>> engineering
>>>>>>>>>>>>>>>>> might
>>>>>>>>>>>>>>>>>>>> not break optimizations in most cases (as undef/poison did); 
>>>>>>>>>>>>>>>>>>>> but it
>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>> be great to move to another level, since LLVM IR is used in 
>>>>>>>>>>>>>>>>>>>> so many
>>>>>>>>>>>>>>>>>>> places
>>>>>>>>>>>>>>>>>>>> :)
>>>>>>>>>>>>>>>>>>> The property under which you can coalesce objects is simple:
>>>>>>>>>>>>>>>>>>>       It is not observable.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Now, if the address of one of the two objects you coalesce is 
>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>> observed, coalescing is not observable. That is a sufficient 
>>>>>>>>>>>>>>>>>>> condition
>>>>>>>>>>>>>>>>>>> not a necessary one. Pointer "escaping" is one step further. 
>>>>>>>>>>>>>>>>>>> If the
>>>>>>>>>>>>>>>>>>> address doesn't escape it is not observed. This does not mean 
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> "semantic conditions for coalescing", e.g., for the purpose of
>>>>>>>>>>>>>>>>> translation
>>>>>>>>>>>>>>>>>>> validation, is supposed to be build on top of our "definition of
>>>>>>>>>>>>>>>>> escaping
>>>>>>>>>>>>>>>>>>> pointers". That said, we use "does not escape" as a 
>>>>>>>>>>>>>>>>>>> precondition for
>>>>>>>>>>>>>>>>>>> various transformation and I'm unsure what is any different 
>>>>>>>>>>>>>>>>>>> now. The
>>>>>>>>>>>>>>>>>>> entire escape argument is only used in the validity of the 
>>>>>>>>>>>>>>>>>>> pointer
>>>>>>>>>>>>>>>>> folding.
>>>>>>>>>>>>>>>>>>> Similarly, we can fold a comparison of a noalias pointer with 
>>>>>>>>>>>>>>>>>>> another
>>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>> if the former does not escape (and both are dereferenced and 
>>>>>>>>>>>>>>>>>>> one is
>>>>>>>>>>>>>>>>>>> written for sure).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The pointer comparison is another beast. It indeed has a few 
>>>>>>>>>>>>>>>>>>>> issues,
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> solving it might require nontrivial solution.
>>>>>>>>>>>>>>>>>>> I think the main problem of the inconsistencies and such 
>>>>>>>>>>>>>>>>>>> we've seen is
>>>>>>>>>>>>>>>>>>> rooted in the erroneous folding of pointer comparisons. 
>>>>>>>>>>>>>>>>>>> Cleaning up
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> lifetime marker semantics is actually unrelated and simply 
>>>>>>>>>>>>>>>>>>> not folding
>>>>>>>>>>>>>>>>>>> as described in A3 should solve the issue that has been 
>>>>>>>>>>>>>>>>>>> reported.
>>>>>>>>>>>>>>>>>>> Nevertheless,
>>>>>>>>>>>>>>>>>>> we should take a crack at lifetime markers while we are here.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ~ Johannes
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Juneyoung
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, Jan 5, 2021 at 9:37 AM Johannes Doerfert <
>>>>>>>>>>>>>>>>>>> johannesdoerfert at gmail.com>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Juneyoung,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Happy new year :)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> After we had a lengthy discussion on phab last year, I've 
>>>>>>>>>>>>>>>>>>>>> tried to
>>>>>>>>>>>>>>>>>>>>> summarize my thoughts,
>>>>>>>>>>>>>>>>>>>>> especially given that I had some time to think about things 
>>>>>>>>>>>>>>>>>>>>> over the
>>>>>>>>>>>>>>>>>>> break.
>>>>>>>>>>>>>>>>>>>>> Still, no promises on the quality ;)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I start with general questions I asked myself and then go on
>>>>>>>>>>>>>>> rambling
>>>>>>>>>>>>>>>>>>>>> about a potential design.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Q1: Is lifetime a given property or a derived one, thus is 
>>>>>>>>>>>>>>>>>>>>> it fixed
>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>> modifiable?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This is a question I asked myself a lot recently. I think 
>>>>>>>>>>>>>>>>>>>>> it is
>>>>>>>>>>>>>>>>> derived
>>>>>>>>>>>>>>>>>>>>> and modifiable,
>>>>>>>>>>>>>>>>>>>>> at least I hope it is. Only that would allow 
>>>>>>>>>>>>>>>>>>>>> transformations I would
>>>>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>>>> us to do. Here are some examples:
>>>>>>>>>>>>>>>>>>>>> https://godbolt.org/z/G8obj3
>>>>>>>>>>>>>>>>>>>>> https://godbolt.org/z/obaTc
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Q2: Is a pointer comparison, or similar use, extending the 
>>>>>>>>>>>>>>>>>>>>> lifetime?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Asked differently, can we hoist a pointer comparison into a 
>>>>>>>>>>>>>>>>>>>>> region
>>>>>>>>>>>>>>>>> where
>>>>>>>>>>>>>>>>>>>>> the pointer is dead?
>>>>>>>>>>>>>>>>>>>>> This is an important question which we haven't discussed 
>>>>>>>>>>>>>>>>>>>>> much as we
>>>>>>>>>>>>>>>>>>>>> assumed LICM has to work.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The current behavior is that non-dereferencing uses are not
>>>>>>>>>>>>>>> extending
>>>>>>>>>>>>>>>>>>>>> the lifetime and are
>>>>>>>>>>>>>>>>>>>>> allowed outside of "lifetime regions" (as indicated by 
>>>>>>>>>>>>>>>>>>>>> markers).
>>>>>>>>>>>>>>> They
>>>>>>>>>>>>>>>>>>>>> will always produce valid
>>>>>>>>>>>>>>>>>>>>> results. Though, we might want to think about a lifetime 
>>>>>>>>>>>>>>>>>>>>> marker that
>>>>>>>>>>>>>>>>>>>>> spits out a new pointer
>>>>>>>>>>>>>>>>>>>>> value instead of reusing the old one.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Q3: Can we use lifetime to argue about addresses?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The question here is, can we fold address comparisons based on
>>>>>>>>>>>>>>>>>>>>> lifetimes, or not.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> So far, we fold comparisons based on "address information". 
>>>>>>>>>>>>>>>>>>>>> For
>>>>>>>>>>>>>>>>> example,
>>>>>>>>>>>>>>>>>>>>> we "know" globals,
>>>>>>>>>>>>>>>>>>>>> allocas, and mallocs cannot be equal to one another. Also, two
>>>>>>>>>>>>>>>>> distinct
>>>>>>>>>>>>>>>>>>>>> allocations, for globals
>>>>>>>>>>>>>>>>>>>>> and allocas, are considered unequal. Now, the crux is that 
>>>>>>>>>>>>>>>>>>>>> we have
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>> consistent if we do two
>>>>>>>>>>>>>>>>>>>>> comparisons, and, as of now, we are not (bug number 
>>>>>>>>>>>>>>>>>>>>> missing). Since
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> backend (or any other place
>>>>>>>>>>>>>>>>>>>>> for that matter) might coalesce allocas, their addresses 
>>>>>>>>>>>>>>>>>>>>> might not
>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>> different after all. If we
>>>>>>>>>>>>>>>>>>>>> already folded a comparison to "unequal" we are doomed if 
>>>>>>>>>>>>>>>>>>>>> we later
>>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>>> a comparison that results
>>>>>>>>>>>>>>>>>>>>> in "equal". (Note, this is different from aliasing rules as 
>>>>>>>>>>>>>>>>>>>>> they can
>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>> stricter.)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Design:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I would hope we can come up with a model that treats memory 
>>>>>>>>>>>>>>>>>>>>> "the
>>>>>>>>>>>>>>>>> same",
>>>>>>>>>>>>>>>>>>>>> regardless if it is global,
>>>>>>>>>>>>>>>>>>>>> stack, or heap. I want to avoid special casing one of them 
>>>>>>>>>>>>>>>>>>>>> wrt.
>>>>>>>>>>>>>>>>> lifetime
>>>>>>>>>>>>>>>>>>>>> as I believe most optimizations
>>>>>>>>>>>>>>>>>>>>> would apply to any of them, potentially for different 
>>>>>>>>>>>>>>>>>>>>> reasons and
>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>> different gains, but nevertheless.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Proposal (largely based on the conversation in phab):
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> A1: Lifetime is a concept that talks about memory content 
>>>>>>>>>>>>>>>>>>>>> *only*.
>>>>>>>>>>>>>>>>>>>>> Basically, the memory content is set to
>>>>>>>>>>>>>>>>>>>>>          undefined by lifetime markers. It is 
>>>>>>>>>>>>>>>>>>>>> derived/modifiable as
>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>> just an "as-is" property of the memory
>>>>>>>>>>>>>>>>>>>>>          content. The lifetimes of an object, as described by
>>>>>>>>>>>>>>> markers,
>>>>>>>>>>>>>>>>>>> might
>>>>>>>>>>>>>>>>>>>>> change during the compilation. They
>>>>>>>>>>>>>>>>>>>>>          might become smaller if we deduce the object is 
>>>>>>>>>>>>>>>>>>>>> not accessed
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>> the memory content is not used, they
>>>>>>>>>>>>>>>>>>>>>          might become larger if objects with non-overlapping
>>>>>>>>>>>>>>> lifetimes
>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>>> coalesced. (One could see the latter as
>>>>>>>>>>>>>>>>>>>>>          introducing a new object though.)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> A2: If we define lifetime as above, it has nothing to do 
>>>>>>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>>>>>>> address of an object. Consequently, pointer
>>>>>>>>>>>>>>>>>>>>>          comparisons and similar operations are valid 
>>>>>>>>>>>>>>>>>>>>> outside the
>>>>>>>>>>>>>>>>> lifetime.
>>>>>>>>>>>>>>>>>>>>> Loads and stores are as well, they can
>>>>>>>>>>>>>>>>>>>>>          even not be removed "easily". A store followed by 
>>>>>>>>>>>>>>>>>>>>> a lifetime
>>>>>>>>>>>>>>>>>>> marker
>>>>>>>>>>>>>>>>>>>>> or a load following a lifetime marker
>>>>>>>>>>>>>>>>>>>>>          is dead or results in undef respectively.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> A3: We could not use lifetime to argue about addresses. 
>>>>>>>>>>>>>>>>>>>>> This means
>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>> could/should also not argue that overlapping
>>>>>>>>>>>>>>>>>>>>>          lifetimes result in different addresses. Thus, a 
>>>>>>>>>>>>>>>>>>>>> comparison
>>>>>>>>>>>>>>>>>>> between
>>>>>>>>>>>>>>>>>>>>> the address of two allocas could not
>>>>>>>>>>>>>>>>>>>>>          immediately be folded. That said, there would be 
>>>>>>>>>>>>>>>>>>>>> special
>>>>>>>>>>>>>>> cases
>>>>>>>>>>>>>>>>>>>>> though. Basically, if one of the allocas does
>>>>>>>>>>>>>>>>>>>>>          not escape, other than the comparisons to be 
>>>>>>>>>>>>>>>>>>>>> folded, we can
>>>>>>>>>>>>>>>>> fold
>>>>>>>>>>>>>>>>>>>>> them. Afterwards, coalescing or splitting
>>>>>>>>>>>>>>>>>>>>>          would still be consistent because it is 
>>>>>>>>>>>>>>>>>>>>> unobservable. The
>>>>>>>>>>>>>>>>> problem
>>>>>>>>>>>>>>>>>>>>> we have in-tree is that we fold even though
>>>>>>>>>>>>>>>>>>>>>          the address is still observed (after the fold). It 
>>>>>>>>>>>>>>>>>>>>> is still
>>>>>>>>>>>>>>>>>>> unclear
>>>>>>>>>>>>>>>>>>>>> to me what the impact of this would be
>>>>>>>>>>>>>>>>>>>>>          on real code. I suggested before that we run some
>>>>>>>>>>>>>>> experiments
>>>>>>>>>>>>>>>>>>> first
>>>>>>>>>>>>>>>>>>>>> before we make any decision whatsoever.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This is pretty much saying that lifetime markers are
>>>>>>>>>>>>>>> `memset(undef)`,
>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>> you suggested before (I think).
>>>>>>>>>>>>>>>>>>>>> There are some implementation level differences but at the 
>>>>>>>>>>>>>>>>>>>>> end of
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> day they are basically the same.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Happy to hear some thoughts on this, especially if it fixes 
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> problems
>>>>>>>>>>>>>>>>>>>>> that lead to D93376 in the first place.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> ~ Johannes
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 12/18/20 2:42 AM, Juneyoung Lee via llvm-dev wrote:
>>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> We're discussing the well-formedness of lifetime.start/end
>>>>>>>>>>>>>>> intrinsic
>>>>>>>>>>>>>>>>>>>>> here (
>>>>>>>>>>>>>>>>>>>>>> https://reviews.llvm.org/D93376), deciding what is a
>>>>>>>>>>>>>>> (syntactically
>>>>>>>>>>>>>>>>> &
>>>>>>>>>>>>>>>>>>>>>> semantically) valid use of these intrinsics.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I'd like to gather more context about the intrinsics.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 1. Is there a frontend or framework that introduces 
>>>>>>>>>>>>>>>>>>>>>> lifetime call
>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>>> non-stack allocated objects?
>>>>>>>>>>>>>>>>>>>>>> If exists, which behavior do they expect from it?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 2. Can a block's lifetime be started bytewise or elementwise?
>>>>>>>>>>>>>>>>>>>>>> I imagine an optimization that allow using stack very 
>>>>>>>>>>>>>>>>>>>>>> compactly,
>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>>> wonder
>>>>>>>>>>>>>>>>>>>>>> there is a real-world use case.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Juneyoung
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>>>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>

-- 
Website: https://people.mpi-sws.org/~jung/