[llvm-dev] lifetime.start/end

Johannes Doerfert via llvm-dev llvm-dev at lists.llvm.org
Mon Jan 11 07:05:24 PST 2021


On 1/11/21 6:23 AM, Ralf Jung wrote:
> Hi all,
>
> Thanks for including me in this thread!
>
> I agree that *in principle*, alloca is entirely equivalent to malloc 
> and "automatic free when the function returns". The stack is just a 
> very optimized allocator. However, if the compiler takes advantage of 
> this restricted structure of stack allocations in ways that would not 
> be legal for malloc, then we have to account for this by also making 
> alloca special in the semantics.
>
>> A3: We could not use lifetime to argue about addresses. This means we 
>> could/should also not argue that overlapping
>>      lifetimes result in different addresses. Thus, a comparison 
>> between the address of two allocas could not
>>      immediately be folded. That said, there would be special cases 
>> though. Basically, if one of the allocas does
>>      not escape, other than the comparisons to be folded, we can fold 
>> them. Afterwards, coalescing or splitting
>>      would still be consistent because it is unobservable. The 
>> problem we have in-tree is that we fold even though
>>      the address is still observed (after the fold). It is still 
>> unclear to me what the impact of this would be
>>      on real code. I suggested before that we run some experiments 
>> first before we make any decision whatsoever.
>
> I hope this question has not been answered yet, but I don't see how 
> that fold could be legal. I asked the same question on Phabricator but 
> it seems you prefer to have the discussion here. Taking your example 
> from there and adjusting it:
>
> p = malloc(1)
> q = malloc(1)
> %c = icmp eq %p, %q
> free(q)
> free(p)
>
> I think there is a guarantee that c will always be "false". Every 
> operational model of allocation that I have ever seen will guarantee 
> this, and the same for every program logic that I can imagine. So if 
> the compiler may fold this to "false", then as far as I can see, 
> pointer comparison becomes entirely unpredictable. The only consistent 
> model I can think of is "icmp on pointers may spuriously return 'true' 
> at any time", which I doubt anyone wants. ;)
>
> Am I misunderstanding what you are proposing here?

I'm confused. In your text I read: It should be "false" but *not* folded 
to "false" or else ...


FWIW, I would expect `%c = false` here. The tricky part is, do we see 
`%c = false` at compile time
or at runtime. If it's the former, we might run into the same coalescing 
issue we have for allocas:

p = malloc(1)
free(q)
q = malloc(1)
free(p)
%c = icmp eq %p, %q

What is `%c` now? It could go either way, depending on your allocator, 
agreed? If so, we need to ensure
there is not a second `%c2 = icmp eq %p, %q` somewhere if we want to 
fold `%c` as it needs to be consistent.

Does that make some sense?

~ Johannes


>
> Kind regards,
> Ralf
>
>>
>>
>> On 1/7/21 10:12 PM, Juneyoung Lee wrote:
>>> On Fri, Jan 8, 2021 at 8:54 AM Johannes Doerfert 
>>> <johannesdoerfert at gmail.com>
>>> wrote:
>>>
>>>> On 1/7/21 4:55 PM, Juneyoung Lee wrote:
>>>>> It is, alloca must have its integral address decided at its 
>>>>> *allocation*
>>>>> time as well, like 0xFF00... .
>>>>>
>>>>> For example, if 0xFFFF0000 ~ 0xFF000000 is already allocated, the new
>>>>> alloca cannot be placed there unless it is not observed in the 
>>>>> future,
>>>>> according to your proposal.
>>>>>
>>>>> But how do we know from the current state whether the stack 
>>>>> variable is
>>>>> going to be observed or not?
>>>> With that argument you could allocate all but 4 bytes, do a malloc(4)
>>>> and know the address of the returned pointer (assuming it is not 
>>>> null).
>>>>
>>>> What I try to say is, either your scenario is part of the model and
>>>> everything we do is broken as you could "observe" addresses passively,
>>>> *or*, the abstract machine we use for semantic reasoning doesn't 
>>>> permit
>>>> the above reasoning. I really hope for the latter.
>>>>
>>> That's a very valid point..!
>>>
>>> Actually, I have a memory model that addresses this problem.
>>> The gist is that we can interpret each allocation instruction as 
>>> *creating
>>> 2 blocks* and nondeterministically returning one of them.
>>> The other one is marked as invalid but not freed immediately. 
>>> Deallocation
>>> frees both blocks.
>>> If there is not enough space for having two blocks, it is treated as
>>> out-of-memory.
>>>
>>> It makes guessing the address of an allocation according to memory 
>>> layout
>>> invalid UB.
>>>
>>> p = malloc(4)
>>> // If p != null, it is guaranteed there was at least two 4 byte slots
>>> available, say 0x100~0x104 and 0x110~0x114.
>>> // Two blocks are allocated at 0x110 and 0x114, and one of the 
>>> addresses is
>>> nondeterministically returned.
>>> // By the nature of nondeterminism, all executions below should be
>>> well-defined regardless of the address of p.
>>> *(int*)0x100 = 10; // In an execution where p is 0x110, this raises UB.
>>>
>>> The paper link is here <https://dl.acm.org/doi/10.1145/3276495> FYI.
>>
>> Nice! Thanks for the link :)
>>
>>
>>> To argue differently: Who is to say there is a stack, or only one,
>>>> or that alloca allocates memory "on the one stack"? That is not 
>>>> part of
>>>> the IR, IMHO. I can write a machine on which alloca lowers to malloc,
>>>> I don't even need the free during stack unwind but that I could do as
>>>> well if I wanted to.
>>>
>>> This is right, the example was for illustrative purposes. IR does not
>>> enforce alloca to be placed at 'stack'.
>>>
>>>
>>>>> Making icmp/ptrtoint yield poison will still make loop versioning or
>>>>> pointer rewriting transformations unsound because these operations 
>>>>> now
>>>> can
>>>>> create poison (even if pointers are noundef).
>>>> I did not say they yield poison, at least I did not try to say that.
>>>> What are you referring to exactly?
>>>>
>>> I was referring to this:
>>>
>>>>> But this makes ptrtoint/icmp make UB-raising instructions, which
>>>>> contradicts with what LLVM does.
>>>> As with other violation of attributes I would, on first though, 
>>>> suggest
>>>> to produce poison, not UB.
>>> But it is more about the (imaginary) attribute, so maybe I was 
>>> slightly out
>>> of topic.
>>>
>>>> 2) is fine, I think the suggestion semantically makes sense 
>>>> perfectly. 1)
>>>>> is something I'm concerned about now.
>>>>>
>>>>> There are more than pointer foldings, such as rewriting such 
>>>>> expression,
>>>>> code motion ptr cmp, introduce ptr cmp, etc. There is also analysis
>>>> relying
>>>>> on ptr cmp.
>>>>> Defining the correctness of each of them is something we want to 
>>>>> avoid,
>>>> and
>>>>> maybe that's why we want to define precise semantics for things.
>>>> I don't get the point. My proposal does not change the semantics of
>>>> pointer comparisons, at all. I explicitly mentioned that in the last
>>>> email.
>>>>
>>> Oh okay, I thought it was a part of the lifetime proposal, but it seems
>>> more like a separate thing.
>>> I agree that this requires performance impact.
>>> Also investigation of existing transformations would be needed; 
>>> Alive2's
>>> pointer comparison is doing approximation yet, but if it becomes fully
>>> precise, it will show something from running LLVM unit tests I 
>>> believe..! :)
>>>
>>>> I think this will be less aggressive and may give nice feedback to
>>>>> potential projects that are using lifetime with non-alloca.
>>>> The lifetime marker debate, basically 2) above, is orthogonal to the
>>>> problem
>>>> you try to solve. It got mixed in as lifetime markers were used by
>>>> StackColoring
>>>> to perform coalescing but that is coincidental. You can (arguably)
>>>> coalesce stack
>>>> allocations regardless of lifetime markers and with 1) such a
>>>> transformation
>>>> (w/ and w/o lifetime markers) would actually be sound.
>>>>
>>>>
>>>>> At the end, we can implement IR writer that lowers lifetime with
>>>> non-alloca
>>>>> into memset(undef). WDYT?
>>>> Yeah, 2) is orthogonal and we can lower it that way. Unsure if it is
>>>> helpful
>>>> but we can certainly define it that way in the LangRef.
>>>>
>>> Okay, thanks!
>>>
>>>
>>>> ~ Johannes
>>>>
>>>>
>>>>
>>>>> Thanks,
>>>>> Juneyoung
>>>>>
>>>>> p.s. The reply was late, sorry. I think I can spend more time on this
>>>> today.
>>>>> On Thu, Jan 7, 2021 at 9:02 AM Johannes Doerfert <
>>>> johannesdoerfert at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On 1/6/21 4:33 PM, Juneyoung Lee wrote:
>>>>>>>>> Stepwisely defining the semantics of instructions is a desirable
>>>>>>> direction
>>>>>>>>> IMO.
>>>>>>>> I'm confused. What in the proposal would prevent us from defining
>>>>>>>> the semantics of instructions, or force us to do it in an 
>>>>>>>> "undesirable
>>>>>>> way"?
>>>>>>>
>>>>>>> I meant it would be great if the output state after executing an
>>>>>>> instruction can be described using its input state.
>>>>>>> (that was the meaning of 'stepwise semantics', I should have 
>>>>>>> been more
>>>>>>> clear about this)
>>>>>>>
>>>>>>> For example, the semantics of 'z = add x y' can be defined as 
>>>>>>> follows:
>>>>>>> Given an input state s, next state s' = s[z -> s(x) + s(y)]
>>>>>>> where s(x) is the value of x in the previous state, and s[z -> 
>>>>>>> v] is a
>>>>>>> state with z updated to v.
>>>>>>>
>>>>>>> Another example that involves memory: the semantics of 'i = 
>>>>>>> ptrtoint p'
>>>>>> can
>>>>>>> be defined as follows:
>>>>>>> Given an input state s, next state s' = s[i -> s(p).obj.address +
>>>>>>> s(p).offset]
>>>>>>> where obj.address is the begin address of a memory object obj 
>>>>>>> pointed
>>>> by
>>>>>> p
>>>>>>> & offset is p's byte offset. (Imagine a pointer to the offset of 
>>>>>>> some
>>>>>> char
>>>>>>> array).
>>>>>>> Note that ptrtoint & add can be nicely defined w.r.t the input 
>>>>>>> state.
>>>>>>>
>>>>>>>
>>>>>>> Now, the instruction that we're facing is 'p = alloca'.
>>>>>>> To describe the output state after executing 'p = alloca', the 
>>>>>>> address
>>>> of
>>>>>>> new alloca should be determined.
>>>>>>> If observedness is involved, we need to know the future state 
>>>>>>> again. :/
>>>>>> We
>>>>>>> don't know whether the alloca is going to be observed or not 
>>>>>>> without
>>>>>> seeing
>>>>>>> the future.
>>>>>>> This is the problem of the current lifetime intrinsics as well.
>>>>>> No, you mix things up here.
>>>>>>
>>>>>> Nobody proposed to modify the semantics of `alloca`.
>>>>>> `alloca` provides you with a fresh, unobserved block of
>>>>>> dereferenceable memory that is implicitly freed as the stack
>>>>>> unwinds. That is it. No context necessary.
>>>>>>
>>>>>> If you want to modify the IR, you need to argue the observable
>>>>>> semantics which is nothing new. That this might require more than
>>>>>> a peephole view of the program is also not new.
>>>>>>
>>>>>>
>>>>>>> One possible approach to resolve this is adding an 'unobserved' 
>>>>>>> flag to
>>>>>>> alloca instruction (similar to what was suggested by Nicolai).
>>>>>>> And, we can say that if alloca with 'unobserved' is used by
>>>>>> ptrtoint/icmp,
>>>>>>> it is UB.
>>>>>> The flag can be added, like we add other attributes. It should not
>>>>>> be required for any optimization we talked about though. It 
>>>>>> basically
>>>>>> is a way to manifest derived or given information into the IR.
>>>>>>
>>>>>> Attribute deduction, as well as frontends with domain knowledge,
>>>>>> can add such information. The flag we discussed in phab was not even
>>>>>> sufficient for all the transformation examples I presented in my 
>>>>>> mail,
>>>>>> that is why I extended my  argument. We could still have a 
>>>>>> "noescape"
>>>>>> flag for allocas, but I'm not sure how useful that really is. We can
>>>>>> certainly deduce it and manifest it, unsure if we have domain 
>>>>>> knowledge
>>>>>> we can use for non-trivial cases though.
>>>>>>
>>>>>>
>>>>>>> But this makes ptrtoint/icmp make UB-raising instructions, which
>>>>>>> contradicts with what LLVM does.
>>>>>> As with other violation of attributes I would, on first though, 
>>>>>> suggest
>>>>>> to produce poison, not UB.
>>>>>>
>>>>>>
>>>>>>> Also, existing optimizations like loop versioning can introduce
>>>>>>> ptrtoint/pointer comparisons too.
>>>>>> Sure. I am not certain why that is a problem. I get the feeling 
>>>>>> things
>>>>>> are still mixed up here.
>>>>>>
>>>>>> What I proposed is twofold:
>>>>>>
>>>>>> 1) We stop folding comparisons between different allocas if 
>>>>>> changing the
>>>>>> address
>>>>>>       of both might be observable. Thus, if both might have their 
>>>>>> address
>>>>>> "taken"/escaped,
>>>>>>       other than the comparisons we want to fold, we cannot proceed.
>>>>>>
>>>>>> 2) We define lifetime markers to mean `memset(undef)`.
>>>>>>
>>>>>> The first should be sufficient for the problem you were trying to 
>>>>>> solve
>>>>>> in the
>>>>>> first place. The second makes lifetime markers less weird. Note 
>>>>>> that 1)
>>>>>> is not changing
>>>>>> the semantics of the IR. We basically just argue there is a bug 
>>>>>> in our
>>>>>> instcombine right
>>>>>> now as we do not check all necessary preconditions.
>>>>>>
>>>>>>
>>>>>>> I see that there are other questions that I didn't answer yet, 
>>>>>>> but let
>>>> me
>>>>>>> answer this first to limit the length of the text :)
>>>>>> Sure, we can split the discussion :)
>>>>>>
>>>>>> ~ Johannes
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> Juneyoung
>>>>>>>
>>>>>>> On Thu, Jan 7, 2021 at 3:36 AM Johannes Doerfert <
>>>>>> johannesdoerfert at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On 1/5/21 8:00 PM, Juneyoung Lee wrote:
>>>>>>>>> Hi Johannes,
>>>>>>>>>
>>>>>>>>> I read your proposal and thought about the model.
>>>>>>>> Cool, thanks!
>>>>>>>>
>>>>>>>>
>>>>>>>>> As you concerned in A3, certain programs may be valid only when
>>>> memory
>>>>>>>>> blocks with overlapping lifetimes have disjoint addresses.
>>>>>>>>> Look at this example (I'm using malloc, but alloca also works):
>>>>>>>>>
>>>>>>>>> p1 = malloc(4)
>>>>>>>>> p2 = malloc(4) // for brevity, assume that there as enough 
>>>>>>>>> space & p1
>>>>>> and
>>>>>>>>> p2 != null
>>>>>>>>> set<char*> s;
>>>>>>>>> s.insert(p1); s.insert(p2); // If the second insert did 
>>>>>>>>> nothing, it
>>>>>> would
>>>>>>>>> be surprise to programmers
>>>>>>>>> work(s);
>>>>>>>>> free(data1)
>>>>>>>>> free(data2)
>>>>>>>>>
>>>>>>>>> Clearly, IR semantics should guarantee that escaped blocks are
>>>>>> disjoint.
>>>>>>>> It
>>>>>>>>> would be great for verification tools on LLVM IR to be able to 
>>>>>>>>> answer
>>>>>>>> that
>>>>>>>>> the second insert will succeed.
>>>>>>>> I agree, the second insert should succeed, assuming `p1 && p2`.
>>>>>>>> I don't think my proposal would in any way impact the program 
>>>>>>>> above,
>>>>>>>> if anything the A3 reasoning makes sure such a program with 
>>>>>>>> allocas
>>>>>>>> is not miscompiled.
>>>>>>>>
>>>>>>>> I'm also not sure I understand what you try to argue for. Maybe
>>>>>>>> elaborate a bit what it is you think is bad or needs to be 
>>>>>>>> changed.
>>>>>>>>
>>>>>>>>
>>>>>>>>> The problem is that definition of escapedness is not clear at the
>>>>>>>> semantic
>>>>>>>>> level. Describing the IR semantics w.r.t. LLVM's escaped analysis
>>>>>>>> function
>>>>>>>>> isn't something we want.
>>>>>>>>>
>>>>>>>>> Semantic definition of escapedness of a pointer seems hard, I 
>>>>>>>>> mean
>>>> in a
>>>>>>>>> stepwise manner.
>>>>>>>>> It isn't a single instruction such as 'escape i8* ptr', and we 
>>>>>>>>> need
>>>> to
>>>>>>>> look
>>>>>>>>> over all instructions in the function. For example, '(int)(p+1) -
>>>>>> (int)p'
>>>>>>>>> isn't semantically escaping the pointer p because the result is 1
>>>>>>>>> regardless of the value of p.
>>>>>>>>> Stepwisely defining the semantics of instructions is a desirable
>>>>>>>> direction
>>>>>>>>> IMO.
>>>>>>>> I'm confused. What in the proposal would prevent us from defining
>>>>>>>> the semantics of instructions, or force us to do it in an 
>>>>>>>> "undesirable
>>>>>>>> way"?
>>>>>>>>
>>>>>>>>
>>>>>>>>> In practice, syntactically checking escapedness + nice 
>>>>>>>>> engineering
>>>>>> might
>>>>>>>>> not break optimizations in most cases (as undef/poison did); 
>>>>>>>>> but it
>>>>>> would
>>>>>>>>> be great to move to another level, since LLVM IR is used in so 
>>>>>>>>> many
>>>>>>>> places
>>>>>>>>> :)
>>>>>>>> The property under which you can coalesce objects is simple:
>>>>>>>>       It is not observable.
>>>>>>>>
>>>>>>>> Now, if the address of one of the two objects you coalesce is not
>>>>>>>> observed, coalescing is not observable. That is a sufficient 
>>>>>>>> condition
>>>>>>>> not a necessary one. Pointer "escaping" is one step further. If 
>>>>>>>> the
>>>>>>>> address doesn't escape it is not observed. This does not mean the
>>>>>>>> "semantic conditions for coalescing", e.g., for the purpose of
>>>>>> translation
>>>>>>>> validation, is supposed to be build on top of our "definition of
>>>>>> escaping
>>>>>>>> pointers". That said, we use "does not escape" as a 
>>>>>>>> precondition for
>>>>>>>> various transformation and I'm unsure what is any different 
>>>>>>>> now. The
>>>>>>>> entire escape argument is only used in the validity of the pointer
>>>>>> folding.
>>>>>>>> Similarly, we can fold a comparison of a noalias pointer with 
>>>>>>>> another
>>>>>> one
>>>>>>>> if the former does not escape (and both are dereferenced and 
>>>>>>>> one is
>>>>>>>> written for sure).
>>>>>>>>
>>>>>>>>
>>>>>>>>> The pointer comparison is another beast. It indeed has a few 
>>>>>>>>> issues,
>>>>>> and
>>>>>>>>> solving it might require nontrivial solution.
>>>>>>>> I think the main problem of the inconsistencies and such we've 
>>>>>>>> seen is
>>>>>>>> rooted in the erroneous folding of pointer comparisons. 
>>>>>>>> Cleaning up
>>>> the
>>>>>>>> lifetime marker semantics is actually unrelated and simply not 
>>>>>>>> folding
>>>>>>>> as described in A3 should solve the issue that has been reported.
>>>>>>>> Nevertheless,
>>>>>>>> we should take a crack at lifetime markers while we are here.
>>>>>>>>
>>>>>>>>
>>>>>>>> ~ Johannes
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Juneyoung
>>>>>>>>>
>>>>>>>>> On Tue, Jan 5, 2021 at 9:37 AM Johannes Doerfert <
>>>>>>>> johannesdoerfert at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Juneyoung,
>>>>>>>>>>
>>>>>>>>>> Happy new year :)
>>>>>>>>>>
>>>>>>>>>> After we had a lengthy discussion on phab last year, I've 
>>>>>>>>>> tried to
>>>>>>>>>> summarize my thoughts,
>>>>>>>>>> especially given that I had some time to think about things 
>>>>>>>>>> over the
>>>>>>>> break.
>>>>>>>>>> Still, no promises on the quality ;)
>>>>>>>>>>
>>>>>>>>>> I start with general questions I asked myself and then go on
>>>> rambling
>>>>>>>>>> about a potential design.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Q1: Is lifetime a given property or a derived one, thus is it 
>>>>>>>>>> fixed
>>>> or
>>>>>>>>>> modifiable?
>>>>>>>>>>
>>>>>>>>>> This is a question I asked myself a lot recently. I think it is
>>>>>> derived
>>>>>>>>>> and modifiable,
>>>>>>>>>> at least I hope it is. Only that would allow transformations 
>>>>>>>>>> I would
>>>>>>>> want
>>>>>>>>>> us to do. Here are some examples:
>>>>>>>>>>        https://godbolt.org/z/G8obj3
>>>>>>>>>>        https://godbolt.org/z/obaTc
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Q2: Is a pointer comparison, or similar use, extending the 
>>>>>>>>>> lifetime?
>>>>>>>>>>
>>>>>>>>>> Asked differently, can we hoist a pointer comparison into a 
>>>>>>>>>> region
>>>>>> where
>>>>>>>>>> the pointer is dead?
>>>>>>>>>> This is an important question which we haven't discussed much 
>>>>>>>>>> as we
>>>>>>>>>> assumed LICM has to work.
>>>>>>>>>>
>>>>>>>>>> The current behavior is that non-dereferencing uses are not
>>>> extending
>>>>>>>>>> the lifetime and are
>>>>>>>>>> allowed outside of "lifetime regions" (as indicated by markers).
>>>> They
>>>>>>>>>> will always produce valid
>>>>>>>>>> results. Though, we might want to think about a lifetime 
>>>>>>>>>> marker that
>>>>>>>>>> spits out a new pointer
>>>>>>>>>> value instead of reusing the old one.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Q3: Can we use lifetime to argue about addresses?
>>>>>>>>>>
>>>>>>>>>> The question here is, can we fold address comparisons based on
>>>>>>>>>> lifetimes, or not.
>>>>>>>>>>
>>>>>>>>>> So far, we fold comparisons based on "address information". For
>>>>>> example,
>>>>>>>>>> we "know" globals,
>>>>>>>>>> allocas, and mallocs cannot be equal to one another. Also, two
>>>>>> distinct
>>>>>>>>>> allocations, for globals
>>>>>>>>>> and allocas, are considered unequal. Now, the crux is that we 
>>>>>>>>>> have
>>>> to
>>>>>> be
>>>>>>>>>> consistent if we do two
>>>>>>>>>> comparisons, and, as of now, we are not (bug number missing). 
>>>>>>>>>> Since
>>>>>> the
>>>>>>>>>> backend (or any other place
>>>>>>>>>> for that matter) might coalesce allocas, their addresses 
>>>>>>>>>> might not
>>>> be
>>>>>>>>>> different after all. If we
>>>>>>>>>> already folded a comparison to "unequal" we are doomed if we 
>>>>>>>>>> later
>>>>>> have
>>>>>>>>>> a comparison that results
>>>>>>>>>> in "equal". (Note, this is different from aliasing rules as 
>>>>>>>>>> they can
>>>>>> be
>>>>>>>>>> stricter.)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Design:
>>>>>>>>>>
>>>>>>>>>> I would hope we can come up with a model that treats memory "the
>>>>>> same",
>>>>>>>>>> regardless if it is global,
>>>>>>>>>> stack, or heap. I want to avoid special casing one of them wrt.
>>>>>> lifetime
>>>>>>>>>> as I believe most optimizations
>>>>>>>>>> would apply to any of them, potentially for different reasons 
>>>>>>>>>> and
>>>> with
>>>>>>>>>> different gains, but nevertheless.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Proposal (largely based on the conversation in phab):
>>>>>>>>>>
>>>>>>>>>> A1: Lifetime is a concept that talks about memory content 
>>>>>>>>>> *only*.
>>>>>>>>>> Basically, the memory content is set to
>>>>>>>>>>          undefined by lifetime markers. It is 
>>>>>>>>>> derived/modifiable as
>>>> it
>>>>>> is
>>>>>>>>>> just an "as-is" property of the memory
>>>>>>>>>>          content. The lifetimes of an object, as described by
>>>> markers,
>>>>>>>> might
>>>>>>>>>> change during the compilation. They
>>>>>>>>>>          might become smaller if we deduce the object is not 
>>>>>>>>>> accessed
>>>>>> and
>>>>>>>>>> the memory content is not used, they
>>>>>>>>>>          might become larger if objects with non-overlapping
>>>> lifetimes
>>>>>> are
>>>>>>>>>> coalesced. (One could see the latter as
>>>>>>>>>>          introducing a new object though.)
>>>>>>>>>>
>>>>>>>>>> A2: If we define lifetime as above, it has nothing to do with 
>>>>>>>>>> the
>>>>>>>>>> address of an object. Consequently, pointer
>>>>>>>>>>          comparisons and similar operations are valid outside 
>>>>>>>>>> the
>>>>>> lifetime.
>>>>>>>>>> Loads and stores are as well, they can
>>>>>>>>>>          even not be removed "easily". A store followed by a 
>>>>>>>>>> lifetime
>>>>>>>> marker
>>>>>>>>>> or a load following a lifetime marker
>>>>>>>>>>          is dead or results in undef respectively.
>>>>>>>>>>
>>>>>>>>>> A3: We could not use lifetime to argue about addresses. This 
>>>>>>>>>> means
>>>> we
>>>>>>>>>> could/should also not argue that overlapping
>>>>>>>>>>          lifetimes result in different addresses. Thus, a 
>>>>>>>>>> comparison
>>>>>>>> between
>>>>>>>>>> the address of two allocas could not
>>>>>>>>>>          immediately be folded. That said, there would be 
>>>>>>>>>> special
>>>> cases
>>>>>>>>>> though. Basically, if one of the allocas does
>>>>>>>>>>          not escape, other than the comparisons to be folded, 
>>>>>>>>>> we can
>>>>>> fold
>>>>>>>>>> them. Afterwards, coalescing or splitting
>>>>>>>>>>          would still be consistent because it is 
>>>>>>>>>> unobservable. The
>>>>>> problem
>>>>>>>>>> we have in-tree is that we fold even though
>>>>>>>>>>          the address is still observed (after the fold). It 
>>>>>>>>>> is still
>>>>>>>> unclear
>>>>>>>>>> to me what the impact of this would be
>>>>>>>>>>          on real code. I suggested before that we run some
>>>> experiments
>>>>>>>> first
>>>>>>>>>> before we make any decision whatsoever.
>>>>>>>>>>
>>>>>>>>>> This is pretty much saying that lifetime markers are
>>>> `memset(undef)`,
>>>>>> as
>>>>>>>>>> you suggested before (I think).
>>>>>>>>>> There are some implementation level differences but at the 
>>>>>>>>>> end of
>>>> the
>>>>>>>>>> day they are basically the same.
>>>>>>>>>>
>>>>>>>>>> Happy to hear some thoughts on this, especially if it fixes the
>>>>>> problems
>>>>>>>>>> that lead to D93376 in the first place.
>>>>>>>>>>
>>>>>>>>>> ~ Johannes
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/18/20 2:42 AM, Juneyoung Lee via llvm-dev wrote:
>>>>>>>>>>> Hello all,
>>>>>>>>>>>
>>>>>>>>>>> We're discussing the well-formedness of lifetime.start/end
>>>> intrinsic
>>>>>>>>>> here (
>>>>>>>>>>> https://reviews.llvm.org/D93376), deciding what is a
>>>> (syntactically
>>>>>> &
>>>>>>>>>>> semantically) valid use of these intrinsics.
>>>>>>>>>>>
>>>>>>>>>>> I'd like to gather more context about the intrinsics.
>>>>>>>>>>>
>>>>>>>>>>> 1. Is there a frontend or framework that introduces lifetime 
>>>>>>>>>>> call
>>>>>> with
>>>>>>>>>>> non-stack allocated objects?
>>>>>>>>>>> If exists, which behavior do they expect from it?
>>>>>>>>>>>
>>>>>>>>>>> 2. Can a block's lifetime be started bytewise or elementwise?
>>>>>>>>>>> I imagine an optimization that allow using stack very 
>>>>>>>>>>> compactly,
>>>> but
>>>>>>>>>> wonder
>>>>>>>>>>> there is a real-world use case.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Juneyoung
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>


More information about the llvm-dev mailing list