[PATCH] D93376: [LangRef] Clarify the semantics of lifetime intrinsics

Fri Dec 18 07:54:30 PST 2020

jdoerfert added a comment.

In D93376#2462400 <https://reviews.llvm.org/D93376#2462400>, @aqjune wrote:

> In D93376#2462245 <https://reviews.llvm.org/D93376#2462245>, @jdoerfert wrote:
>
>> [...]
>>
>>> We're leaving behind a construct that is really hard to use. From the frontend's perspective, simply placing malloc()/free(), memset(undef), or their own marker where lifetime.start/lifetime.end are supposed to go would have been a better choice.
>>
>> I'm not so sure about that. memset(undef) is, for example, not commonly recognized, while `hasOnlyLifetimeUsers` (or similar) is a common check. Moving malloc/free may come with a cost. Placing the malloc/free in a loop is a totally different story than placing lifetime markers in there. Imagine you have a heap allocation in a loop and the FE (or something else) places the malloc earlier instead and uses lifetime markers to assure the middle end that there are no loop-carried dependences via this memory region. (Also, we should implement this transformation; it could not only save time with respect to allocations but also allow vectorization, especially given the lifetime markers.)
>
> Because in the case of stack-allocated objects, lifetime.start/end determines the object's location as well. Two allocas with disjoint lifetimes should be able to overlap.

Sure. You could do the same transformation with heap objects under the same constraints. So you go from

  for (/* a long time */) {
    x = malloc(large);
    use(x);
    free(x);

    y = malloc(large);
    use(y);
    free(y);
  }

to

  x = malloc(large);
  y = malloc(large);
  for (/* a long time */) {
    lifetime.start(x);
    use(x);
    lifetime.end(x);

    lifetime.start(y);
    use(y);
    lifetime.end(y);
  }
  free(x);
  free(y);

and finally to

  x = malloc(large);
  for (/* a long time */) {
    lifetime.start(x);
    use(x);
    lifetime.end(x);

    lifetime.start(x);
    use(x);
    lifetime.end(x);
  }
  free(x);

same as you would for stack allocations.
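
As a rough C sketch of the first and last versions above: plain C has no lifetime intrinsics, so `LIFETIME_START`/`LIFETIME_END` are hypothetical no-op macros that only mark where the middle end would see the markers. `LARGE`, `use`, `xmalloc`, and the function names are likewise made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for llvm.lifetime.start/end. They are no-ops here
   and only document where the markers would sit. */
#define LIFETIME_START(p) ((void)(p))
#define LIFETIME_END(p)   ((void)(p))

enum { LARGE = 1 << 16 };

/* Illustrative helper: malloc that aborts instead of returning NULL. */
static unsigned char *xmalloc(size_t n) {
    unsigned char *p = malloc(n);
    if (p == NULL)
        abort();
    return p;
}

/* Placeholder for use(x): overwrite the whole buffer, then read one byte. */
static unsigned use(unsigned char *buf, unsigned seed) {
    memset(buf, (int)(seed & 0xFF), LARGE);
    return buf[LARGE - 1];
}

/* First version: two malloc/free pairs in every iteration. */
unsigned sum_per_iteration(unsigned iters) {
    unsigned sum = 0;
    for (unsigned i = 0; i < iters; ++i) {
        unsigned char *x = xmalloc(LARGE);
        sum += use(x, i);
        free(x);

        unsigned char *y = xmalloc(LARGE);
        sum += use(y, i + 1);
        free(y);
    }
    return sum;
}

/* Final version: one hoisted allocation, reused for both disjoint ranges. */
unsigned sum_hoisted(unsigned iters) {
    unsigned sum = 0;
    unsigned char *x = xmalloc(LARGE);
    for (unsigned i = 0; i < iters; ++i) {
        LIFETIME_START(x);
        sum += use(x, i);
        LIFETIME_END(x);

        LIFETIME_START(x);
        sum += use(x, i + 1);
        LIFETIME_END(x);
    }
    free(x);
    return sum;
}

/* Both versions compute the same result because nothing written in one
   range is read in the next one. */
void check_same_result(unsigned iters) {
    assert(sum_per_iteration(iters) == sum_hoisted(iters));
}
```

The reuse is only valid because `use` never reads data left over from an earlier range, which is the property the lifetime markers are meant to communicate.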

> For a heap allocation, malloc()/free() should determine the address, and I wanted to say that it isn't clear whether frontend writers will expect lifetime intrinsics to determine the disjointness.

The above transformation could be done in the middle end just fine. Even the first step is beneficial and the lifetime markers make it possible to perform the second part.

> The optimization makes sense, but won't simply putting memset(undef) at the end of the loop do the trick? The backend pipeline (e.g. CodeGenPrepare) can remove the redundant memset.

Yes and no. If you replaced both markers with memset(undef) you might get there, but there are reasons not to do it. First, you'd need to teach places about this so the dead store is not removed. Second, you'd need to teach places about the memset(undef) pattern (as we did/do for lifetime markers). Third, it is arguably easier to read/understand `lifetime.start(x)` than `memset(x, undef)`. Finally, a memset(undef) looks like a write (and is one), but that is not strictly what a lifetime marker is. If you happen to have an empty lifetime range followed by a non-empty lifetime range, the memset version has WAW dependences which you need to argue away, while in the lifetime version they don't exist. This is probably part of the teaching point above, but it shows how memset would now mean different things.

>> I disagree that syntactic constraints are needed, in a generic sense, and I find they often make the IR harder to work with. I'm not sure I understand your example, but I guess if you inline the call the lifetime argument could be syntactically something else, right? So it might become syntactically useful even if it wasn't before. The PHI case is the opposite. It started out syntactically useful, i.e., with alloca arguments, but after sinking there might not be a single alloca but a phi picking one. Arguably, the information is no different. A user could easily determine all the underlying objects and filter the allocas.
>
> It's because it can introduce UB. Consider this example:
>
>   p = alloca i32
>   store 0, p // analysis will say this is perfectly fine
>   f(p) // lifetime.start(p) is hidden in f(p)
>
> After inlining:
>
>   p = alloca i32
>   store 0, p
>   lifetime.start(p) // this makes the above store raise UB 

Given the proposed semantics of lifetime.start, the program already had UB prior to inlining. We might not have known it statically, but that is not an issue.
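
To make that concrete, here is a toy C model of a single stack slot. It is not how LLVM implements anything; `struct slot`, `store`, and the "starts dead" rule are assumptions made only to mirror the proposed semantics, under which an access before lifetime.start is counted as a violation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one alloca. This is an illustration, not LLVM code. */
struct slot {
    bool alive;      /* true while inside a lifetime range */
    int value;       /* contents, only meaningful while alive */
    int violations;  /* accesses made outside a lifetime range */
};

/* Assumption: a slot that is passed to lifetime.start somewhere in the
   program starts out dead. */
static struct slot slot_new(void) {
    struct slot s = { false, 0, 0 };
    return s;
}

static void lifetime_start(struct slot *s) {
    s->alive = true;
}

/* A store to a dead slot is recorded as a violation (UB in the model). */
static void store(struct slot *s, int v) {
    if (!s->alive) {
        s->violations++;
        return;
    }
    s->value = v;
}

/* The callee hides the marker behind a call. */
static void f(struct slot *p) {
    lifetime_start(p);
}

int violations_before_inlining(void) {
    struct slot p = slot_new();
    store(&p, 0);  /* already outside the lifetime range */
    f(&p);         /* lifetime.start(p) happens inside f */
    return p.violations;
}

int violations_after_inlining(void) {
    struct slot p = slot_new();
    store(&p, 0);
    lifetime_start(&p);  /* body of f, inlined */
    return p.violations;
}

/* Inlining changes what is visible statically, not what happens. */
void check_inlining_preserves_ub(void) {
    assert(violations_before_inlining() == violations_after_inlining());
}
```

In this model the store is flagged in both variants, so inlining only exposes the problem to a static analysis; it does not introduce it.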

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D93376