[PATCH] D93376: [LangRef] Clarify the semantics of lifetime intrinsics

Mon Dec 21 04:54:41 PST 2020

RalfJung added a comment.

> Given the proposed semantic of lifetime.start, the program had UB semantics prior to inlining, we might not have known it statically but that is not an issue.

The question is, *how* is it UB? To define UB precisely, we need an operational specification -- something that could, in principle, be implemented in a (complicated) interpreter. But for this example, which instruction raises the UB? It cannot be the alloca, since when the alloca is executed we have no way of knowing whether its return value will be passed to a lifetime.start in the future! So does UB only arise when the lifetime.start happens? That would mean that the store itself is fine, but when lifetime.start happens we check whether there was already a load/store and raise UB at that point. I am not sure that this gives the right behavior.
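To make the "raise UB at lifetime.start" reading concrete, here is a minimal Python sketch of such an interpreter. It is a toy model only: the `run` function, the tuple encoding of instructions, and the bookkeeping sets are all hypothetical and do not correspond to anything in LLVM or in this patch.

```python
def run(program):
    """Execute a list of (op, name) pairs in a toy model.

    Returns the index of the instruction that raises UB, or None if
    no instruction does. All names here are illustrative assumptions.
    """
    accessed = set()  # slots that have been loaded from or stored to
    started = set()   # slots on which lifetime.start has already run
    for index, (op, name) in enumerate(program):
        if op == "alloca":
            # Nothing can be checked here: the interpreter does not know
            # whether `name` will be passed to lifetime.start later on.
            continue
        if op in ("load", "store"):
            accessed.add(name)
        elif op == "lifetime.start":
            # The earliest point at which the prior access is detectable.
            if name in accessed and name not in started:
                return index
            started.add(name)
        elif op == "lifetime.end":
            continue
        else:
            raise ValueError(f"unknown op: {op}")
    return None


# The store at index 1 executes without complaint; UB is only
# reported retroactively at index 2.
ub_at = run([("alloca", "x"), ("store", "x"), ("lifetime.start", "x")])
```

In this model the store is accepted when it executes and the program is only declared UB two steps later, which is exactly the behavior questioned above.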

It also doesn't solve the issue of defining alloca in a way that allocations can overlap.

> As you noted, there are "constrained" and "unconstrained" allocas; if there might be a lifetime marker use, an alloca is constrained.

The thing is, with a purely operational interpretation of lifetime.start, whether an alloca is "constrained" depends on *what happens in the future*. This is not a well-defined semantics (or at least, it is non-obvious how to make it one).

To give one example where "depending on the future" raises some nasty questions that one would really rather not have to answer, consider a program like:

  x = alloca 4
  y = alloca 4
  if (x == y) {
    lifetime.start(x); lifetime.end(x);
    lifetime.start(y); lifetime.end(y);
  }

Now if the two pointers may be equal, then their lifetimes are disjoint so they cannot be equal. But if they cannot be equal, the if is never executed so the program is equivalent to

  x = alloca 4
  y = alloca 4

But in this program, they can be equal.
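The circularity can be spelled out as a search for a self-consistent answer to "may x and y be equal?". The Python sketch below is only an encoding of the two steps of the argument above. The function names are hypothetical, and nothing here models actual LLVM behavior.

```python
def implied_may_be_equal(assumed_may_be_equal):
    """Given an assumed answer, return the answer the argument derives."""
    if assumed_may_be_equal:
        # The branch can execute, so both allocas reach lifetime markers;
        # the argument above then concludes they cannot be equal.
        return False
    # The branch is dead code, leaving two plain allocas,
    # which (per the argument above) can be equal.
    return True


def consistent_answers():
    """Return every assumption that agrees with its own consequence."""
    return [a for a in (True, False) if implied_may_be_equal(a) == a]
```

Under this encoding `consistent_answers()` returns an empty list: neither assumption survives its own consequences, which is why a definition that depends on future lifetime markers is hard to make well-defined.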

> Also, I did not suggest to change the semantics without an RFC and everything else that is needed. You suggest otherwise in your response. I did propose alternatives to fix your bug, but most of my responses show how the restrictions you want to put in place are making (potentially existing) optimizations invalid.

The thing is, in our view (aqjune, nlopes, and me) you *are* changing the semantics by allowing the lifetime intrinsics to be used on things like "malloc". From all we can tell, the intrinsics were never meant to be used with anything but alloca, so for all intents and purposes, they currently only support alloca. All this patch does is make this currently-implicit assumption explicit. Your responses show that this is really long overdue, since different parts of the compiler currently seem to assume different semantics.

The syntactic restriction we propose is only new in the sense that it hasn't been written down yet. But conceptually, I'd argue that the intrinsics have always been restricted to be "for alloca only", and I think the history of their introduction and how they currently work confirms this.

Now, maybe there are good reasons to support these intrinsics more generally. But IMO the best way to go about this is to first document the limited cases the intrinsics were meant (and are implemented) to work with, and once we have a solid base to work on, we can consider extending the scope of what the intrinsics can be used for.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93376/new/

https://reviews.llvm.org/D93376