[PATCH] D93376: [LangRef] Clarify the semantics of lifetime intrinsics

Sun Dec 27 01:45:28 PST 2020

nhaehnle added a comment.

In D93376#2471740 <https://reviews.llvm.org/D93376#2471740>, @RalfJung wrote:

>> It's fair to say that this patch is a change of semantics by forbidding the use of the lifetime intrinsics in certain places where they were previously allowed.
>
> They were allowed by the LangRef but not supported in the code

You mean not supported //by codegen//, right? It would still have been possible for somebody to use them in some intermediate-only use of LLVM. I agree with your more important points, though.

>> The one change I would make in that proposal is to say that "if there is a lifetime marker use, an alloca is constrained". As long as the markers must be in the same IR function as the alloca itself, and phi/select are syntactically excluded, this determination can be made reliably (if not easily, due to the bitcast mess).
>
> So the "is" here is a purely syntactic check? That sounds pretty similar to what the patch says here, right (modulo the details of the meaning of "constrained", and of course saying what exactly constitutes a "use")?

Right. As I said, I think the patch is pointing in the right direction. It would be good to more explicitly talk about "constrained" vs. "unconstrained" allocas in the LangRef part.

> The purely syntactic check fails, however, for allocas that are not following the syntactic restriction proposed in the patch (or a slight relaxation of this restriction to support a few more operations on the "use path" from the alloca to the intrinsic).

That's why I'd hope that the IR verifier checks will stick around. Also, //slight// relaxations of the restriction can still be checked syntactically, but the cost becomes more expensive. For example, when `lifetime.start(bitcast(alloca))` is allowed, determining whether an alloca is constrained requires searching not just the users of the alloca, but also the users of any bitcasts of the alloca. This makes the syntactic check more expensive, and it's why I brought up the idea of an explicit flag on the alloca instruction.

Though relaxing the syntactic rules has other downsides. For example: if only the form `lifetime.start(alloca)` is allowed, then preventing the introduction of phis or selects in there (as in: `lifetime.start(select(alloca, ...))` is easy. If `lifetime.start(bitcast(alloca))` is allowed as well, then preventing the introduction of phis or selects between the bitcast and the alloca (as in: `lifetime.start(bitcast(select(alloca, ...)))` becomes a burden.

That's the part that makes me most uneasy about this patch.

Really, it'd be best if the syntactic rules could be kept as strict as possible.

>> Ideally, there would be a flag on the alloca instruction itself to distinguish between constrained and unconstrained. As it is, the flag is implicitly given as "has a (indirect) lifetime.start use". That's not great, and perhaps you could make a follow-up patch that introduces such a flag explicitly, but I think it's workable either way.
>
> Note that it's not just a boolean flag. For constrained allocas, it also matters with which other constrained allocas their lifetime can potentially overlap. This determines which allocations might overlap in memory, and which may not. So even an explicit boolean flag would be insufficient to avoid "magic syntactical rules".

I do believe a boolean flag is sufficient for what's pragmatically required. I see a number of different kinds of questions that we want to be answered:

1. Is a store/load through the alloca'd pointer UB without a lifetime.start?
2. Is the alloca'd pointer known to compare unequal to a different alloca'd pointer? (for IR transforms)
3. Which allocas can be overlapped on the stack? (for codegen)

Only the first one really needs a definitive answer encoded in the IR, and it's a simple Yes/No question.

For the second question, the actual lifetime ranges is important, but a boolean flag can be used to quickly determine whether the question must be answered conservatively or not, and I'd wager that that's sufficient for most practical purposes.

The third question also needs lifetime ranges, but I don't see why you'd want more than a boolean flag on the alloca instruction here as part of the IR itself. It feels more natural to derive the information when needed from the constrained/unconstrained distinction and placement of lifetime markers.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93376/new/

https://reviews.llvm.org/D93376