[LLVMdev] lifetime.start/end clarification

Wed Nov 5 13:06:53 PST 2014

----- Original Message -----
> From: "Nick Lewycky" <nlewycky at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Reid Kleckner" <rnk at google.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, November 5, 2014 2:59:54 PM
> Subject: Re: [LLVMdev] lifetime.start/end clarification
> 
> On 5 November 2014 12:48, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> ----- Original Message -----
> > From: "Nick Lewycky" < nlewycky at google.com >
> > To: "Hal Finkel" < hfinkel at anl.gov >
> 
> 
> > Cc: "Reid Kleckner" < rnk at google.com >, "LLVM Developers Mailing
> > List" < llvmdev at cs.uiuc.edu >
> > Sent: Wednesday, November 5, 2014 2:39:38 PM
> > Subject: Re: [LLVMdev] lifetime.start/end clarification
> > 
> > On 5 November 2014 11:51, Hal Finkel < hfinkel at anl.gov > wrote:
> > 
> > 
> > ----- Original Message -----
> > > From: "Reid Kleckner" < rnk at google.com >
> > > To: "Philip Reames" < listmail at philipreames.com >
> > > Cc: "LLVM Developers Mailing List" < llvmdev at cs.uiuc.edu >
> > > Sent: Wednesday, November 5, 2014 12:54:30 PM
> > > Subject: Re: [LLVMdev] lifetime.start/end clarification
> > > 
> > > This seems fine to me. The optimizer can (soundly) conclude that %p
> > > is dead after the "lifetime.end" (for the two instructions), and
> > > dead before the "lifetime.start" (for the *single* instruction in
> > > that basic block, *not* for the previous BB). This seems like the
> > > proper result for this example; am I missing something?
> > > 
> > > 
> > > What if I put that in a loop, unroll it once, and prove that the
> > > lifetime.start is unreachable? We would end up with IR like:
> > > 
> > > 
> > > loop:
> > >   ... use %p
> > >   call void @lifetime.end( %p )
> > >   ... use %p
> > >   call void @lifetime.end( %p )
> > >   br i1 %c, label %loop, label %exit
> > > 
> > > 
> > > Are the second uses of %p uses of dead memory?
> > > 
> > > 
> > > We have similar issues if the optimizer somehow removes the lifetime
> > > end and keeps the start:
> > > 
> > > 
> > > 
> > > loop:
> > >   call void @lifetime.start( %p )
> > >   ... use %p
> > >   call void @lifetime.start( %p )
> > >   ... use %p
> > >   br i1 %c, label %loop, label %exit
> > > 
> > > 
> > > For this reason, it has been suggested that these intrinsics are
> > > horribly broken,
> > 
> > I disagree; these just seem like bugs. lifetime_start is marked as
> > IntrReadWriteArgMem, but this is not really sufficient to prevent
> > its removal should the memory be subsequently unused. Plus, there
> > are other places that just delete the lifetime intrinsics, like
> > this in lib/Transforms/Scalar/SROA.cpp:
> > 
> > // FIXME: Currently the SSAUpdater infrastructure doesn't reason about
> > // lifetime intrinsics and so we strip them (and the bitcasts+GEPs
> > // leading to them) here. Eventually it should use them to optimize the
> > // scalar values produced.
> > if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
> >   assert(II->getIntrinsicID() == Intrinsic::lifetime_start ||
> >          II->getIntrinsicID() == Intrinsic::lifetime_end);
> >   II->eraseFromParent();
> >   continue;
> > }
> > 
> > We need to go through the various places that might delete these
> > intrinsics and fix them. The same will be true with any other
> > mechanism.
> > 
> > 
> > 
> > It removes them because it does (or will) remove the associated
> > alloca anyways as part of turning loads and stores into SSA. There's
> > no need for lifetime intrinsic equivalents on SSA given that we have
> > use-lists and tools like the dominator tree.
> 
> Good point. I did not think too carefully about what the code was
> doing; I was rather pointing out that there is special-case code
> dealing with lifetime intrinsics that needs to be looked at, and
> code that does not deal specifically with lifetime intrinsics that
> may have to do so. I certainly agree that we don't need them for SSA
> values.
> 
> For the code in question, I don't see why you wouldn't just RAUW the
> alloca with undef and then let DCE remove the intrinsics (this is,
> however, somewhat off-topic for this thread).
> 
> > 
> > 
> > 
> > 
> > > and both should be remodeled to just mean "store of
> > > undef bytes to this memory".
> > 
> > This is a bad idea. Stores of undef bytes can be removed if we can
> > prove that the address is dereferenceable. And if they can't be
> > removed, then they have side effects that can't ever be removed.
> > Please don't do that.
> > 
> > I think the idea is to define them with the semantics of storing
> > undef bytes, but keep them implemented as intrinsic function calls,
> > so that the optimizer does not simply delete them. It's a way of
> > communicating that these are deliberate and valuable stores to
> > undef, as opposed to stores of SSA values that were later found to
> > be undef.
> 
> I did not get that impression, and if that is what was proposed, I
> don't see how that differs, in practice, from what we have now.
> 
> 
> 
> The LangRef definition looks like that plus some special rules about
> how *all* uses before the start are dead. *The* start? What about
> multiple starts? What does it mean to have start/end/start/end? Can
> you use an alloca normally, then lifetime.start it? According to
> LangRef, no, *all* uses before the start may be nuked.

Ah, yes, good point.

> It's a weird
> rule, but it's intended to support the use case of stack slot
> colouring, where your starts and ends are paired and tightly wrap
> the point where the variable is live.

Yes.
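
To make the paired use case concrete, here is a toy sketch in C++ of how
tightly paired markers let two variables share a stack slot. This is only a
straight-line model under stated assumptions, not LLVM's stack colouring
pass: the Event type and colourSlots function are made up for illustration,
and every start is assumed to be matched by exactly one later end.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical straight-line "program": each event is a lifetime marker.
enum class Kind { Start, End };
struct Event {
  Kind kind;
  std::string var;
};

// Assign each variable a stack slot, reusing a slot once its previous
// owner has reached its lifetime end. Assumes paired start/end markers.
std::map<std::string, std::size_t>
colourSlots(const std::vector<Event> &events) {
  std::map<std::string, std::size_t> slotOf; // variable -> slot index
  std::vector<bool> inUse;                   // slot index -> currently live?
  for (const Event &e : events) {
    if (e.kind == Kind::Start) {
      std::size_t slot = 0;
      while (slot < inUse.size() && inUse[slot])
        ++slot; // find the first free slot
      if (slot == inUse.size())
        inUse.push_back(false); // no free slot, grow the frame
      inUse[slot] = true;
      slotOf[e.var] = slot;
    } else {
      auto it = slotOf.find(e.var);
      assert(it != slotOf.end() && "End without a preceding Start");
      inUse[it->second] = false; // slot becomes reusable
    }
  }
  return slotOf;
}
```

With disjoint lifetimes (start a, end a, start b, end b) both variables land
in slot 0; if the lifetimes overlap, they get distinct slots.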

> 
> If you remove that oddity, lifetime.start and lifetime.end become
> semantically equivalent and both just mean "store undef there" and
> become straightforward to reason about, though harder to use for
> stack slot colouring (it becomes a bidirectional data flow problem,
> which is hard on compile time). At this stage, I think the tradeoff
> is worthwhile.

I'm unsure, but could easily agree.
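
For reference, the simplified semantics can be modelled in a few lines of
C++. This is a toy memory model, assuming "undef" can be represented as an
empty std::optional; the Memory type and its member names are hypothetical
and nothing here is LLVM API.

```cpp
#include <map>
#include <optional>
#include <string>

// Toy memory model: each named slot holds either a value or "undef"
// (modelled here as std::nullopt).
struct Memory {
  std::map<std::string, std::optional<int>> slots;

  void store(const std::string &p, int v) { slots[p] = v; }

  // Under the simplified rule, start and end are the same operation:
  // both overwrite the slot with undef.
  void lifetimeStart(const std::string &p) { slots[p] = std::nullopt; }
  void lifetimeEnd(const std::string &p) { slots[p] = std::nullopt; }

  // A load from a slot with no defined value yields undef.
  std::optional<int> load(const std::string &p) const {
    auto it = slots.find(p);
    if (it == slots.end())
      return std::nullopt;
    return it->second;
  }
};
```

In this model a load that follows either marker, with no intervening store,
yields undef, which matches the use-after-scope reading of the two loop
examples quoted earlier in the thread.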

 -Hal

> 
> 
> 
> Thanks again,
> Hal
> 
> 
> 
> > 
> > 
> > 
> > -Hal
> > 
> > > If "use %p" is a load, for example, in
> > > both cases we can safely say it returns undef, because it's a
> > > use-after-scope.
> > > 
> > > 
> > > I think coming up with a new representation with simpler semantics is
> > > the way to go. One allocation or lifetime start, and one
> > > deallocation and end.
> > > 
> > > 
> > > Implementing this in Clang will be tricky, though. Clang's IRGen is
> > > supposed to be a dumb AST walk, but it has already strayed from that
> > > path. Needs more thought...
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > > 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > 
> > 
> > 
> > 
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory