[PATCH] D99135: [deref] Implement initial set of inference rules for deref-at-point

Thu Mar 25 11:56:22 PDT 2021

reames added inline comments.

================
Comment at: llvm/lib/IR/Value.cpp:741
+  // *) An alloca is deallocated when the function returns
+  if (isa<GlobalValue>(V) || isa<Constant>(V) || isa<AllocaInst>(V))
+    return false;
----------------
nlopes wrote:
> reames wrote:
> > nlopes wrote:
> > > nikic wrote:
> > > > nlopes wrote:
> > > > > reames wrote:
> > > > > > nlopes wrote:
> > > > > > > reames wrote:
> > > > > > > > reames wrote:
> > > > > > > > > nlopes wrote:
> > > > > > > > > > an Alloca can be killed with the lifetime intrinsics. Storing to an alloca after lifetime_end is UB.
> > > > > > > > > Right, but how does that affect dereferenceability?  The memory is still dereferenceable.  The contents may be undefined, but the access won't fault.
> > > > > > > > > 
> > > > > > > > > p.s. Lifetime intrinsics are very badly specified, and are inconsistent across the optimizer - including in the existing deref code.  I really don't want them to be a blocker here.  
> > > > > > > > JFYI, the wording in the LangRef on semantics of a dead object (after lifetime end) was recently changed in c821ef4.  I objected to the wording at the time, though the patch appears not to have been adjusted.  This is why.  
> > > > > > > > 
> > > > > > > > lifetime intrinsics do not change dereferenceability.  The relevant wording from c821ef4 needs removed from the LangRef.  It is wrong.  Both with current and proposed semantics.
> > > > > > > I agree that doing a store after lifetime_end intrinsic doesn't cause a crash (ATM), but it may write to another object.
> > > > > > > For example:
> > > > > > >   %p = alloca i8
> > > > > > >   %q = alloca i8
> > > > > > >   
> > > > > > >   lifetime_start %p
> > > > > > >   ...
> > > > > > >   lifetime_end %p
> > > > > > >   lifetime_start %q
> > > > > > >   store i8 1, %q
> > > > > > >   store i8 0, %p ; UB: %p is dead
> > > > > > >   
> > > > > > >   %v = load i8, %q
> > > > > > > 
> > > > > > > What's the value of %v? In practice, it's going to be 0 because %p & %q will end up allocated at the same address. And this is why storing after lifetime_end must be UB.
> > > > > > > 
> > > > > > > So now it depends on how this analysis is used. My concern is that it may be used to move a store past lifetime_end and then we have a miscompilation. Because if a pointer is dereferenceable for the size of the store and it's guaranteed to be alive throughout the function, then  we can conclude that loads & stores to that pointer can be moved freely.
> > > > > > > 
> > > > > > > A simple solution is to check if the particular alloca is an argument to a lifetime_end intrinsic. More complex ones require an additional argument like a BB or specific instruction, such that the query becomes: is the object still alive at this BB/instruction?
> > > > > > This has turned into a discussion of the lifetime intrinsics.  I assert that is unrelated to this change (as the implementation here exactly matches the existing deref implementation for this case).  I ask that this discussion be removed from this review thread and taken elsewhere.
> > > > > > 
> > > > > > I'll also note that you are commingling concerns in your discussion.  Dereferenceability does NOT imply it's safe to insert a store.  That's a higher level property with additional concerns (e.g. concurrency) which need to be satisfied.
> > > > > Unfortunately we need to settle on the lifetime semantics if we want to have code inferring properties about allocas here.
> > > > > BTW, my concern is about movement of load/stores, not of introduction.
> > > > > 
> > > > > This `canFree` is called from `getPointerDereferenceableBytes()`, which will infer for how many bytes a pointer is dereferenceable in the whole function. Not just in a particular BB. Hence once the dereferenceable attribute is inferred, "free" movement of memory accesses becomes legal. But it's not, as we have to respect the boundaries of lifetime_start/end.
> > > > > (I'm even ignoring segmented stacks to avoid extra complication).
> > > > > 
> > > > > Fundamentally, I think that if we go with a definition of dereferenceability per function, we are always going to have our hands tied. We may need to refine the abstraction and do it per BB, for example. Without knowing what every single user of `getPointerDereferenceableBytes()` does with the result of the analysis, I'm not comfortable declaring an alloca'ed object as always dereferenceable.
> > > > Movement of accesses across lifetime intrinsics is already prevented because lifetime intrinsics are modeled as clobbering the location.
> > > > 
> > > > I really don't think we should get lifetime intrinsics caught up in this if we can avoid it. Requiring context to handle allocas complicates things a lot.
> > > > 
> > > > Do you have some specific example of a miscompile that would occur if lifetime intrinsics are not modeled here?
> > > That's a good point, thanks! Since the intrinsics clobber the objects then movement is already _likely_ prevented. I tried and couldn't trigger a miscompilation by hand in ~15 min of fiddling with IR.
> > > 
> > > I'm still uncomfortable declaring allocas dereferenceable in the whole function when they are not. Who knows what the users of `getPointerDereferenceableBytes` will do with this information (now or in the future).. We don't have documentation about these analyses.
> > > 
> > > Anyway, to unblock this, I'll just suggest a comment is added to `getPointerDereferenceableBytes` regarding allocas. Meaning, just because an alloca'ed pointer isn't escaped and is "dereferenceable" for x bytes, it doesn't mean we can introduce a store to it (as one would suppose!). One needs to consider lifetime as well.
> > Nuno,
> > 
> > I will not put an incorrect comment into the source code to reflect a mistaken understanding of the code.  As explained by both myself and Nikita previously, you are commingling concerns of dereferenability, concurrency, and memory aliasing.  
> > 
> > I have already asked you to move the discussion of lifetime intrinsics elsewhere.  As I asserted before, the code as written here matches the existing implementation it's replacing.  If you want to change that implementation or the semantics thereof, feel free to put forth proposals.  I will not.  
> > 
> > I am frustrated by this sub-thread.  TBH IMO, it seems like you are not discussing the issue *in this patch* in good faith.  It seems like you're trying to insist I broaden the scope of work substantially.  That bothers me.
> > 
> The whole point of your RFC & patches was to fix bugs in LLVM, not to match the implementation it is replacing AFAIU.
> 
> Anyway, since apparently I'm acting in bad faith regarding this patch, I'll step down immediately.
To close the loop here for third parties reading along, Nuno and I had an offline conversation on this sub-point today.

This discussion did trigger a clarification response to the original RFC to be much more specific about the scope of the proposal.  That seems to have been the original confusion which let us down this path of miscommunication.  

The conversation also triggered me to change my handling of a related discussion thread about changes which recently changed the semantics of lifetime intrinsics. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99135/new/

https://reviews.llvm.org/D99135