[PATCH] D99135: [deref] Implement initial set of inference rules for deref-at-point
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 24 09:12:31 PDT 2021
reames added inline comments.
================
Comment at: llvm/lib/IR/Value.cpp:741
+ // *) An alloca is deallocated when the function returns
+ if (isa<GlobalValue>(V) || isa<Constant>(V) || isa<AllocaInst>(V))
+ return false;
----------------
nlopes wrote:
> nikic wrote:
> > nlopes wrote:
> > > reames wrote:
> > > > nlopes wrote:
> > > > > reames wrote:
> > > > > > reames wrote:
> > > > > > > nlopes wrote:
> > > > > > > > an Alloca can be killed with the lifetime intrinsics. Storing to an alloca after lifetime_end is UB.
> > > > > > > Right, but how does that affect dereferenceability? The memory is still dereferenceable. The contents may be undefined, but the access won't fault.
> > > > > > >
> > > > > > > p.s. Lifetime intrinsics are very badly specified, and are inconsistent across the optimizer - including in the existing deref code. I really don't want them to be a blocker here.
> > > > > > JFYI, the wording in the LangRef on semantics of a dead object (after lifetime end) was recently changed in c821ef4. I objected to the wording at the time, though the patch appears not to have been adjusted. This is why.
> > > > > >
> > > > > > lifetime intrinsics do not change dereferenceability. The relevant wording from c821ef4 needs removed from the LangRef. It is wrong. Both with current and proposed semantics.
> > > > > I agree that doing a store after lifetime_end intrinsic doesn't cause a crash (ATM), but it may write to another object.
> > > > > For example:
> > > > > %p = alloca i8
> > > > > %q = alloca i8
> > > > >
> > > > > lifetime_start %p
> > > > > ...
> > > > > lifetime_end %p
> > > > > lifetime_start %q
> > > > > store i8 1, %q
> > > > > store i8 0, %p ; UB: %p is dead
> > > > >
> > > > > %v = load i8, %q
> > > > >
> > > > > What's the value of %v? In practice, it's going to be 0 because %p & %q will end up allocated at the same address. And this is why storing after lifetime_end must be UB.
> > > > >
> > > > > So now it depends on how this analysis is used. My concern is that it may be used to move a store past lifetime_end and then we have a miscompilation. Because if a pointer is dereferenceable for the size of the store and it's guaranteed to be alive throughout the function, then we can conclude that loads & stores to that pointer can be moved freely.
> > > > >
> > > > > A simple solution is to check if the particular alloca is an argument to a lifetime_end intrinsic. More complex ones require an additional argument like a BB or specific instruction, such that the query becomes: is the object still alive at this BB/instruction?
> > > > This has turned into a discussion of the lifetime intrinsics. I assert that is unrelated to this change (as the implementation here exactly matches the existing deref implementation for this case). I ask that this discussion be removed from this review thread and taken elsewhere.
> > > >
> > > > I'll also note that you are commingling concerns in your discussion. Dereferenceability does NOT imply it's safe to insert a store. That's a higher level property with additional concerns (e.g. concurrency) which need to be satisfied.
> > > Unfortunately we need to settle on the lifetime semantics if we want to have code inferring properties about allocas here.
> > > BTW, my concern is about movement of load/stores, not of introduction.
> > >
> > > This `canFree` is called from `getPointerDereferenceableBytes()`, which will infer for how many bytes a pointer is dereferenceable in the whole function. Not just in a particular BB. Hence once the dereferenceable attribute is inferred, "free" movement of memory accesses becomes legal. But it's not, as we have to respect the boundaries of lifetime_start/end.
> > > (I'm even ignoring segmented stacks to avoid extra complication).
> > >
> > > Fundamentally, I think that if we go with a definition of dereferenceability per function, we are always going to have our hands tied. We may need to refine the abstraction and do it per BB, for example. Without knowing what every single user of `getPointerDereferenceableBytes()` does with the result of the analysis, I'm not comfortable declaring an alloca'ed object as always dereferenceable.
> > Movement of accesses across lifetime intrinsics is already prevented because lifetime intrinsics are modeled as clobbering the location.
> >
> > I really don't think we should get lifetime intrinsics caught up in this if we can avoid it. Requiring context to handle allocas complicates things a lot.
> >
> > Do you have some specific example of a miscompile that would occur if lifetime intrinsics are not modeled here?
> That's a good point, thanks! Since the intrinsics clobber the objects then movement is already _likely_ prevented. I tried and couldn't trigger a miscompilation by hand in ~15 min of fiddling with IR.
>
> I'm still uncomfortable declaring allocas dereferenceable in the whole function when they are not. Who knows what the users of `getPointerDereferenceableBytes` will do with this information (now or in the future).. We don't have documentation about these analyses.
>
> Anyway, to unblock this, I'll just suggest a comment is added to `getPointerDereferenceableBytes` regarding allocas. Meaning, just because an alloca'ed pointer isn't escaped and is "dereferenceable" for x bytes, it doesn't mean we can introduce a store to it (as one would suppose!). One needs to consider lifetime as well.
Nuno,
I will not put an incorrect comment into the source code to reflect a mistaken understanding of the code. As explained by both myself and Nikita previously, you are commingling concerns of dereferenability, concurrency, and memory aliasing.
I have already asked you to move the discussion of lifetime intrinsics elsewhere. As I asserted before, the code as written here matches the existing implementation it's replacing. If you want to change that implementation or the semantics thereof, feel free to put forth proposals. I will not.
I am frustrated by this sub-thread. TBH IMO, it seems like you are not discussing the issue *in this patch* in good faith. It seems like you're trying to insist I broaden the scope of work substantially. That bothers me.
================
Comment at: llvm/lib/IR/Value.cpp:756
+ // on F's behalf either.
+ if (A->hasNoFreeAttr() && A->hasNoAliasAttr())
+ return false;
----------------
nlopes wrote:
> nlopes wrote:
> > reames wrote:
> > > nlopes wrote:
> > > > reames wrote:
> > > > > nlopes wrote:
> > > > > > I don't understand this argument of why hasNoSync isn't needed.
> > > > > > All argument attributes are with respect to what the function does, not the environment. It would be impossible to infer attributes if the environment was considered.
> > > > > >
> > > > > > My understanding is that readonly implies nofree. Can't see why nofree would have those special semantics
> > > > > The focus here isn't on nofree implying nosync. It's noalias which implies nosync. In order to for there to be a written (freed) copy of the pointer accessible in the other thread, the noalias fact must be false. If so, that's already UB.
> > > > Ok, I confess I'm less familiar with the noalias attribute. Though memcpy's signature is something like:
> > > > declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #1
> > > >
> > > > The caller of memcpy may have had the src/dst pointers escaped to other threads. Even then, both arguments are marked as noalias because the C spec says the arguments cannot overlap.
> > > > This contradicts the global definition of noalias you are suggesting.
> > > I'm also fairly unfamiliar with noalias. I'm going off existing wording, and what the optimizer appears to actually do.
> > >
> > > I don't think it actually is contradictory. Your concern involves a *writeable* copy of the pointer. The wording of noalias talks about *access*. It doesn't appear to be UB to have a writeable copy of the pointer on another thread (or even within the callee). It does appear to be UB to *actually write to it* during the execution of the callee.
> > >
> > > Note that this definition does make noalias argument attributes very hard to infer. I'll note that we only have noalias return inference which might be exactly because of that. :) I'm not sure.
> > I see; thanks for the explanation. LLVM doesn't seem to assume that noalias pointers are distinct indeed: https://gcc.godbolt.org/z/KTfnzMMj1
> > So I'm good on this now :)
> Just re-read LangRef a few times again regarding noalias (https://llvm.org/docs/LangRef.html#parameter-attributes).
> I can't interpret what's written there as anything related with pointers not being writable in another thread. It just says that if I access object o1, and that object is pointed to by a noalias argument %p, then object o1 can't be accessed through any other pointer that is not based on %p. There's no word about concurrent accesses.
> So either this code is incorrect or LangRef needs updating.
I'm going to move this inference rule to it's own review. We probably need input from Johannes or someone more knowledgeable on noalias than myself. I want to unblock the rest of the review.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99135/new/
https://reviews.llvm.org/D99135
More information about the llvm-commits
mailing list