[PATCH] D41006: [SafepointIRVerifier] Allow non-dereferencing uses of unrelocated or poisoned PHI nodes

Fri Dec 15 09:33:50 PST 2017

anna requested changes to this revision.
anna added a comment.
This revision now requires changes to proceed.

Comments inline.

================
Comment at: lib/IR/SafepointIRVerifier.cpp:260
+/// R + R = R
+/// Where "+" - any operation that somehow merge pointers, U - unrelocated,
+/// R - relocated and P - poisoned. Merge with constant doesn't change anything.
----------------
DaniilSuchkov wrote:
> mkazantsev wrote:
> > DaniilSuchkov wrote:
> > > mkazantsev wrote:
> > > > Maybe instead of using the term "merge pointers" stick to the term "derived pointer"? You use both, and I don't catch what is the difference between them.
> > > By "deriving" I mean f(x) -> x, and by merge f(x1, ..., xN) -> x.
> > I guess "deriving" is actually f(x1) -> x. And what you call "deriving" is just a particular case of "merging". Again, why do we need both?
> Because for deriving there is only one rule: it changes nothing and for merge everything is a bit more complicated. Both "gep/bitcast is merge" and "phi/select is deriving" looks misleading.
> 
> I agree that formally f(x1) -> x is a particular case of f(x1, ..., xN) -> x, but how to name it so that it won't be confusing?
I tend to agree with Max here. We really cannot distinguish between both. 
How about we focus just on the fact that we have multiple sources here? IOW, don't worry about GEPs and bitcasts because they have single source base.
These GEPs and bitcasts don't change the behaviour of unrelocated/relocated, so they shouldn;t affect this discusson. 

================
Comment at: lib/IR/SafepointIRVerifier.cpp:242
+/// GC pointer may be in one of three states: relocated, unrelocated and
+/// poisoned.
+/// Relocated pointer may be used without any restrictions.
----------------
I don't think we want to introduce one more term 'poisoned' here. Specifically, poisoning has different meanings in the optimizer (and sometimes in the GC), and can be confusing.
It looks like `poisoned` pointers are just derived pointers which will be *lexically* from multiple pointers. So something like:
`gep, bitcasts` -> derived pointer from one base
`phis, selects` -> derived pointers that are lexically from multiple base pointers (I say lexically, because we can have phis/selects that statically have derived from exactly one pointer). 

Do we really need to make this distinction? I think it's more confusing. Pls see comment below on naming to make clearer.

================
Comment at: lib/IR/SafepointIRVerifier.cpp:262
+/// R - relocated and P - poisoned. Merge with constant pointer doesn't change
+/// anything. Merge of pointers by itself is always safe, deriving one pointer
+/// from another is always safe too. Derived pointer always has the same state
----------------
As mentioned, I dont think you need to explicitly state out the distinction here. Just maybe a single line at the beginning when explaining `MultiSourceDerivedPtr` (I prefer that instead of `Poisoned` naming), because we can have multiple sources and still not be poisoned.

================
Comment at: lib/IR/SafepointIRVerifier.cpp:267
+/// a) for phi we need to check status of each input *at the end of
+///    corresponding predicessor BB*.
+/// b) for other instructions we need to check status of each input *at the
----------------
Nit: predecessor

================
Comment at: lib/IR/SafepointIRVerifier.cpp:279
+///         \     |
+///      p2 = phi [p] [p1]
+///      p3 = phi [p] [p]
----------------
This is incorrect IR. We cannot have multiple *different* incoming phi values from exactly one predecessor.
With correct IR, I don't think we will have such false positives. If we do have, could you please add a test with FIXME?

https://reviews.llvm.org/D41006