[PATCH] D64215: Add a transform pass to make the executable semantics of poison explicit in the IR

Fri Jul 5 16:32:37 PDT 2019

reames marked an inline comment as done.
reames added a comment.

In D64215#1571424 <https://reviews.llvm.org/D64215#1571424>, @nlopes wrote:

> Just a shameless plug :)
>  We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for `opt` that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.

I'd tried playing with Alive2 a while ago, and had trouble getting it to work.  Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned?  I'd very much like to play with this.

> This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
>  Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).

I'd love to explore options for sharing the semantics here.  What form does Alive2 express them in?

> I guess this is more a FYI.  If you want to support a significant chunk of LLVM, it's going to be a lot of work. Alive2 already has ~1 year of development, and has already found a few bugs in LLVM.

I'm not expecting this to ever get 100% coverage, but I do have a backlog of miscompiles I'm trying to reduce.  :)

================
Comment at: lib/Transforms/Instrumentation/PoisonChecking.cpp:41
+//   are well defined on the specific input used.
+// - Finding/confirming poison specific miscompiles by checking the poison
+//   status of an input/IR pair is the same before and after an optimization
----------------
sanjoy wrote:
> reames wrote:
> > reames wrote:
> > > sanjoy wrote:
> > > > This will have false positives as long as `undef` is a thing right?  Unless the instrumentation added by this pass will produce poison if *any* value of `undef` produces poison (naively it seems this would require calling into a SAT solver at runtime)?
> > > I don't actually follow what you're trying to say.  Can you maybe rephrase w/an example which concerns you?  
> > > 
> > > For context, I'm focusing on poison for the moment, but I think a completely reasonable thing would be to have a parallel approach for undef (1 bit per bit of original value), and then cross propagate if needed.  I have a much weaker understanding of the undef rules, so I started with poison since that seemed a bit more clearly specified and recently discussed.  
> > > 
> > For context, the part which has me confused was your claim of a false positive.  I definitely see how undef can cause a false negative if not accounted for in the poison tracking.
> By "false positive" I meant "incorrectly concluding that an optimization is buggy because this pass fails to detect poison before the optimization but is able to detect poison after the optimization".
> 
> Say you have this program:
> 
> ```
> int c = INT_SMAX +nsw undef;
> ```
> 
> IIUC you'll instrument this to:
> 
> ```
> int c, bool ov = add_with_overflow(INT_SMAX, undef)
> if (ov) trap();
> ```
> 
> which will for only some values of `undef`.  In particular, it may be (depending on the phase of the moon) that running this program does not trap.
> 
> But a pass can legitimately transform the program to:
> 
> ```
> int c = INT_SMAX +nsw 1;  // fold undef to 1
> ```
> 
> which will deterministically trap after instrumentation.
> 
> So if we are unlucky it will "look like" the transformation has a bug -- before the xform the instrumented IR did not trap, but after the xform it traps.
Yep, completely agree.  This is a false positive for the "find problematic transform" case, but a false negative for the "does this program execute UB" case.  I'd added a comment about this to the source, and this is the case that made me realize the false negative/positive terminology was super confusing without being specific about the use case.

In other words, I'm okay with this for the moment.  :)  Future work may explore this further.  

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64215/new/

https://reviews.llvm.org/D64215