[PATCH] D64215: Add a transform pass to make the executable semantics of poison explicit in the IR

Wed Jul 10 09:43:19 PDT 2019

nlopes added a comment.

In D64215#1577194 <https://reviews.llvm.org/D64215#1577194>, @reames wrote:

> In D64215#1572285 <https://reviews.llvm.org/D64215#1572285>, @nlopes wrote:
>
> > In D64215#1572070 <https://reviews.llvm.org/D64215#1572070>, @reames wrote:
> >
> > > In D64215#1571424 <https://reviews.llvm.org/D64215#1571424>, @nlopes wrote:
> > >
> > > > Just a shameless plug :)
> > > >  We've been half secretly working on Alive2 (https://github.com/AliveToolkit/alive2), which includes a plugin for `opt` that can check if an optimization is correct or not. Alive2 also has a standalone tool that accepts 2 IR files instead.
> > >
> > >
> > > I'd tried playing with Alive2 a while ago, and had trouble getting it to work.  Could you maybe update the readme (or other docs) with some instructions on how to use the standalone tool you mentioned?  I'd very much like to play with this.
> >
> >
> > I've just added a short description to the README file: https://github.com/AliveToolkit/alive2#running-standalone-translation-validation-tool-alive-tv
> >  It's fairly simple: it just takes 2 LLVM IR files. Let me know if you have questions or if you find bugs :)
> >
> > >> This tool implements the semantics of poison for many LLVM instructions, and already has some support for memory (which is quite hard to handle).
> > >>  Of course, what this patch does is not the same. This patch is more executable, while Alive2 requires Z3 to reason about the semantics (though it can also execute code very slowly).
> > > 
> > > I'd love to explore options for sharing the semantics here.  What form does Alive2 express them in?
> >
> > That's still a unsolved research problem. No one really knows how to share semantics still.
> >  The semantics in Alive2 are written in C++, using an embedded expression language. While it is potentially possible to reuse that somewhere else, it isn't trivial. See e.g. the ir/instr.cpp file.
>
>
> One thought on sharing.
>
> From what I can tell from a quick look at the code you mentioned, it looks like you're parsing IR into an expression language, then rewriting the expressions to propagate poison - in a fairly similar manner to this code, but over your expression language - and then translating that expression language to SMT.  Is that a good high level summary?

It's like this:

- LLVM IR -> Alive2 IR (close to LLVM, but smaller)
- Alive2 IR -> SMT expressions. As you say, it describes the poison semantics in terms of SMT expressions. The underlying engine handles propagation of undef.
- VcGen: combine SMT expressions to produce a theorem to be proved

> If you used this framework (not necessarily the pass, but the utilities) to rewrite the IR so as to make poison semantics explicit before converting to your expression language, you might be able to factor out a good portion of the poison reasoning from Alive2.  It'd be a fairly major design though, so not sure if that's worth it to you or not.

We could go back from SMT expressions to LLVM IR, that's not very difficult (for arithmetic and other simple things, at least). The complication is the vcgen bit. Doable, but it's quite some work.
I think sharing semantics is not an immediate goal for us. We can share high-level semantics through LangRef and fix it where needed.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64215/new/

https://reviews.llvm.org/D64215