[LLVMdev] How best to represent assume statements in LLVM IR?

Tue Oct 2 16:55:47 PDT 2012

The history of this issue is in http://llvm.org/PR810 - please add any
new insight/ideas there (& read Sabre's (Chris Lattner) ye olde txt
file (mentioned in the bug) on his thoughts on the issue too).

On Mon, Oct 1, 2012 at 1:39 PM, Philip Reames <listmail at philipreames.com> wrote:
> Does anyone have any suggestions on how to best represent an assumption
> statement(*) in IR?  In particular, I want to expose the information implied
> by the assumption to the optimization passes without emitting code (after
> optimization) to check the assumption itself.  I've tried a couple of
> options so far, and none have gotten me quite the right semantics.  Has
> anyone else implemented this or does anyone have suggestions on how best to
> do this?
>
>
> ** Background **
>
> By "assumption statement" I mean the standard "assume(boolean expression)"
> construct that shows up in program verification.  It essentially just
> indicates a programmer (or tool) provided fact that holds at that particular
> line.  It's very similar to an assertion statement except that the condition
> is not checked at runtime. Conceptually, an assumption statement has no
> runtime semantics, but I want to be able to leverage the assumption to prune
> any unnecessary conditionals downstream from the assumption.
>
> To give an example in pseudo code, I'd like to start with something like
> this:
> assume x == 5
> if x != 5 { ...}
> use(x)
>
> And after optimization end up with this:
> use(5)
>
> ** End Background **
>
>
> The general strategy I've been using is to represent the assumption as a
> branch on the conditional value and rely on the optimizer to clean it up.
> The options I've considered within the (assumed not taken) branch are:
> - "unreachable" - With this choice, the optimizer nicely removes all of the
> branches leading to the unreachable statement - i.e. the assumption has no
> cost - but downstream conditionals which should be constant given the
> assumption are not optimized.  This surprises me.  Am I misunderstanding the
> intended use of unreachable here?
> - "call void @llvm.trap() noreturn nounwind" - With this choice, downstream
> conditionals are optimized away, but the trap instruction is emitted - i.e.
> the assumption is checked at runtime.
>
> The next idea I'm considering is to essentially run the optimizer twice.
> First, I'd emit a call to a custom intrinsic (i.e. a wrapper around trap),
> and run the optimizer.  This would have the effect of removing any
> downstream branches implied by the assumption.  Second, I'd insert a custom
> pass to remove calls to the intrinsic and replace them with unreachable.
> Then I'd rerun the full set of optimizations.
>
> Based on the observed behavior, this should get me the desired result.
> However, this is starting to feel like somewhat of a hack and I wanted to
> get feedback from the community before continuing.
>
> Notes:
> - To clarify terminology, the "optimizer" I refer to above is the standard
> set of passes at "-O3".
> - I haven't investigated which of the various optimization passes are
> responsible for each observed behavior.
> - All of my tests were run on a x86_64 system using a slightly modified
> Clang/LLVM 3.0.  I don't believe any of the changes are relevant.  If I need
> to rerun my tests using 3.1, let me know.
>
> Yours,
> Philip Reames
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev