[LLVMdev] How best to represent assume statements in LLVM IR?

Tue Oct 2 14:55:28 PDT 2012

On Tue, 02 Oct 2012 14:43:51 -0700
Philip Reames <listmail at philipreames.com> wrote:

> On 10/02/2012 12:52 AM, Duncan Sands wrote:
> > Hi Philip,
> >
> >> Does anyone have any suggestions on how to best represent an
> >> assumption statement(*) in IR?
> >
> > good question!  There have been various attempts, for example Nick
> > tried teaching the optimizers to not prune the branch to
> > unreachable in
> >
> >    br %cond, label %assumption_holds, %assumption_doesnt_hold
> > assumption_doesnt_hold:
> >    unreachable
> >
> > This then leads to %cond being replaced with true everywhere
> > downstream, which
> > is good.  Unfortunately it also causes a bunch of other
> > optimizations to not
> > occur, and the overall result was not a win.
> >
> > Rafael added "range" metadata to the IR, however it is only for
> > loads. If it
> > could also be attached to the definition of %cond, saying that the
> > value is 1,
> > and various places are taught to understand that, then that might be
> > another
> > possibility.
> It really seems to me that the core problem is that the assumption 
> branch is removed before the information it contained could be
> leveraged to remove downstream code.  If the "unreachable" is
> replaced with a call to an external function marked noreturn (i.e. a
> trap like function), the optimizer does correctly optimize downstream
> code.  This seems to be a classic ordering problem.
> 
> Note: I haven't studied how the downstream code is optimized.  It
> seems odd to me that the information is lost when the branch is
> lost.  I may dig into what's causing the analysis information to be
> dropped - there might simply be a bug somewhere along the way.
> 
> Nick's approach seems to tackle the ordering problem by preventing
> the optimization of the unreachable branch.  As you pointed out, this
> is undesirable since we don't want the code to calculate the
> assumption condition to be retained.
> 
> I'm not quiet clear on how Rafael's solution addresses the problem at 
> hand, so I won't address that.
> 
> I've gone ahead and implemented my hack solution - use an external 
> function to force downstream optimization, replace it with
> unreachable and then re-optimize.  At least for the trivial examples
> I'm testing with, this seems to get the desired behavior.  The
> downside of course is that it requires the all of the other passes to
> be run twice.

How about having an analysis pass that turns these things into SCEVs
that can be associated as predicates with each downstream instruction?
These can stick around so long as this analysis is not invalidated
(this would imply some infrastructure work, but the work would be
incremental in nature).

> 
> If anyone is interested, I can throw the (nearly trivial) code up on 
> github for anyone else who wants it.

Can't hurt :)

 -Hal

> 
>  > However I think people would rather only have such
> > metadata be
> > attached to "inputs" to a function: function parameters, loads from
> > memory etc.
> > In your case, is the assumption usually about a function parameter?
> For the initial examples I'm considering - specifically object 
> invariants and function preconditions - all assumptions are about 
> function parameters.  However, I can think of some examples where
> having the ability to note an assumption about an arbitrary
> intermediate value might be useful.  If I can, I'd like to not
> exclude that case.
> 
> One such example (for a language with array bounds checks):
> int n = external_func_compute_loop_bound();
> assume n < array.size();
> for(int i = 0; i < n; i++) {
>    process(arrray[i]);
> }
> 
> Philip
> 
> *removed extra text for readability*
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory