[PATCH] D29092: PR31729: [GVNHoist] Don't hoist unsafe scalars at -Oz

Thu Feb 23 22:24:34 PST 2017

dberlin added a comment.

In https://reviews.llvm.org/D29092#685418, @hiraditya wrote:

> In https://reviews.llvm.org/D29092#685307, @dberlin wrote:
>
> > In https://reviews.llvm.org/D29092#685273, @mehdi_amini wrote:
> >
> > > In https://reviews.llvm.org/D29092#685265, @dberlin wrote:
> > >
> > > > (Though i quibble with this part:
> > > >
> > > >   // The called function could have undefined behavior or
> > > >   // side-effects, even if marked readnone nounwind.
> > > >   
> > > >
> > > > I'm not sure what we think those side effects are, but we should be modeling them or ignoring them.
> > >
> > >
> > > Divide by zero?
> > >
> > > The only current solution we have for this now is to be conservative with respect to calls. There is a new attribute implemented here https://reviews.llvm.org/D20116 that will help making it "less conservative", by allowing to speculate if there can't be UB whatever arguments are passed to the function. 
> > >  This wouldn't capture this for instance:
> > >
> > >   // can be speculated if we know that the argument is never 0
> > >   int inverse(int n) { return 1/n; }
> > >
> > >
> > > From your GCC experience, do you have a suggestion how to address/model this?
> >
> >
> > First:
> >
> >   int __attribute__((noinline))   inverse(int n) { return 1/n; }
> >    int main(int argc)
> >    {
> >  
> >     int a = 0;
> >     for (int i = 0; i < 5; ++i)
> >         a+= inverse(argc);
> >   return 0
> >   }
> >  
> >
> >
> > Is optimized to return 0 at all opt levels (ie it does not consider it enough of a side effect to keep it alive)
> >  (it's the same whether it's inlined or not)
> >  If you change it to return a, you get:
> >
> >   main (int argc)
> >    {
> >      int i;
> >      int a;
> >      int _16;
> >      _16 = inverse (argc_5(D));
> >      # a_14 = PHI <a_7(3), 0(2)>
> >      # i_15 = PHI <i_8(3), 0(2)>
> >      a_7 = a_14 + _16;
> >      i_8 = i_15 + 1;
> >      if (i_8 != 5)
> >        goto <bb 3>;
> >      else
> >        goto <bb 4>;
> >      return a_7;
> >  
> >    }
> >  
> >
> >
> > Note the hoist.
> >
> > It then unrolls the loop
> >  resulting in:
> >
> >   main (int argc)
> >    {
> >      int a;
> >      int _16;
> >  
> >      _16 = inverse (argc_5(D));
> >      a_2 = _16 * 5;
> >      return a_2;
> >  
> >    }
> >  
> >  
> >
> >
> > If you change it to something it can't prove always executes, it still hoists it to the pre-header, just not above loop condition (obviously).
> >  So i guess my answer is: We are being too conservative here.
>
>
> Since gcc's gvn hoist works in tandem with gvn-pre, the hoist of a call which occurs only on one branch may have been done by gvn-pre. IIUC gcc's gvn-hoist only hoists instructions which are anticipable (IN) but not available (OUT), that too with basic blocks with > 1 successors (ref: gcc/tree-ssa-pre.c:3499). It is unlikely gvn-hoist is hoisting the call in this example. Another limitation of gcc's gvn-hoist is that it only hoists when at least one immediate successor has the instruction in avail_out. So even if an expression is very busy it may not get hoisted. I understand that this allows them to
>  not have any cost model for hoisting too far but code-size wise there will be missed opportunities.

FWIW, as i mentioned offline, these are trivially fixable without a complex algorithm.
They have simply chosen not to because it has not been worth it, AFAIK.

> I agree the design of llvm's gvn-hoist  is not very clean mostly because it went multiple iterations,

Gotta be honest:
At some point, you have to take the code that was a prototype, learn from it, and then write a clean version that is sane.

NewGVN went through a ton of iterations and ugly ass code, as chandler can tell you. At some point, we just sat down, said "okay, this is probably a *design* that works, now how do we make it look at least mildly sane".

> your ideas helped very much to write an optimistic algorithm and incorporate Memory SSA.

https://reviews.llvm.org/D29092