[PATCH] Fix PR 23525 - Separate header mass propagation in irregular loops.

Duncan P. N. Exon Smith dexonsmith at apple.com
Fri Jun 12 13:05:04 PDT 2015


> On 2015-Jun-10, at 08:43, Diego Novillo <dnovillo at google.com> wrote:
> 
> 
> > Secondly, I wonder whether this will regress cases we care about if you
> > don't incorporate the entry frequency point.  E.g., your patch would
> > make the frequencies *way* off for the following contrived example:
> >
> >     assert(n > 0);
> >     int i = n - 2;
> >     if (n % 2 == 0)
> >       goto SeventyFive;
> >
> >     for (; i < n; ++i) {
> >       if (i % n == 75) {
> >     SeventyFive:
> >         foo(i);
> >       } else {
> >         bar(i);
> >       }
> >     }
> >
> > whereas (I think) the current in-tree code would get it roughly correct
> > (by fluke?).
> 
> Actually, my patch addresses this problem in particular.  This code is almost exactly the test case in PR 23525. The in-tree code gets it wrong the same way it gets PR 23525 wrong. The call to foo(i) is almost never taken vs the call to bar(i).

Actually, the call to SeventyFive is taken just more than 50% of the
time in this example.

I think you missed how `i` is initialized in my code snippet:

    int i = n - 2;

The `for` loop always runs exactly 2 times.  If `n` is 74, then
`SeventyFive` executes twice, and the `else` is never executed.
Otherwise, `SeventyFive` executes once, an the `else` executes once as
well.

> We get the following frequencies for this code snippet (I wrapped it into a function that I called 100 times).  The left column is current trunk.  The right column is trunk plus my patch.  I redacted the floating point figures to make them more readable:
> 
>    block-frequency-info: foo                   | block-frequency-info: foo
>    - entry: float = 1.0, int = 8388608         | - entry: float = 1.0, int = 8388608
>    - cond.true: float = 0.99, int = 8388600    | - cond.true: float = 0.99, int = 8388600
>    - cond.false: float = 0.00, int = 8         | - cond.false: float = 0.00, int = 8  
>    - : float = 0.0, int = 0                    | - : float = 0.0, int = 0
>    - cond.end: float = 0.99, int = 8388600     | - cond.end: float = 0.99, int = 8388600
>    - if.then: float = 0.49, int = 4152772      | - if.then: float = 0.49, int = 4152772
>    - if.end: float = 0.50, int = 4235827       | - if.end: float = 0.50, int = 4235827
> => - for.cond: float = 2.49, int = 20971499    | - for.cond: float = 2.88, int = 24197884
>    - for.body: float = 1.49, int = 12582899    | - for.body: float = 1.49, int = 12582899
> => - if.then.5: float = 0.02, int = 249991     | - if.then.5: float = 0.02, int = 249991
> => - SeventyFive: float = 2.49, int = 20971499 | - SeventyFive: float = 0.03, int = 288451
>    - if.else: float = 1.47, int = 12332908     | - if.else: float = 1.47, int = 12332908
>    - if.end.7: float = 2.49, int = 20971499    | - if.end.7: float = 4.58, int = 38428163
> => - for.inc: float = 2.49, int = 20971499     | - for.inc: float = 2.49, int = 20971499
>    - for.end: float = 0.99, int = 8388600      | - for.end: float = 0.99, int = 8388600
> 
> I highlighted the key blocks.  In particular, blocks if.then.5 and SeventyFive. In current trunk, if.then.5 is correctly given a very low frequency (i % 75 is almost never 0), but block SeventyFive is given the same frequency as for.cond and for.inc. That's wrong.
> 
> With this patch, we correctly

*incorrectly*

> compute SeventyFive's frequency to a fairly low value.

This is what I was worried about.



More information about the llvm-commits mailing list