[PATCH] D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop

Wed Jul 31 02:24:15 PDT 2019

ebrevnov added a comment.

In D43256#1606387 <https://reviews.llvm.org/D43256#1606387>, @Carrot wrote:

> In D43256#1605737 <https://reviews.llvm.org/D43256#1605737>, @ebrevnov wrote:
>
> > In D43256#1604069 <https://reviews.llvm.org/D43256#1604069>, @Carrot wrote:
> >
> > > In D43256#1602841 <https://reviews.llvm.org/D43256#1602841>, @hans wrote:
> > >
> > > > In D43256#1550842 <https://reviews.llvm.org/D43256#1550842>, @Carrot wrote:
> > > >
> > > > > In D43256#1545519 <https://reviews.llvm.org/D43256#1545519>, @samparker wrote:
> > > > >
> > > > > > Do you have any benchmark numbers to show that this is generally profitable? From our downstream testing, it is not clear that this change is beneficial.
> > > > >
> > > > >
> > > > > We got performance improvement in our internal search benchmark.
> > > >
> > > >
> > > > How does this transformation impact the benchmark when not using profile data?
> > >
> > >
> > > In plain mode we also got performance improvement, the speedup is a little smaller than FDO Mode.
> >
> >
> > I have a general question/comment.  By now it's more or less evident that benefit of optimization heavily depends  on correctness of profile information. That means in general case there is no way to reason about its effectiveness. Thus I believe it should be turned off if there is no profile.
>
>
> The optimization decision is based on profile information, if a real profile is not available, the statically estimated profile information (generated by BranchProbabilityInfo.cpp) is used. So if an unreasonable probability is generated as in your first case, or if an user program has untypical run time behavior than BPI expected, it may make bad decision.
>  Since you have strong concern about the optimization without real profile information, I will restore the old behavior if no real profile information is available.

I took a deeper look at the second failing case and found something interesting.  In fact, there is a profile information for the loop but it's incomplete. For example there is a profile data for the method entry, loop back edge and some other branches in the loop. Only two branches doesn't have profile. (You may wonder how it's possible. Original application is java program where profile typically available but may be unavailable for inlined  methods). I think your heuristic makes wrong decision due to absence of profile for these two branches which were generated from one select instruction over boolean variable. In most cases you really can't statically estimate if boolean is false or true.

I think we need to be conservative and don't account for potential benefit from branches with out representative profile.

Thanks
Evgeniy

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D43256/new/

https://reviews.llvm.org/D43256