[PATCH] Improve the cost evaluation of LSR

Fri May 1 15:50:07 PDT 2015

Hi David,

> On May 1, 2015, at 3:28 PM, Xinliang David Li <xinliangli at gmail.com> wrote:
> 
> 
> 
> On Fri, May 1, 2015 at 10:30 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Wei,
> 
> The short story is: I do not think this is the way to go.
> 
> Now, the long story.
> 
> I have mixed feelings on the direction of the approach. On one hand, I also think we should optimize for performance as long as register pressure is not a problem. On the other hand, the register pressure estimate at this level is too rough to make any useful decisions.
> 
> That is a pretty strong statement (about usefulness of register pressure estimate) :)   I think it is probably reasonably good for cases where pressure is very low and very high -- and this part can be improved independently. 

I took an extreme position to emphasize that the register pressure estimate is broken right now. I am afraid we will chase improvements/regressions based on luck if we pursue that direction without further evidence. Eventually we may reach something reasonable, but in the meantime the situation will be uncomfortable. I would need to do some gymnastics to keep the existing improvements while moving toward the right solution. That is why I was proposing to first see how bad we are if we keep it simple; then we can move toward a better model.

The register pressure estimation may really make sense, but if that is the case I believe we should make it an analysis pass, to at least unify it with what we are doing in the loop vectorizer.
Like I said, the current pressure estimate is broken:
1. The estimation is based only on the number of registers required to materialize the formulae, again IIRC. Therefore, if that number exceeds the number of available registers, then yes, we will exceed the register budget. But when it does not, we cannot say anything, since we do not consider the code in the loop body.
2. The number of registers given by the TTI is really rough. For instance, it does not account for aliasing between scalar and vector registers, and it sees float and int as part of the same register class.
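
To make point 1 concrete, here is a toy sketch in C++. It is not LSR's actual cost code: the struct, the function names, and the idea of a single flat register count are all assumptions made up for illustration. It only shows why an estimate that counts the formulae alone can report "no spills" for a loop that will in fact spill.

```cpp
// Hypothetical toy model, NOT the real LSR cost computation.
struct LoopPressure {
  unsigned FormulaRegs;   // registers needed to materialize the LSR formulae
  unsigned BodyLiveRegs;  // other values live in or through the loop body
};

// Estimate in the spirit of point 1: only the formulae are counted.
// It can prove "too many registers", but never "few enough".
inline unsigned spillsFormulaOnly(const LoopPressure &L, unsigned NumPhysRegs) {
  return L.FormulaRegs > NumPhysRegs ? L.FormulaRegs - NumPhysRegs : 0;
}

// Estimate that also counts the live ranges of the loop body.
inline unsigned spillsWithBody(const LoopPressure &L, unsigned NumPhysRegs) {
  unsigned Total = L.FormulaRegs + L.BodyLiveRegs;
  return Total > NumPhysRegs ? Total - NumPhysRegs : 0;
}
```

With 16 registers available, a loop described as `LoopPressure{4, 14}` looks fine to the formula-only estimate, while the estimate that includes the body sees two values that do not fit.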

> 
> Your current approach illustrates this point. Indeed, IIRC NumRegs only gives you the number of registers you need to materialize the formulae; we do not consider how many registers we already need in the loop or through the loop. Therefore, I believe by tweaking the body of the loop in your motivating example (i.e., adding just enough live ranges), we can bloat the register pressure with the new rating and have spills within the loop, whereas we wouldn’t with the previous rating.
> 
> If we don't want spills, but spills occur, then it is a matter of tuning the register pressure estimation -- it can even be made more conservative. In fact, if we consider spills in the overall cost, we can even deliberately allow some spills to reduce overall cost (but that requires a very precise pressure estimate).

Sure.

>  
> 
> I also mentioned that in the related PR, but I believe the way to go is not to care about register pressure and just rate the cost of the loop body. However, this implies the backends are able to recover from the register pressure bloat, and I believe we are not quite there.
> 
> 
> The cost should include 'potential' spills due to register pressure.
> 
>  
> *How do we move?
> 
> I would suggest we add an internal option to make LSR more aggressive w.r.t. register pressure, and fix all the problems that arise in the backends. Then, we can turn that option on by default.
> 
> We want to generate the optimal code sequence with minimal cost --- that is not equivalent to 'the most aggressive LSR'.   Do we already know possible ways to fix the problem once the damage is already done (high reg pressure …)?

That is the point: we do not know, and to me that would be the first information we should seek to determine the best direction. My gut says we may not be able to recover and a register pressure estimation would indeed come in handy, but I like facts :).

>  
> 
> - What if other people still believe this is the right way to move?
> 
> Like I said, I do not think this is the right way to go. Now, if other people believe it is, I would at least expect that you supply more detailed numbers. In particular, what are the actual numbers (not just the geometric means), what are the regressions, and why do we not care about them or how do we plan to fix them?
> 
> I suggest Wei refactor the change so that it can be turned on/off with an option. When that is ready, the community can help with the performance testing on their favorite platforms with their favorite benchmarks. It may turn out to be better for everyone, who knows :)

If we move toward this heuristic, I really suggest we make it a pass or utility to provide register pressure estimation at the IR level. The loop vectorizer and also probably GVN, LICM, etc. could use such information.

Cheers,
-Quentin

> 
> David
> 
> 
>  
> 
> Cheers,
> -Quentin
> 
> 
> REPOSITORY
>   rL LLVM
> 
> http://reviews.llvm.org/D9429