[PATCH] Improve the cost evaluation of LSR

Fri May 1 11:46:47 PDT 2015

> On May 1, 2015, at 11:32 AM, Wei Mi <wmi at google.com> wrote:
> 
> Hi Quentin,
> 
> Thanks for explaining your idea to the LSR problem in detail.
> 
>> Now, the long story.
>> 
>> I have mixed feelings on the direction of the approach. On one hand, I also think we should optimize for performance as long as register pressure is not a problem. On the other hand, the register pressure estimate at this level too rough to make any useful decisions.
>> 
> 
> I agree with you, except that the register pressure estimate cannot
> make any useful decisions.
> I understand that the register pressure estimation cannot be very
> precise in an early stage like LSR, but it could still be useful,
> especially when the real register pressure in a loop is very low or
> very high. When the register pressure in a loop is close to the number
> of available registers,  I admit my patch can intend to use more
> register, but the intention is still to reduce instruction number in
> other place, like reducing recurrance adds at the end of loop or
> reducing NumAddParts for a LSRUse. Perf impact in such case will be
> hard to tell, because we may have more spills, but have less add insns
> in the loop at the same time.
> 
>> Your current approach illustrate this point. Indeed, IIRC NumRegs only gives you the number of registers you need to materialize the formulae, we do not consider how many register we already need in the loop or through the loop. Therefore, I believe by tweaking the body of the loop in your motivating example (i.e., adding just enough live ranges), we can bloat the register pressure with the new rating and have spill within the loop, whereas we wouldn’t with the previous rating.
> 
> I hope for the test tweaked, although the new rating may have more
> spills, it will have less add insns. If it is not the case, it is a
> bug in the new rating I should look at. The new rating have no reason
> to increase NumRegs when it cannot reduce InstNumCost.

I agree and I do not think your patch has a bug in this regard :). My point was the register pressure estimate as it is does not make sense NumRegs != Register pressure of the loop, IIRC.

> 
>> 
>> I also mentioned that in the related PR, but I believe the way to go is: not to care on register pressure and just rate the cost of the loop body. However, this implies the backends are able to recover from the register pressure bloat and I believe we are not quite here.
> 
> I agree it is a difficut way to go.
> 
>> 
>> *How do we move?
>> 
>> I would suggest we add an internal option to make LSR more aggressive w.r.t. to register pressure, and fix all the problems that rise in the backends. Then, we can turn that option on by default.
> 
> That is a good suggestion! We can have more testcases then by
> comparing several approaches.
> 
>> 
>> - What if other people still believe this is the right way to move?
>> 
>> Like I said, I do not think this is the right way to go. Now, if other people believe it is, I would at least expect that you supply more details numbers. In particular, what are the actual numbers (not just the geometric means) and what are the regressions, why we do not care or how do we plan to fix them.
> 
> The improvement is: 1% for Ad Delivery, 1.5% for licence place
> detection, 1.5% for object recognition, 3% for a matrix computation
> library. I also tested spec2000 and found almost no perf difference.
> I didn't see significant regressions caused by LSR now. Actually I saw
> at first then I fixed those regressions and collected some testcases.
> There were still some regressions but caused by other side effect
> after analysis. However all the tests are only for x86. I believe it
> will have many regressions on other platforms.
> 
> Thanks,
> Wei.