[llvm] r204690 - Register Allocator: check other options before using a CSR for the first time.

Duncan P. N. Exon Smith dexonsmith at apple.com
Mon Apr 7 11:03:18 PDT 2014


On 2014 Apr 7, at 12:18, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:

> 
> On 2014 Apr 4, at 22:46, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
>> On 2014 Mar 24, at 17:16, Manman Ren <manman.ren at gmail.com> wrote:
>> 
>>> Author: mren
>>> Date: Mon Mar 24 19:16:25 2014
>>> New Revision: 204690
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=204690&view=rev
>>> Log:
>>> Register Allocator: check other options before using a CSR for the first time.
>>> 
>>> When register allocator's stage is RS_Spill, we choose spill over using the CSR
>>> for the first time, if the spill cost is lower than CSRCost. 
>>> When register allocator's stage is < RS_Split, we choose pre-splitting over
>>> using the CSR for the first time, if the cost of splitting is lower than
>>> CSRCost.
>>> 
>>> CSRCost is set with command-line option "regalloc-csr-first-time-cost". The
>>> default value is 0 to generate the same code as before this commit.
>>> 
>>> With a value of 15 (1 << 14 is the entry frequency), I measured performance
>>> gain of 3% on 253.perlbmk and 1.7% on 197.parser, with instrumented PGO,
>>> on an arm device.
>>> 
>>> rdar://16162005
>>> 
>>> Added:
>>> llvm/trunk/test/CodeGen/AArch64/ragreedy-csr.ll
>>> Modified:
>>> llvm/trunk/lib/CodeGen/RegAllocGreedy.cpp
>> 
>> Hi Manman,
>> 
>> This commit relies on the entry frequency being 1<<14, but the API
>> of BlockFrequencyInfo does not guarantee anything about the entry
>> frequency.  While the current implementation happens to use 1<<14,
>> the patch I'm working on sets it based on how branchy a particular
>> function is, which causes this test to fail.
>> 
>> The cost value (e.g., 15) needs to be compared to a ratio between
>> the actual entry frequency and the block frequency in question.
>> 
>> I'm happy to fix this, but there are a couple of possible directions
>> and I'm not sure which is best.
>> 
>> 1. Keep the meaning (and name) of CSRCost and
>>   -regalloc-csr-first-time-cost unchanged.  This requires scaling
>>   the cost by the ratio of entry frequency to 1<<14.
> 
> After looking at the code it's interfacing with, it looks like option
> (1) is superior.  I've attached a patch that implements that.
> 
> However, the test is still failing after the patch.  Digging deeper, the
> BFI pass in trunk assigns block frequencies of 0 to the false branches
> (e.g., cond.false.i.i) due to loss of precision.  Somehow this (or another
> change?) changes the control flow in calcGlobalSplitCost().  In particular,
> the loop through Cand.ActiveBlocks finds that upper.exit has no RegIn, so
> its relatively large block frequency gets added to GlobalCost.
> 
> I'm still looking into it, but let me know if you have any insight.

Tracked it down.  SpillPlacement.cpp has a Threshold that should also be
scaled.

I'll send a review in a new thread that fixes both.
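
For reference, here is a minimal sketch of the scaling that option (1)
implies.  It is not the code from the patch: the names scaleCostToEntryFreq
and AssumedEntryFreq are hypothetical, and the only assumption taken from the
thread is that the cost was tuned against an entry frequency of 1<<14.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Entry frequency the cost was originally tuned against (from the thread).
constexpr uint64_t AssumedEntryFreq = uint64_t(1) << 14;

// Hypothetical helper: rescale Cost so that its ratio to ActualEntryFreq
// matches the ratio it had to 1 << 14.  Computes
// floor(Cost * ActualEntryFreq / 2^14), saturating instead of overflowing.
uint64_t scaleCostToEntryFreq(uint64_t Cost, uint64_t ActualEntryFreq) {
  // Assumption of this sketch: the tuned cost is a small command-line value.
  assert(Cost < (uint64_t(1) << 32) && "sketch assumes a small tuned cost");

  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  // Split the frequency so the multiply happens before the divide
  // without losing the fractional part.
  const uint64_t Whole = ActualEntryFreq >> 14;
  const uint64_t Rem = ActualEntryFreq & (AssumedEntryFreq - 1);

  if (Whole != 0 && Cost > Max / Whole)
    return Max; // saturate
  const uint64_t Hi = Cost * Whole;
  const uint64_t Lo = (Cost * Rem) >> 14; // cannot overflow: both < 2^32
  return Hi > Max - Lo ? Max : Hi + Lo;
}
```

With an entry frequency of exactly 1<<14 the cost is unchanged, so existing
tuning keeps its meaning.  The same kind of scaling would presumably apply to
any other constant that was implicitly chosen relative to 1<<14.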


