[llvm] r204690 - Register Allocator: check other options before using a CSR for the first time.
Duncan P. N. Exon Smith
dexonsmith at apple.com
Mon Apr 7 11:03:18 PDT 2014
On 2014 Apr 7, at 12:18, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>
> On 2014 Apr 4, at 22:46, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>
>> On 2014 Mar 24, at 17:16, Manman Ren <manman.ren at gmail.com> wrote:
>>
>>> Author: mren
>>> Date: Mon Mar 24 19:16:25 2014
>>> New Revision: 204690
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=204690&view=rev
>>> Log:
>>> Register Allocator: check other options before using a CSR for the first time.
>>>
>>> When register allocator's stage is RS_Spill, we choose spill over using the CSR
>>> for the first time, if the spill cost is lower than CSRCost.
>>> When register allocator's stage is < RS_Split, we choose pre-splitting over
>>> using the CSR for the first time, if the cost of splitting is lower than
>>> CSRCost.
>>>
>>> CSRCost is set with command-line option "regalloc-csr-first-time-cost". The
>>> default value is 0 to generate the same codes as before this commit.
>>>
>>> With a value of 15 (1 << 14 is the entry frequency), I measured performance
>>> gain of 3% on 253.perlbmk and 1.7% on 197.parser, with instrumented PGO,
>>> on an arm device.
>>>
>>> rdar://16162005
>>>
>>> Added:
>>> llvm/trunk/test/CodeGen/AArch64/ragreedy-csr.ll
>>> Modified:
>>> llvm/trunk/lib/CodeGen/RegAllocGreedy.cpp
>>
>> Hi Manman,
>>
>> This commit relies on the entry frequency being 1<<14, but the API
>> of BlockFrequencyInfo does not guarantee anything about the entry
>> frequency. While the current implementation happens to use 1<<14,
>> the patch I'm working on sets it based on how branchy a particular
>> function is, which causes this test to fail.
>>
>> The cost value (e.g., 15) needs to be compared to a ratio between
>> the actual entry frequency and the block frequency in question.
>>
>> I'm happy to fix this, but there are a couple of possible directions
>> and I'm not sure which is best.
>>
>> 1. Keep the meaning (and name) of CSRCost and
>> -regalloc-csr-first-time-cost unchanged. This requires scaling
>> the cost by the ratio of entry frequency to 1<<14.
>
> After looking at the code it's interfacing with, it looks like option
> (1) is superior. I've attached a patch that implements that.
>
> However, the test is still failing after the patch. Digging deeper, the
> BFI pass in trunk assigns block frequencies of 0 to the false branches
> (e.g., cond.false.i.i) due to loss of precision. Somehow this (or another
> change?) changes the control flow in calcGlobalSplitCost(). In particular,
> the loop through Cand.ActiveBlocks finds that upper.exit has no RegIn, so
> its relatively large block frequency gets added to GlobalCost.
>
> I'm still looking into it, but let me know if you have any insight.

Tracked it down. SpillPlacement.cpp has a Threshold that should also be
scaled.

I'll send a review in a new thread that fixes both.
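
For reference, the scaling in option (1) amounts to roughly the sketch below. This is an illustration only, not the attached patch: the names scaleCSRCost and ReferenceEntryFreq are made up, and the saturating behaviour on overflow is an assumption rather than something taken from RegAllocGreedy.cpp.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical constant: the entry frequency the cost option was tuned against.
constexpr uint64_t ReferenceEntryFreq = 1ull << 14;

// Hypothetical helper (not an LLVM API): rescale a cost expressed relative to
// an entry frequency of 1 << 14 so that it is relative to the entry frequency
// the function actually has. Computes floor(Cost * EntryFreq / 2^14) and
// saturates at UINT64_MAX instead of wrapping.
uint64_t scaleCSRCost(uint64_t Cost, uint64_t EntryFreq) {
  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  if (Cost == 0 || EntryFreq == 0)
    return 0;

  // Split EntryFreq as Q * 2^14 + R so no wide intermediate is needed.
  const uint64_t Q = EntryFreq / ReferenceEntryFreq;
  const uint64_t R = EntryFreq % ReferenceEntryFreq;

  // Whole part: Cost * Q, saturating on overflow.
  if (Q != 0 && Cost > Max / Q)
    return Max;
  const uint64_t Whole = Cost * Q;

  // Fractional part: floor(Cost * R / 2^14), computed piecewise so the
  // intermediate products stay below 2^64.
  const uint64_t Frac =
      (Cost / ReferenceEntryFreq) * R +
      ((Cost % ReferenceEntryFreq) * R) / ReferenceEntryFreq;

  if (Whole > Max - Frac)
    return Max;
  return Whole + Frac;
}
```

With an entry frequency of exactly 1 << 14 the cost is unchanged, so a value of 15 keeps its current meaning. If the entry frequency doubles, the scaled cost doubles with it.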