[PATCH] RegAlloc: Account for a variable entry block frequency
Manman Ren
manman.ren at gmail.com
Tue Apr 8 11:19:20 PDT 2014
On Mon, Apr 7, 2014 at 8:25 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:
> Thanks for the review!
>
> On 2014 Apr 7, at 20:47, Manman Ren <manman.ren at gmail.com> wrote:
>
> > unsigned calculateRegionSplitCost(LiveInterval &VirtReg,
> > AllocationOrder &Order,
> > - BlockFrequency &BestCost,
> > + const BlockFrequency &AlternateCost,
> > unsigned RAGreedy::calculateRegionSplitCost(LiveInterval &VirtReg,
> > AllocationOrder &Order,
> > - BlockFrequency &BestCost,
> > + const BlockFrequency
> &AlternateCost,
> > unsigned &NumCands,
> > bool IgnoreCSR) {
> > + BlockFrequency BestCost = AlternateCost;
> > --> The above should be separated (is there a reason for these changes?)
>
> Without this change, the argument (sometimes CSRCost) gets modified
> and reset to zero. This becomes a problem when CSRCost is initialized
> only once per MachineFunction. This change prevents that.
>
Got it.
>
> Since no callers check it after the call, a non-const reference was
> strange API anyway.
>
Currently no caller uses BestCost, but the argument BestCost passes in the
best cost
so far and it returns the best cost afterwards.
I am okay with the change. Please commit,
Thanks,
Manman
> I've split it into a separate patch below.
> > + uint64_t FixedEntry = 1 << 14;
> > + if (ActualEntry < FixedEntry)
> > + CSRCost *= BranchProbability(ActualEntry, FixedEntry);
> > + else if (ActualEntry <= FixedEntry * 16)
> > + // Invert the fraction and divide.
> > + CSRCost /= BranchProbability(FixedEntry, ActualEntry);
> > + else
> > + // Can't use BranchProbability in general, since it takes 32-bit
> numbers.
> > + // This path is faster anyway, albeit less precise.
> > + CSRCost = CSRCost.getFrequency() * (ActualEntry >> 14);
> > --> Are we trying to be precise when possible with this if statement?
> Why a fixed 16 in the 2nd condition?
>
> I was prematurely optimizing, and only guaranteeing 4 bits of
> precision so that it "usually" took the fast path. The new patch is
> far less magical:
>
> - it uses UINT32_MAX instead of FixedEntry*16 (probably no slower in
> practice), and
>
> - it uses ActualEntry/FixedEntry instead of ActualEntry>>14 (exact
> same math, but more human-readable).
>
> > Thanks,
> > Manman
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140408/d73e46b3/attachment.html>
More information about the llvm-commits
mailing list