[llvm-dev] Register Allocation Graph Coloring algorithm and Others

Mon Dec 18 14:35:06 PST 2017

Leslie Zhai <lesliezhai at llvm.org.cn> writes:

> But I like to practice and learn :)
> https://github.com/xiangzhai/llvm/blob/avr/lib/CodeGen/RegAllocGraphColoring.cpp#L327because theory is not always correct, or misunderstood by people, so I
> want to compare solutionByHEA, IRA, Greedy, PBQP and other algorithms.

That is a very good way to learn.  Learn by doing and observing how
results change as parameters vary.  You will never stop learning.  :)

> Thanks for your lessons to correct my mistakes, such as memory=bad
> register=good, and I need to find the answer *when* memory is worse
> than a register, I am maintaining AVR target, there are 32 general
> registers, 32K flash, 2K sram
> http://www.atmel.com/Images/Atmel-42735-8-bit-AVR-Microcontroller-ATmega328-328P_Datasheet.pdfso perhaps to MCU, memory might be expensive than register? but what
> about AMDGPU or VLIW processors? I don't have experienced on them,
> please teach me.

I do not have much experience with those architectures either.  As I
said, the "best" algorithm for register allocation is very
target-dependent.  What works well on AVR might work very poorly on a
GPU.  The only way to know is to test, test, test.  Of course one can
make some educated guesses to narrow the amount of testing.  Many times
a "good" allocator is "good enough" on many targets.  I work for a
company that tries to squeeze every last bit of performance out of
codes.  We're a bit fanatical that way so we try lots of things.  Most
places aren't that obsessive.  :)

> I am reading LLVM's code SpillXXX, LiveRangeXXX, RegisterCoalescer,
> etc. to get the whole view of CodeGen.

Those are great places to learn about register allocation!  They can
also be complicated and a bit daunting.  The folks on the LLVM list can
help guide you but you will also do well just making observations,
stepping through with a debugger, etc.  I certainly don't claim to
understand all of the nuances in this code.  Lots of people have
contributed to it over the years.

> I am reading Dr. Rhydian Lewis's book: A Guide to Graph Colouring:
> Algorithms and Applications
> http://www.springer.com/us/book/9783319257280   and other papers, even
> if HEA is not the best solution, I still want to practice and see the
> benchmark, I am not computing professionals, I am only 34 olds,
> perhaps I have enough time to waste :)

I am not familiar with that book but lots of reading will do you well.
There's an endless supply of papers to look at.  And practice, practice,
practice.  You seem to be on the right track!

                             -David

> 在 2017年12月19日 00:41, dag at cray.com 写道:
>> Leslie Zhai <lesliezhai at llvm.org.cn> writes:
>>
>>> * Memory (20 - 100 cycles) is expensive than Register (1 cycle), but
>>> it has to spill code when PhysReg is unavailable
>> As Vladimir said, the cache makes this kind of analysis much more
>> tricky.  It's not necessarily the case that memory=bad and
>> register=good.  Since there are tradeoffs involved, one needs to
>> evaluate different strategies to determine *when* memory is worse than a
>> register.  It may very well be the case that leaving something in memory
>> frees up a register for something much more important to use it.  All of
>> the register allocation algorithms try to determine this kind of thing
>> through various heuristics.  Which heuristic is most effective is highly
>> target-dependent.
>>
>> In my experience, changing heuristics can swing performance 20% or more
>> in some cases.  Today's processors and memory systems are so complex
>> that 2nd- and even 3rd-order effects become important.
>>
>> It is very, very wrong on today's machines to use # of spills as a
>> metric to determine the "goodness" of an allocation.  Determining *what*
>> to spill is much more important than the raw number of spills.  Many
>> times I have a seen codes generated with more spills perform much better
>> than the code generated with fewer spills.  Almost all of the papers
>> around the time of Chaiten-Briggs used # of spills as the metric.  That
>> may have been appropriate at that time but take those results with giant
>> grains of salt today.  Of course they are still very good papers for
>> understanding algorithms and heuristics.
>>
>> The best way I know to determine what's best for your target is to run a
>> whole bunch of *real* codes on them, trying different allocation
>> algorithms and heuristics.  It is a lot of work, still worthy of a
>> Ph.D. even today.  Register allocation is not a "solved" problem and
>> probably never will be as architectures continue to change and become
>> ever more diverse.
>>
>>> * Folding spill code into instructions, handling register coallescing,
>>> splitting live ranges, doing rematerialization, doing shrink wrapping
>>> are harder than RegAlloc
>> Again, as Vladimir said, they are all part of register allocation.
>> Sometimes they are implemented as separate passes but logically they all
>> contribute work to the task of assigning registers as efficiently as
>> possible.  And all of them use heuristics.  Choosing when and when not
>> to, for example, coalesce can be important.  Splitting live ranges is
>> logically the opposite of coalescing.  Theoretically one can "undo" a
>> bad coalescing decision by re-splitting the live range but it's usually
>> not that simple as other decisions in the interim can make that tricky
>> if not impossible.  It is a delicate balancing act.
>>
>>> * The papers by Briggs and Chaiten contradict[2] themselves when
>>> examine the text of the paper vs. the pseudocode provided?
>> As others have said, I don't recall any inconsistencies but that doesn't
>> mean there aren't bugs and/or incomplete descriptions open to
>> interpretation.  One of my biggest issues with our computer academic
>> culture is that we do not value repeatability.  It is virtually
>> impossible to take a paper, implement the algorithm as described (or as
>> best as possible given ambiguity) and obtain the same results.  Heck,
>> half my Ph.D. dissertation was dissecting a published paper, examining
>> the sensitivity of the described algorithm to various parts of the
>> described heuristic that were ambiguous.  By interpreting the heuristic
>> description in different ways I observed very different results.  I read
>> papers for the algorithms, heuristics and ideas in them.  I pay no
>> attention to results because in the real world we have to implement the
>> ideas and test them ourselves to see if they will help us.
>>
>> Peter is right to point you to Preston.  He is very accessible, friendly
>> and helpful.  I had the good fortune to work with him for a few years
>> and he taught me a lot.  He has much real-world experience on codes
>> actually used in production.  That experience is gold.
>>
>> Good luck to you!  You're entering a world that some computing
>> professionals think is a waste of time because "we already know how to
>> do that."  Those of us in the trenches know better.  :)
>>
>>                                 -David