<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Dec 18, 2017, at 8:16 PM, Vladimir Makarov via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">On 12/18/2017 07:07 PM, Michael Clark wrote:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">Hi Leslie,<br class=""><br class="">I suggest adding these 3 papers to your reading list.<br class=""><br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>Register allocation for programs in SSA-form<br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>Sebastian Hack, Daniel Grund, and Gerhard Goos<br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span><a href="http://www.rw.cdl.uni-saarland.de/~grund/papers/cc06-ra_ssa.pdf" class="">http://www.rw.cdl.uni-saarland.de/~grund/papers/cc06-ra_ssa.pdf</a><br class=""><br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>Simple and Efficient Construction of Static Single Assignment Form<br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>Matthias Braun , Sebastian Buchwald , Sebastian Hack , Roland Leißa , Christoph Mallon , and Andreas Zwinkau<br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>https://www.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf<br class=""><br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>Optimal register allocation for SSA-form programs in polynomial time<br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>Sebastian Hack, Gerhard Goos<br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>http://web.cs.ucla.edu/~palsberg/course/cs232/papers/HackGoos-ipl06.pdf<br class=""><br class="">A lot of the earlier literature regarding the register allocation problem describes the general graph colouring problem as NP-complete, however previous research in Register Allocation has been in the context of programs that were not in SSA form. i.e. the Chaitin-Briggs paper states that register allocation is NP-complete and proposes an iterative algorithm.<br class=""><br class=""><br class=""></blockquote><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">I am sceptical about SSA-register allocators for high quality code generation. But I should acknowledge although I tried a lot of RA algorithms I never tried SSA one.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">One task of RA is to deal with moves but when you are destroying SSA you are creating moves or swaps. Of course there are optimizations to decrease its number when you are going out of SSA. But IMHO a good RA should see all moves created before or during own work.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">As I already wrote coloring is just one out of many task of RA. All these tasks in a good RA should be solved in cooperation. Optimistic coloring is good and fast enough (although building a conflict graph for it takes a lot of time). I think better results can be obtained by improving other tasks than by improving optimistic coloring.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">For example, nobody mentioned irregular file architectures (with subregisters and register subclasses) like x86/x86-64. Even regular file architectures with caller/callee saved hard registers have irregularity because it creates register subclasses, e.g. class of saved general registers is a subclass of the overall general register class. Usually hard registers are present in IR and if some pseudos conflict with the hard registers, it also create a lot (intersected) subclasses.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><br class=""><div>I’m not sure this is the best place to start a discussion about register allocation in general, but I’ll bait :)</div><div><br class=""></div><div>Having worked with SSA regalloc in the past (and with LLVM now) I have to defend it: You can (and should) apply post-realloc copy coalescing algorithms for SSA allocators. The additional copies are a good thing, they are the reason for a less constrained graph with reduced register pressure meaning you can get by with less spills and reloads. Trading spills/reloads for copies is a good deal on most modern architectures.</div><div><br class=""></div><div>And I would take your statement that “a good RA should see all moves created before or during own work” as an endorsement of SSA regallocs: PHIs are just another form of COPY and very visible and SSA allocators cannot do pre-RA coalescing (of the phi induced copies) so they will see all the copies.</div><div><br class=""></div><div>SSA regalloc can handle classes/constraints just fine (you need to insert some more copies, but that is also true for most traditional reallocs). You are right in that SSA regalloc cannot directly deal with sub registers. I implemented support for that via a pre-pass in the past that merges subvalues before regalloc to get a regular regalloc problem again, which may be a bit tricky to pull off when your target has additional register constraints at the same time (which wasn’t the case for me). But for typical CPUs SSA regalloc is just fine: Libfirm where a lot of the SSA regalloc work happened sports high quality x86/x86_64/sparc code generators.</div><div><br class=""></div><div>And I also agree that coloring is only part of the problem and while it is the most researched talked about, I would also say that in practice spill code placement, tweaks in your representation, how to deal with two address code and other constraints, how good your dematerialization is etc. has a bigger impact than your particular choice of coloring algorithm. And for regular architectures I found SSA to make things easier :)</div><div><br class=""></div><div>- Matthias</div></div></body></html>