<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">For the register class union solution, what I was saying is that if you have the larger union register class, the spilling happens automatically via live-range splitting (i.e., we don’t hit the inline spiller).<div class=""><br class=""></div><div class="">For the iterative regalloc idea (where you may hit the inline spiller problem), you would need to look at AMDGPU. I don’t remember off the top of my head how it works.<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Dec 18, 2019, at 12:39 PM, Hendrik Greving &lt;<a href="mailto:hendrik.greving.smi@gmail.com" class="">hendrik.greving.smi@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Ok, thanks. Except the question was meant slightly differently. Less w.r.t. organizing the register classes, and more w.r.t. implementation. I've noticed, for instance, that when trying to model this straightforwardly by writing a vreg from spills and reading it from fills (not further elaborated here), the spiller can't handle vreg def-use pairs: there are assertions making sure a spill does not have any uses, e.g. see the allDefsAreDead() calls in InlineSpiller.cpp. This made me wonder if this is supported natively at all.</div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Dec 18, 2019 at 12:02 PM Quentin Colombet &lt;<a href="mailto:qcolombet@apple.com" class="">qcolombet@apple.com</a>&gt; wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;" class="">Hi Hendrik,<div class=""><br class=""></div><div class="">This question is a recurring one. 
Check for instance <a href="https://lists.llvm.org/pipermail/llvm-dev/2016-February/095457.html" target="_blank" class="">https://lists.llvm.org/pipermail/llvm-dev/2016-February/095436.html</a> for a related conversation.</div><div class=""><br class=""></div><div class="">What this conversation boils down to is that you can achieve that by providing a larger register class that is the union of the registers normally used and the registers they can be spilled into.</div><div class=""><br class=""></div><div class="">For instance, let's say you have a register class GPR that can be spilled into SPR.</div><div class="">You would create three register classes: GPR, SPR and GPR_union_SPR. GPR_union_SPR is never explicitly used in any real instruction (i.e., it does not appear in any MC description), but it gives regalloc a way to relax the constraints on available registers when doing live-range splitting.</div><div class=""><br class=""></div><div class="">Let's say you have the following code:</div><div class=""><font face="Menlo" class="">V1(gpr) = op</font></div><div class=""><font face="Menlo" class="">… // &lt;— too high gpr pressure</font></div><div class=""><font face="Menlo" class="">= op V1(gpr)</font></div><div class=""><br class=""></div><div class="">What RA will do is first split the live range of V1:</div><div class=""><div class=""><font face="Menlo" class="">V1(gpr) = op</font></div><div class=""><font face="Menlo" class="">V2 = copy V1</font></div><div class=""><font face="Menlo" class="">… // &lt;— too high gpr pressure</font></div><div class=""><font face="Menlo" class="">V3(gpr) = copy V2</font></div><div class=""><font face="Menlo" class="">= op V3(gpr)</font></div></div><div class=""><br class=""></div><div class="">Now, V2 does not need to be constrained to gpr anymore and will end up using GPR_union_SPR. 
So effectively, if there is no GPR available for V2, an SPR will be used and thus V1 will be “spilled” to an SPR.</div><div class=""><br class=""></div><div class="">Disclaimer: The live-range splitting may act up and it may not be as straightforward to apply this solution, but the idea remains valid.</div><div class=""> </div><div class="">AMDGPU does something a bit different IIRC: it basically runs the allocator several times:</div><div class="">- First they allocate GPRs and spill them into SPRs. Since SPR registers are not taken into account during this iteration, there is no issue with creating new live ranges for them during spilling.</div><div class="">- Second they allocate SPRs, and these get spilled to memory.</div><div class=""><br class=""></div><div class="">Cheers,</div><div class="">-Quentin<br class=""><div class=""><br class=""><blockquote type="cite" class=""><div class="">On Dec 17, 2019, at 1:47 PM, Hendrik Greving via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>&gt; wrote:</div><br class=""><div class=""><div dir="ltr" class="">Hello, for an architecture that doesn't have a good way to load/store a given register class to memory, is it easy to spill/fill from another register class instead?<div class="">e.g.</div><div class="">- Having storeRegToStack/loadRegFromStack use a pseudo instruction with an added virtual register operand is not supported (the spill optimization doesn't seem to like this).</div><div class="">- The AMDGPU backend seems to do something similar?</div><div class=""><br class=""></div><div class="">The only way to do it safely seems to be to use the register scavenger to get a temp register, and spill this in eliminateFrameIndex? Is there an obvious way to spill to a register instead? Thanks in advance for any hints</div></div>

</div></blockquote></div><br class=""></div></div></blockquote></div>

</div></blockquote></div><br class=""></div></body></html>