<div dir="ltr"><div><div>I have an issue that I've been wrestling with for quite some time, and I'm hoping that someone with a deeper understanding of the register allocator can help me with it.<br><br></div>Namely, I am trying to teach the register allocator (RA) to split a live range rather than allocate a callee-saved register (CSR). I've tried a large number of tweaks to the costs (both existing ones and experimental ones that I've added). Despite all of that, I can't get RA to split the live range of vreg15 in the following:<br><span style="font-family:monospace,monospace"><br>
BB#0: derived from LLVM BB %entry<br>
&nbsp;&nbsp;&nbsp;&nbsp;Live Ins: %X3<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg15&lt;def&gt; = COPY %X3; G8RC:%vreg15<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg4&lt;def&gt; = CMPLDI %vreg15, 0; CRRC:%vreg4 G8RC:%vreg15<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg11:sub_32&lt;def,read-undef&gt; = LI 0; G8RC:%vreg11<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;BCC 68, %vreg4, &lt;BB#1&gt;; CRRC:%vreg4<br>
&nbsp;&nbsp;&nbsp;&nbsp;Successors according to CFG: BB#4(0x30000000 / 0x80000000 = 37.50%) BB#1(0x50000000 / 0x80000000 = 62.50%)<br>
<br>
BB#4:<br>
&nbsp;&nbsp;&nbsp;&nbsp;Predecessors according to CFG: BB#0<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;B &lt;BB#3&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;Successors according to CFG: BB#3(?%)<br>
<br>
BB#1: derived from LLVM BB %if.end<br>
&nbsp;&nbsp;&nbsp;&nbsp;Predecessors according to CFG: BB#0<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg6&lt;def&gt; = ADDIStocHA %X2, &lt;ga:@a&gt;; G8RC_and_G8RC_NOX0:%vreg6<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg7&lt;def&gt; = LDtocL &lt;ga:@a&gt;, %vreg6, %X2&lt;imp-use&gt;; mem:LD8[GOT] G8RC_and_G8RC_NOX0:%vreg7,%vreg6<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg8&lt;def&gt; = LWA 0, %vreg7; mem:LD4[@a](tbaa=!3)(dereferenceable) G8RC:%vreg8 G8RC_and_G8RC_NOX0:%vreg7<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg9&lt;def&gt; = CMPLD %vreg8, %vreg15; CRRC:%vreg9 G8RC:%vreg8,%vreg15<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;BCC 68, %vreg9, &lt;BB#3&gt;; CRRC:%vreg9<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;B &lt;BB#2&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;Successors according to CFG: BB#2(0x30000000 / 0x80000000 = 37.50%) BB#3(0x50000000 / 0x80000000 = 62.50%)<br>
<br>
BB#2: derived from LLVM BB %if.then2<br>
&nbsp;&nbsp;&nbsp;&nbsp;Predecessors according to CFG: BB#1<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ADJCALLSTACKDOWN 96, %R1&lt;imp-def,dead&gt;, %R1&lt;imp-use&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg16&lt;def&gt; = COPY %vreg15; G8RC:%vreg16,%vreg15<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;BL8_NOP &lt;ga:@callVoid&gt;, &lt;regmask **LONG LIST**&gt;, %X3&lt;imp-def,dead&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ADJCALLSTACKUP 96, 0, %R1&lt;imp-def,dead&gt;, %R1&lt;imp-use&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ADJCALLSTACKDOWN 96, %R1&lt;imp-def,dead&gt;, %R1&lt;imp-use&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%X3&lt;def&gt; = COPY %vreg16; G8RC:%vreg16<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;BL8_NOP &lt;ga:@callNonVoid&gt;, &lt;regmask **LONG LIST**&gt;, %X3&lt;imp-use&gt;, %X2&lt;imp-use&gt;, %R1&lt;imp-def&gt;, %X3&lt;imp-def&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ADJCALLSTACKUP 96, 0, %R1&lt;imp-def,dead&gt;, %R1&lt;imp-use&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg11&lt;def&gt; = COPY %X3; G8RC:%vreg11<br>
&nbsp;&nbsp;&nbsp;&nbsp;Successors according to CFG: BB#3(?%)<br>
<br>
BB#3: derived from LLVM BB %return<br>
&nbsp;&nbsp;&nbsp;&nbsp;Predecessors according to CFG: BB#1 BB#2 BB#4<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%vreg12&lt;def&gt; = EXTSW_32_64 %vreg11:sub_32; G8RC:%vreg12,%vreg11<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%X3&lt;def&gt; = COPY %vreg12; G8RC:%vreg12<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;BLR8 %LR8&lt;imp-use&gt;, %RM&lt;imp-use&gt;, %X3&lt;imp-use&gt;</span><br><br></div><div>No matter what I do, vreg15 gets a CSR assigned to it, which is suboptimal. What I am trying to accomplish is to split the live range of vreg15 into the paths without the call and the path with the call (BL8_NOP is a call). Then the physical register X3 can be used on the paths <span style="font-family:monospace,monospace">BB#0 -> BB#1 -> BB#3</span> and <span style="font-family:monospace,monospace">BB#0 -> BB#4 -> BB#3</span>, and the value can be copied to a CSR in <span style="font-family:monospace,monospace">BB#2</span>.<br><br></div><div>Without such a split, vreg15 is assigned a CSR for its entire live range, and there is no way to avoid saving and restoring the CSR in the prologue/epilogue.
If one of the two paths without the call turns out to be the hottest path through the function, a lot of cycles are wasted on the save/restore, because the choice RA made prevents us from shrink-wrapping this function.<br><br></div><div>If anyone can offer some ideas on what I should do here, I would truly appreciate it.<br><br></div><div>Nemanja<br></div></div>