<html><body><div style="color:#000; background-color:#fff; font-family:arial, helvetica, sans-serif;font-size:12pt"><div>Hi,</div><div><br></div><div>We are working on a research project whose objective is developing a pre-allocation scheduling algorithm that achieves the optimal balance between exploiting ILP (hiding latencies) and minimizing register pressure. A prototype of our algorithm has been implemented and integrated into an experimental version of LLVM 2.9. Our algorithm is based on a combinatorial optimization approach, which is naturally slower than heuristic approaches. However, our benchmarking (using SPEC CPU2006) shows that for some (but not all) benchmarks our algorithm can produce significantly faster code with a reasonable increase in compile time if we apply it selectively to the hot spots. Work on generating faster code with less increase in compile time is in progress. <br></div><div><br></div><div>One of the
issues that we are currently trying to resolve is computing a precise register pressure estimate that correlates well with the amount of spill code generated by the register allocator (we are using the default linear scan allocator). Of course, we won't be able to achieve 100% correlation unless we do allocation and scheduling simultaneously, which is not our current goal. Our current goal is limited to enhancing the pre-allocation scheduling phase to achieve the best possible reduction in register pressure without sacrificing ILP. Note that pre-allocation scheduling is done within the basic block while allocation is done globally for the whole function, which makes it even harder to achieve a good correlation.<br></div><div><br></div><div> One factor that is causing our current register pressure estimate to be off is that we cannot properly account for live-in and live-out registers (both virtual and physical). As far as we can tell, LLVM represents
live-in regs with CopyFromReg instrs and live-out regs with CopyToReg instrs. However, it looks like, in a given basic block, LLVM does not generate CopyToReg instrs for regs that are not defined within that block and does not generate CopyFromReg instrs for regs that are not used within the block. This is causing a problem for us, because a precise register pressure estimate has to take into account all live regs at a given point in the block, whether or not they are defined or used in the block itself. Our questions are:</div><div><br></div><div>(1) How can we get information about the registers that are live within a basic block but are not defined within the block? And how can we get that information for regs that are not used in the block?<br></div><div><br></div><div>(2) Can we safely assume that CopyFromReg instrs and CopyToReg instrs are used for the sole purpose of representing live-in and live-out regs? In other words, are there other uses
for these?</div><div>If yes, how do we identify the ones that represent live-in and live-out regs?</div><div><br></div><div>(3) The current physical register limits obtained using <br></div><div>TargetRegisterInfo::getRegPressureLimit()</div><div>seem to be too low (for example, 3 integer regs on x86 in 32-bit mode). Are these good limits to use in our case? If not, how can we get better limits?</div><div><br></div><div>Thank you in advance!</div><div>-Ghassan</div><div><br></div><div><br></div></div></body></html>