<html><body><div style="color:#000; background-color:#fff; font-family:arial, helvetica, sans-serif;font-size:12pt"><div>Hi,</div><div><br></div><div>I am currently writing a paper 

documenting a research project that we have done on pre-allocation 

instruction scheduling to balance ILP and register pressure. In the 

paper we compare the pre-allocation scheduler that we have developed to 

LLVM's default schedulers for two targets: x86-64 and x86-32. We would 

like to include in our paper some brief descriptions of the two LLVM 

schedulers that we are comparing against and some information about the 

machine model that they are scheduling for.  So, it would be great if 

you could confirm or correct the following information and answer my 

questions below:<br></div><div><br> </div><div>The default scheduler for

 the x86-32 target is the bottom-up register-pressure reduction (BURR) 

scheduler, while for the x86-64 target it is the ILP Scheduler. 

According to the

 brief documentation in the 

source file ScheduleDAGRRList, the BURR is a register pressure reduction

 scheduler, while the ILP is a register-pressure aware scheduler that 

tries to balance ILP and register pressure. <br>

<br>My questions are:</div><div><br></div><div>- Are there any references (such as published research) that describe either of these scheduling algorithms? <br></div><div><br></div><div>-

 By examining the source code, it appears that neither scheduler has a 

machine model describing the functional units and the mapping of 

instructions to functional units on the target x86 machine. Is that 

right?</div><div><br></div><div>- Based on the test cases that I have 

analyzed, it appears that the BURR scheduler sets all latencies to 1, 

which essentially eliminates any scheduling for ILP and makes scheduling

 for register pressure reduction the only objective of this scheduler. 

Can you please confirm or correct this?</div><div><br></div><div>-

 Again based on analyzing test cases, it appears that the ILP scheduler 

sets the latencies of DIV and SQRT (both INT and FP) to 10, while the 

latencies of all other instructions are set to 1. Can you please 

confirm or correct

 this observation?</div><div><br></div><div>Apparently, the developers 

of the ILP scheduler assumed that this rough latency model would be 

sufficient to do ILP scheduling on the x86 target, because the x86 

hardware has a good dynamic scheduler. Our testing, however, shows that 

this is the case for most but not all programs. For one particular 

benchmark with a high degree of ILP, using more precise latency info 

significantly improved performance. Would the LLVM developers be 

interested in adding more precise latency info for the x86 target?</div><div><br></div><div>Thank you in advance!<br></div><div><br></div><div>-Ghassan<br><br></div></div></body></html>