<font size=2 face="sans-serif">Hi Andrew,  <font size=2 face="sans-serif">Actually, my main concern was the spilling.

My original text was probably not very clear. Sorry about the confusion.

</font><br><br><font size=2 face="sans-serif">What I wanted to point out was exactly

what Matthias mentioned in his first email. The Pending queue is not considered

in tryCandidate(), so we can end up scheduling a large number of similar instructions

back to back, which significantly increases register pressure. In my example these were loads.

The scheduler will take all the loads and schedule them together because

they all have the same latency and so are all in the Available queue. I

agree that in theory this is a good idea; however, in our case each load

uses up another register, and we can technically end up with more consecutive

loads than there are physical registers available. If we considered register

pressure, we would start by scheduling the loads together, but after

a few loads we would be forced to schedule a div to release some of the register

pressure. </font><br><br><font size=2 face="sans-serif">Having said that, I tried your suggestion

of setting <tt><font size=2>MicroOpBufferSize=1</font></tt> <font size=2 face="sans-serif">and it certainly seems to improve things.

(At least for the example I was looking at.) Thank you for mentioning it.

</font><br><font size=2 face="sans-serif">I'll run a few more tests with that

parameter change and hopefully that will be enough to get what we want.
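<br><br>To make the load example above concrete, here is a toy model of the behaviour I described. It is only a sketch and not LLVM code: the names ToySchedule and scheduleToyBlock are made up for illustration, and it assumes each load makes one value live, each div is the last use of exactly one loaded value, and a load always wins over a div unless a pressure limit has been reached (mimicking the divs sitting in the Pending queue).

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Outcome of scheduling the toy block.
struct ToySchedule {
  std::string order;  // 'L' = load, 'D' = div, in issue order
  int peakLive = 0;   // maximum number of loaded values live at once
};

// Toy model only (hypothetical, not the LLVM MachineScheduler).
// The block has numLoads independent loads and numLoads divs; div i
// consumes the value produced by load i.
// Assumptions: a load adds one live value, a div removes one, and the
// picker prefers a load whenever one is left and fewer than
// pressureLimit loaded values are currently live.
ToySchedule scheduleToyBlock(int numLoads, int pressureLimit) {
  assert(numLoads >= 0 && pressureLimit >= 1);
  ToySchedule s;
  int loadsIssued = 0;
  int divsIssued = 0;
  int live = 0;  // always equals loadsIssued - divsIssued
  while (divsIssued < numLoads) {
    if (loadsIssued < numLoads && live < pressureLimit) {
      // Below the limit: keep clustering loads.
      ++loadsIssued;
      ++live;
      s.order += 'L';
    } else {
      // Limit reached or no loads left: at least one loaded value is
      // live here, so a div is ready and frees one register.
      ++divsIssued;
      --live;
      s.order += 'D';
    }
    s.peakLive = std::max(s.peakLive, live);
  }
  return s;
}
```

Calling scheduleToyBlock(6, 1000) models a picker that effectively ignores pressure: all six loads are issued first and six values are live at once. Calling scheduleToyBlock(6, 3) interleaves a div as soon as three loaded values are live, so the peak stays at three.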

</font><br><br><font size=2 face="sans-serif">Matthias, Andrew, thank you both for

the help. </font><br><font size=2 face="sans-serif"> </font><br><font size=2 face="sans-serif">Stefan  </font><br><BR>