<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Mar 25, 2014 at 7:24 AM, Rafael Espíndola <span dir="ltr"><<a href="mailto:rafael.espindola@gmail.com" target="_blank">rafael.espindola@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="">On 25 March 2014 09:49, Dan Gohman <<a href="mailto:dan433584@gmail.com">dan433584@gmail.com</a>> wrote:<br>


> Hi Lang,<br>

><br>

> I can reproduce the performance regression on fourinarow, at least. With the<br>

> patch, the code size and static instruction count of the benchmark's one<br>

> embarrassingly hot function is lower, the dynamic instruction count is lower,<br>

> and the stack frame is smaller, but it still runs slower. Instruction<br>

> selection is basically the same, except that there are fewer cmovs. There<br>

> appears to be a minor difference in instruction scheduling in the hot<br>

> function. The regression disappeared when I experimented with non-default<br>

> values for -pre-RA-sched. However, I'm not prepared for the adventure of<br>

> changing the instruction scheduler's heuristics at this time, so I'll just<br>

> let this patch go for now.<br>

<br>

</div>Do you have a small .ll testcase?<br></blockquote><div><br></div><div><div>Not handy anymore, but it's just MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow with -O3 -flto on x86-64.<br><br></div><div>Dan<br>

<br></div></div></div></div></div>