<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Nov 30, 2013 at 1:56 PM, Rafael Espíndola <span dir="ltr">&lt;<a href="mailto:rafael.espindola@gmail.com" target="_blank">rafael.espindola@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">> I would expect conclusive benchmark results on a large corpus of code tested<br>

> on a broad spectrum of microarchitectures from both Intel and AMD before<br>

> making a change like this. Also, the code size impact needs to be<br>

> quantified; having to break out to a whole new instruction for what would<br>

> otherwise be an immediate has the potential for insane code-size increase<br>

> (clang has something like 30,000 cmpb's and it seems like almost all of them<br>

> have immediate memory operands). Also, not being able to fold a load into<br>

> the comparison will likely increase register pressure.<br>

<br>

</div>I tested this with a .bc from an LTO build of clang. The size of .text<br>

goes from 35468517 bytes to 35589941 bytes, about 0.34% larger (a ratio of 1.0034),<br>

which seems like pretty small code growth.<br></blockquote><div><br></div><div>Thanks for looking into it. Your results seem to confirm Chandler's code size measurements.</div><div><br></div><div>-- Sean Silva</div><div>

 </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Cheers,<br>

Rafael<br>

</blockquote></div><br></div></div>