<div dir="ltr">Thanks for digging into this!<div><br></div><div>Some comments inline:<br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Dec 28, 2013 at 4:39 PM, Chandler Carruth <span dir="ltr"><<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">Well, after some more prodding, I dug into this *much* more deeply. I can now explain rather precisely the change to bzip2 which causes this patch to improve performance, and interestingly I no longer think this patch is the correct approach. Read on, because this was fun. However, be forewarned, this is more of a blog post than an email. =]<div>


<br></div><div><div>
<br></div><div>So first, let's all get on the same page about exactly what performance problem I'm seeing this patch address (Jim, I *think* when we talked this was the one you were looking at as well, but lemme know if not, I wasn't 100% certain). Here is a snippet of code produced by Clang with this patch reverted for the really hot inner function in the main loop of bzip2 (I've based all my benchmarks and measurements on v1.0.6, the latest open source release):</div>



<div><br></div><div><a href="http://paste2.org/0YGNdVJV" target="_blank">http://paste2.org/0YGNdVJV</a><br></div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">


<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#2:                                 # %if.end</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rsi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rbp</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%bl</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%bl</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jne</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_1</span></div>

</li></ol></div><div><br></div><div>Now, originally the theory was that cmpb was being slow because if you make it a cmpl it sped up (for me, by around 3.5%). This is what this patch does:</div><div><br></div><div><a href="http://paste2.org/Y7D8cIcV" target="_blank">http://paste2.org/Y7D8cIcV</a><br>


</div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">


<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#3:                                 # %if.end</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rsi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rbp</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebx</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jne</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_1</span></div>

</li></ol><div><font color="#880000" face="InconsolataMedium, monospace"><span style="line-height:16px;white-space:pre-wrap"><br></span></font></div></div>Ok, cool, this goes faster, but *why*? I was thinking it was around a partial register stall, but I didn't really understand where the stall came from. So I tried switching the above code back to 'cmpb':<div>


<br></div><div><a href="http://paste2.org/h0MbL2jM" target="_blank">http://paste2.org/h0MbL2jM</a><br></div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">


<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#3:                                 # %if.end</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rsi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rbp</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebx</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%bl</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jne</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_1</span></div>

</li></ol></div><div><br></div><div>What I didn't expect was that this was just as fast (at least for me on the sandybridge machine I was testing on primarily). This got me thinking about how partial register stalls come to be. You write to one portion of the register, and then it gets merged with whatever was already in the high bytes. Well, there are kind of two places where that could happen in the original example (movb, movb, cmpb): either when we write the partial register with movb, or when we read it with cmpb. For more discussion of this from Intel, see their Optimization Reference Manual section 3.5.2.4.</div>

<div><br></div><div>But because we read it with the 'b' suffix, we don't actually read the high bytes! So we shouldn't be seeing a stall on the read side. So it seems like (at least on sandybrige) there is merging of partial registers on *write* even without a read.</div>
</div></div></blockquote><div><br></div><div>Interesting. This seems contrary to Agner's Sandy Bridge information in section 9.11 of <<a href="http://www.agner.org/optimize/microarchitecture.pdf">http://www.agner.org/optimize/microarchitecture.pdf</a>>  "There is no penalty for writing to a partial register unless there is a later read from a larger part of the same register". Maybe you could share your findings with him?<br>
</div><div><br></div><div>-- Sean Silva</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr"><div><div> Then the movb would be enough to cause the slowdown regardless of how we read it. This at least explains the measured performance shifts I see on my sandybridge. When I take the original assembly for all of bzip2, and I modify just these movb instructions to be movzbl, I see roughly the same performance uplift as we get with ToT and Jim's patch.<br>


</div><div><br></div><div>Ok, so it seems that on sandybridge (and I think haswell but I don't have one handy to test as thoroughly) it is important that when we don't need the high bytes of a register for an operation and we are going to movb or movw into it, we use movzbl or movzwl (resp.) to remove the false dependency at write time on the high bytes. I think that would accomplish what this patch set out to do, but in a much more targeted way. It will also catch many more of these issues because movb has the potential to cause this, not just those feeding 'cmp'.</div>


<div><br></div><div>Also, the measurements I dave done indicate that once we do the above, this patch should be reverted. I can't find any performance problems with either 8-bit cmpb (or 16-bit cmpw other than the 1-byte prefix cost in encoding), and using the more precisely sized cmp instructions lets us fold one of the two memory operands above into the cmp instruction. One thing I'm still confused by is why the original code didn't do that fold.</div>

<div><br></div><div>Here is what I would intuitively expect:</div>
<div><br></div><div><a href="http://paste2.org/KgLfPD8g" target="_blank">http://paste2.org/KgLfPD8g</a></div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">

<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#2:                                 # %if.end</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebx</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rsi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">        </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">  </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline">(</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline">,</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%rbp</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline">),</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit"> </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">        </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(0,0,255)">cmpb</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">    </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%al</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline">,</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit"> </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline">(</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline">,</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%rbx</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline">)</span><br>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">        </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(0,0,255)">jne</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">     </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_1</span></li>

</ol><br>Note that I started from the original version (without this patch) and manually applied this (and all of the other stuff I'm about to discuss). I'm also re-register allocating here. It'll become more obvious later why. =] There is no effective change. we're now loading only one side of the comparison into a register, zext-ing it to ensure no false dependency, and directly pulling the other byte from memory.</div>


<div><br></div><div>However, there is a lot more wrong. Let's start off with the simple and obvious stuff. Why are we taking up two more registers and doing lea's just to add a displacement to the addresses? The addressing modes using the results have an empty displacement! So now we have this:</div>


<div><br></div><div><div><a href="http://paste2.org/mXZmU0df" target="_blank">http://paste2.org/mXZmU0df</a></div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">

<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#2:                                 # %if.end</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rsi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdi)</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jne</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_1</span></div>

</li></ol></div><div><br></div><div>Cool, this looks really nice now, and benchmarking confirms it. But before we can actually do either of these transforms, we'll also have to fix how this is being used. Time for a more complete example:</div>


<div><br></div><div><a href="http://paste2.org/MjW043aZ" target="_blank">http://paste2.org/MjW043aZ</a><br></div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">


<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#2:                                 # %if.end</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">leal</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rsi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rbp</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%bl</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%bl</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jne</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_1</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">
</div></li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># ... snip ...</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">
</div></li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(160,160,0)">.LBB27_1:</span>                               <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># %if.then</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%bl</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">setb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(160,160,0)">.LBB27_37:</span>                              <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># %return</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">popq</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rbx</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">popq</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rbp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">ret</span></div>

</li></ol></div><div><br></div><div>Ok, WTF. We 'cmpb %bl, %al', jump, and then 'cmpb %al, %bl'???? This makes no sense. *every* predecessor looks like this. The reason for the LB27_37 label sneaking around there is because there are even more blocks which end in essentially the same thing but with different registers instead of %al and %bl. Whatever the registers, its always the same pattern: "cmp X, Y;  jne ....;  cmp Y, X;  setb %al". And if we didn't re-do the "cmp", it wouldn't have mattered that they were different registers!!! So lets fix that:</div>


<div><br></div><div><a href="http://paste2.org/z2gwXvAY" target="_blank">http://paste2.org/z2gwXvAY</a><br></div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">


<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#2:                                 # %if.end</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rsi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">),</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpb</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">1</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdx</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%rdi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">)</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jne</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_1</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="color:rgb(64,128,128);font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit"># ... <snip> ...</span><br>

</div></li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<br></li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(160,160,0)">.LBB27_1:</span>                               <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># %if.then</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">seta</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%al</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">ret</span></div>

</li></ol></div><div><br></div><div>Yea, that's a bit more plausible. =] Note that the 'popq' are gone because while when we fold away all of the lea's we no longer need the two scratch registers. The .LBB27_37 is cone because we no longer have 6 copies re-doing a comparison with different registers.</div>


<div><br></div><div>One final tweak that is totally unrelated to the rest of this. We're missing a DAG combine for the latch of the inner loop here which looks like this:</div><div><br></div><div><a href="http://paste2.org/aO0hBw1O" target="_blank">http://paste2.org/aO0hBw1O</a><br>


</div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">


<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#36:                                # %if.end396</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">                                        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)">#   in Loop: Header=BB27_14 Depth=1</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">addl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">$8</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%edi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">addl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">$8</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%esi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">xorl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r8d</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%edi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r8d</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmovbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">subl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%edi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmpl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r8d</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%esi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r8d</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmovbl</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">subl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%ebp</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%esi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">decl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r9</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">)</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">addl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">$</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">-</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">8</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r10d</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jns</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_14</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jmp</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_37</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">
</div></li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># ... <snip> ...</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(160,160,0)"><br>

</span></div></li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(160,160,0)">.LBB27_37:</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">                              </span><span style="font-family:inherit;font-size:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;font-style:italic;vertical-align:baseline;color:rgb(64,128,128)"># %return</span><br>

</div></li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">        </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(0,0,255)">movzbl</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">  </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%al</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline">,</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit"> </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%eax</span><br>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">        </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(0,0,255)">popq</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">    </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%rbx</span><br>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">        </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(0,0,255)">popq</span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">    </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(25,23,124)">%rbp</span><br>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit">        </span><span style="font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;margin:0px;padding:0px;border:0px;vertical-align:baseline;color:rgb(0,0,255)">ret</span></li>

</ol></div><div><br></div><div>I'm always really suspicious when I see cmp followed closely by sub. Indeed, this is a case of us not realizing that the cmp and sub can be combined. When we do that, we also remove this last use of a scratch register (%ebp), confirming that we can get rid of all the popq business. When we do, we also no longer need to jmp, we can just directly ret:</div>


<div><br></div><div><a href="http://paste2.org/v5K9COgf" target="_blank">http://paste2.org/v5K9COgf</a><br></div><div><ol style="margin:0px 0px 0px 45px;padding:0px 0px 0px 1px;border-width:0px 0px 0px 1px;border-left-style:solid;border-left-color:rgb(204,204,204);font-family:InconsolataMedium,monospace;font-size:13px;line-height:16px;vertical-align:baseline;list-style-position:initial;color:rgb(40,40,40);background-color:rgb(248,248,248)">


<li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline"><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)"># BB#36:                                # %if.end396</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">                                        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:italic;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(64,128,128)">#   in Loop: Header=BB27_14 Depth=1</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">addl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">$8</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%edi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">addl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">$8</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%esi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%edi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">subl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r8d</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmovae</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%edi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">movl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%esi</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">subl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r8d</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">cmovae</span>  <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%esi</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">decl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">(</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r9</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">)</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">addl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">$</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">-</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(102,102,102)">8</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%r10d</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">jns</span>     <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(136,0,0)">.LBB27_14</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">xorl</span>    <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span><span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">,</span> <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(25,23,124)">%eax</span></div>

</li><li style="margin:0px 0px 0px 3px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;white-space:pre-wrap">

<div style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline">        <span style="margin:0px;padding:0px;border:0px;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:inherit;line-height:inherit;vertical-align:baseline;color:rgb(0,0,255)">ret</span></div>

</li></ol></div><div><br></div><div>Yea, that looks more like sane code. =]</div><div><br></div><div>All of this combines for another 3.5% on bzip2 on sandybridge, despite just changing the mainGtU function manually by hand.</div>


<div><br></div><div>I'll probably break out some of these issues into bugs and such once I have more minimal test cases. I think the most important thing is to get the correct fix for partial register stalls on sandybridge committed and revert this patch so we don't keep paying a tax for a halfway fix. The rest is gravy, if very interesting all the same.</div>


<div><br></div><div>Just for reference, here is the complete asm before and after:</div><div><br></div><div>Before: <a href="http://paste2.org/CC50nde3" target="_blank">http://paste2.org/CC50nde3</a></div><div>After: <a href="http://paste2.org/J3DHbPJC" target="_blank">http://paste2.org/J3DHbPJC</a></div>

<div>- 100 fewer instructions</div><div>- 30% smaller encoded size after assembling</div><div>- 7% faster on sandybridge, 3.5% faster than with this patch (r195496)</div>
</div><div><br></div><div><br></div><div>Also for context, the benchmarking was done with a little compression algorithm benchmarking tool that I'm working to open source and will hopefully add to the test suite soon after. It's based on bzip2 1.0.6 (latest sources) and I primarily measured performance by compressing linux-3.12.tar with it.</div>
<span class=""><font color="#888888">

<div><br></div><div>-Chandler</div></font></span></div></div>
<br>_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
<br></blockquote></div><br></div></div></div>