<div class="gmail_quote">On 30 September 2011 17:20, Eli Friedman <span dir="ltr"><<a href="mailto:eli.friedman@gmail.com">eli.friedman@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div class="HOEnZb"><div class="h5">On Fri, Sep 30, 2011 at 5:05 PM, Nick Lewycky <<a href="mailto:nlewycky@google.com">nlewycky@google.com</a>> wrote:<br>

> On 30 September 2011 16:30, Eli Friedman <<a href="mailto:eli.friedman@gmail.com">eli.friedman@gmail.com</a>> wrote:<br>

>><br>

>> On Fri, Sep 30, 2011 at 3:26 PM, Nick Lewycky <<a href="mailto:nlewycky@google.com">nlewycky@google.com</a>> wrote:<br>

>> > The attached patch eagerly evaluates bitcasts and GEPs feeding inline<br>

>> > assembly constraints. That reduces the chance that we'll copy the<br>

>> > result into a register. On this testcase:<br>

>> ><br>

>> > void test(short *a,short *b,float *c) {<br>

>> >    char tmp[256];<br>

>> >    __asm__ volatile(<br>

>> >        "%0 %1 %2 %3 %4"<br>

>> >        :<br>

>> >        :"m"(a),"m"(b),"m"(c),"m"(tmp[0]), "m"(tmp[128])<br>

>> >        :"memory","%eax","%ebx","%ecx","%edx","%esi","%edi");<br>

>> > }<br>

>> ><br>

>> > previously we'd run out of registers. Now, we emit:<br>

>> ><br>

>> >        pushl   %ebx<br>

>> >        pushl   %edi<br>

>> >        pushl   %esi<br>

>> >        subl    $268, %esp              # imm = 0x10C<br>

>> >        movl    292(%esp), %eax<br>

>> >        movl    288(%esp), %ecx<br>

>> >        movl    284(%esp), %edx<br>

>> >        movl    %edx, 264(%esp)<br>

>> >        movl    %ecx, 260(%esp)<br>

>> >        movl    %eax, 256(%esp)<br>

>> >        #APP<br>

>> >        264(%esp) 260(%esp) 256(%esp) (%esp) 128(%esp)<br>

>> >        #NO_APP<br>

>> >        addl    $268, %esp              # imm = 0x10C<br>

>> >        popl    %esi<br>

>> >        popl    %edi<br>

>> >        popl    %ebx<br>

>> >        ret<br>

>> ><br>

>> > which clearly has an extra copy, but that's a bug which existed<br>

>> > without my patch anyways. The tmp[0] and tmp[128] don't have any<br>

>> > copies.<br>

>> ><br>

>> > Please review!<br>

>><br>

>> I follow the concept, but I don't think the place you're hacking it in<br>

>> is really appropriate.  This is a fast-isel-specific problem, so it<br>

>> seems like the fix should also be in fast-isel code.<br>

><br>

> Thanks. I don't like where it is now, but I don't see where else to put<br>

> it. It is not fast-isel specific, the same testcase fails at -O2 before and<br>

> works after. Can't explain that!<br>

<br>

</div></div>The given testcase works fine for me at -O2 without your patch... is<br>

there some case which doesn't work?<br></blockquote><div><br></div><div>You're right. I fooled myself at some point by producing a testcase that failed under -O2, but then that testcase failed under GCC anyways.</div>


<div><br></div><div>I'll see if I can move this into FastISel specific code path, somewhere.</div><div><br></div><div>Nick</div><div><br></div></div>