<div dir="ltr"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">On Sat, Dec 1, 2018 at 12:05 PM Stefan Kanthak <<a href="mailto:stefan.kanthak@nexgo.de">stefan.kanthak@nexgo.de</a>> wrote:<br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">"Craig Topper" <<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>> wrote:<br>
<br>
<br>
> Clang's -target option is supposed to take a cpu type and an operating<br>
> system. So "-target i386" is giving it no operating system. This is<br>
> preventing frame pointer elimination which is why ebp is being updated. If<br>
> you pass "-target i386-linux" you get slightly better code.<br>
<br>
But the frame pointer is not the point here.<br></blockquote><div><br></div><div>You didn't show what you think the improved code for the multiply would be, so I wasn't sure what you were pointing at.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
> The division/remainder operations are turned into library calls as part of<br>
> instruction selection. This code is somewhat independent of how other calls<br>
> are handled. We probably don't support tail calls in it. Is it really<br>
> realistic that a user would have a non-inlined function that contains just<br>
> a division? Why should we optimize for that case?<br>
<br>
I've seen quite a few libraries which implement such functions as<br>
target-independent wrappers that just call another function having<br>
the same prototype.<br>
So the question is not whether it's just a division, but in general the<br>
call of a function having the same prototype.<br></blockquote><div><br></div><div>We do support tail calls when the call is present in the original source code. The division/remainder case is special because we're turning an arithmetic operation into a call. This, for example, works:</div><div><br></div><div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)"><div><span style="color:rgb(0,0,255)">long</span> <span style="color:rgb(0,0,255)">long</span> bar(<span style="color:rgb(0,0,255)">long</span> <span style="color:rgb(0,0,255)">long</span> x, <span style="color:rgb(0,0,255)">long</span> <span style="color:rgb(0,0,255)">long</span> y);</div><div><br></div><div><span style="color:rgb(0,0,255)">long</span> <span style="color:rgb(0,0,255)">long</span> foo(<span style="color:rgb(0,0,255)">long</span> <span style="color:rgb(0,0,255)">long</span> x, <span style="color:rgb(0,0,255)">long</span> <span style="color:rgb(0,0,255)">long</span> y) {</div><div> <span style="color:rgb(0,0,255)"> return</span> bar(x, y);</div><div>}</div></div></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
regards<br>
Stefan<br>
<br>
> On Sat, Dec 1, 2018 at 9:37 AM Stefan Kanthak via llvm-dev <<br>
> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>
> <br>
>> Compile the following functions with "-O3 -target i386"<br>
>> (see <<a href="https://godbolt.org/z/VmKlXL" rel="noreferrer" target="_blank">https://godbolt.org/z/VmKlXL</a>>):<br>
>><br>
>> long long div(long long foo, long long bar)<br>
>> {<br>
>> return foo / bar;<br>
>> }<br>
>><br>
>> On the left the generated code; on the right the expected,<br>
>> properly optimised code:<br>
>><br>
>> div: # @div<br>
>> push ebp |<br>
>> mov ebp, esp |<br>
>> push dword ptr [ebp + 20] |<br>
>> push dword ptr [ebp + 16] |<br>
>> push dword ptr [ebp + 12] |<br>
>> push dword ptr [ebp + 8] |<br>
>> call __divdi3 | jmp __divdi3<br>
>> add esp, 16 |<br>
>> pop ebp |<br>
>> ret |<br>
>><br>
>><br>
>> long long mod(long long foo, long long bar)<br>
>> {<br>
>> return foo % bar;<br>
>> }<br>
>><br>
>> mod: # @mod<br>
>> push ebp |<br>
>> mov ebp, esp |<br>
>> push dword ptr [ebp + 20] |<br>
>> push dword ptr [ebp + 16] |<br>
>> push dword ptr [ebp + 12] |<br>
>> push dword ptr [ebp + 8] |<br>
>> call __moddi3 | jmp __moddi3<br>
>> add esp, 16 |<br>
>> pop ebp |<br>
>> ret |<br>
>><br>
>><br>
>> long long mul(long long foo, long long bar)<br>
>> {<br>
>> return foo * bar;<br>
>> }<br>
>><br>
>> mul: # @mul<br>
>> push ebp<br>
>> mov ebp, esp<br>
>> push esi<br>
>> mov ecx, dword ptr [ebp + 16]<br>
>> mov esi, dword ptr [ebp + 8]<br>
>> mov eax, ecx<br>
>> imul ecx, dword ptr [ebp + 12]<br>
>> mul esi<br>
>> imul esi, dword ptr [ebp + 20]<br>
>> add edx, ecx<br>
>> add edx, esi<br>
>> pop esi<br>
>> pop ebp<br>
>> ret<br>
</blockquote></div></div>