<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 16 May 2018 at 00:38, Steffen Hirschmann via cfe-dev <span dir="ltr"><<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear all,<br>
<br>
A while ago I posted some constexpr behavior that I cannot explain<br>
to the cfe-users mailing list and never received a<br>
reply. Since my observations still hold for clang-6.0.0, I am<br>
reposting my original message to cfe-dev.<br>
<br>
tl;dr: It seems that using constexpr in this stupid example I ran<br>
across back in February prevents a certain optimization that<br>
clang otherwise performs. I cannot think of a reason for this behavior, so I<br>
am asking you.<br>
<br>
Greetings,<br>
Steffen<br>
<br>
P.S.: This also happens if one defines "fib" correctly (i <= 1). :)<br>
<br>
<br>
On 10:51 Fri 16 Feb , Steffen Hirschmann via cfe-users wrote:<br>
> Dear all,<br>
> <br>
> I was just playing around with a toy example when I noticed an oddity in<br>
> the code generated by clang-5.0.0 (and also in clang-5.0.1) regarding<br>
> constexpr.<br>
> <br>
> Given the code:<br>
> > int fib(int i) { if (i <= 0) return i; else return (fib(i - 1) + fib(i - 2)) % 100; }<br>
> > int main()<br>
> > {<br>
> > int ret = 0;<br>
> > for (int i = 0; i < 10; ++i)<br>
> > ret += fib(39);<br>
> > return ret;<br>
> > }<br>
> <br>
> Compile it with clang++ -O3 and what you get is (gdb disassembly of "main"):<br>
> > 7 {<br>
> > 8 int ret = 0;<br>
> > 9 for (int i = 0; i < 10; ++i)<br>
> > 10 ret += fib(39);<br>
> > 0x00000000004004e0 <+0>: push rax<br>
> > 0x00000000004004e1 <+1>: mov edi,0x27<br>
> > 0x00000000004004e6 <+6>: call 0x400490 <fib(int)><br>
> ><br>
> > 9 for (int i = 0; i < 10; ++i)<br>
> > 0x00000000004004eb <+11>: add eax,eax<br>
> > 0x00000000004004ed <+13>: lea eax,[rax+rax*4]<br>
> ><br>
> > 11 return ret;<br>
> > 0x00000000004004f0 <+16>: pop rcx<br>
> > 0x00000000004004f1 <+17>: ret<br>
> <br>
> A single call to fib(39), followed by a multiplication by 10.<br>
> <br>
> Now, if you make "fib" constexpr, i.e.:<br>
> > constexpr int fib(int i) { if (i <= 0) return i; else return (fib(i - 1) + fib(i - 2)) % 100; }<br>
> <br>
> And, again, compile it with -O3 and disassemble "main":<br>
> > 7 {<br>
> > 8 int ret = 0;<br>
> > 9 for (int i = 0; i < 10; ++i)<br>
> > 10 ret += fib(39);<br>
> > 0x0000000000400490 <+0>: push rbp<br>
> > 0x0000000000400491 <+1>: push rbx<br>
> > 0x0000000000400492 <+2>: push rax<br>
> > 0x0000000000400493 <+3>: mov edi,0x27<br>
> > 0x0000000000400498 <+8>: call 0x400530 <fib(int)><br>
> > 0x000000000040049d <+13>: mov ebx,eax<br>
> > 0x000000000040049f <+15>: mov edi,0x27<br>
> > 0x00000000004004a4 <+20>: call 0x400530 <fib(int)><br>
> > 0x00000000004004a9 <+25>: mov ebp,eax<br>
> > 0x00000000004004ab <+27>: add ebp,ebx<br>
> > 0x00000000004004ad <+29>: mov edi,0x27<br>
> > 0x00000000004004b2 <+34>: call 0x400530 <fib(int)><br>
> > 0x00000000004004b7 <+39>: mov ebx,eax<br>
> > 0x00000000004004b9 <+41>: add ebx,ebp<br>
> > 0x00000000004004bb <+43>: mov edi,0x27<br>
> > 0x00000000004004c0 <+48>: call 0x400530 <fib(int)><br>
> > 0x00000000004004c5 <+53>: mov ebp,eax<br>
> > 0x00000000004004c7 <+55>: add ebp,ebx<br>
> > 0x00000000004004c9 <+57>: mov edi,0x27<br>
> > 0x00000000004004ce <+62>: call 0x400530 <fib(int)><br>
> > 0x00000000004004d3 <+67>: mov ebx,eax<br>
> > 0x00000000004004d5 <+69>: add ebx,ebp<br>
> > 0x00000000004004d7 <+71>: mov edi,0x27<br>
> > 0x00000000004004dc <+76>: call 0x400530 <fib(int)><br>
> > 0x00000000004004e1 <+81>: mov ebp,eax<br>
> > 0x00000000004004e3 <+83>: add ebp,ebx<br>
> > 0x00000000004004e5 <+85>: mov edi,0x27<br>
> > 0x00000000004004ea <+90>: call 0x400530 <fib(int)><br>
> > 0x00000000004004ef <+95>: mov ebx,eax<br>
> > 0x00000000004004f1 <+97>: add ebx,ebp<br>
> > 0x00000000004004f3 <+99>: mov edi,0x27<br>
> > 0x00000000004004f8 <+104>: call 0x400530 <fib(int)><br>
> > 0x00000000004004fd <+109>: mov ebp,eax<br>
> > 0x00000000004004ff <+111>: add ebp,ebx<br>
> > 0x0000000000400501 <+113>: mov edi,0x27<br>
> > 0x0000000000400506 <+118>: call 0x400530 <fib(int)><br>
> > 0x000000000040050b <+123>: mov ebx,eax<br>
> > 0x000000000040050d <+125>: add ebx,ebp<br>
> > 0x000000000040050f <+127>: mov edi,0x27<br>
> > 0x0000000000400514 <+132>: call 0x400530 <fib(int)><br>
> > 0x0000000000400519 <+137>: add eax,ebx<br>
> ><br>
> > 11 return ret;<br>
> > 0x000000000040051b <+139>: add rsp,0x8<br>
> > 0x000000000040051f <+143>: pop rbx<br>
> > 0x0000000000400520 <+144>: pop rbp<br>
> > 0x0000000000400521 <+145>: ret<br>
> <br>
> That's 10 calls to function "fib" (for which the assembly is essentially<br>
> the same as in the example above).<br>
> <br>
> Regardless of whether the function is evaluated at compile time or not,<br>
> it seems odd to me that using constexpr here prohibits clang from<br>
> emitting the very same code as in the non-constexpr example. Note,<br>
> however, that if you declare "fib" as "static constexpr", clang<br>
> again emits the multiplication code.<br>
> <br>
> Is there something keeping clang from producing the multiplication code<br>
> for a non-static constexpr example that I don't see? And why is the<br>
> optimization possible again if one makes "fib" static? </blockquote><div> </div><div>The problem is not that constexpr itself prevents optimizations. The problem is that constexpr implies inline, and inline prevents this kind of interprocedural optimization. For details, please see</div><div><br></div><div><a href="https://www.playingwithpointers.com/blog/ipo-and-derefinement.html">https://www.playingwithpointers.com/blog/ipo-and-derefinement.html</a><br></div><div><br></div><div>If 'fib' is an inline function, we cannot deduce that it is side-effect-free and use that information to call it only once. We don't know that it was "originally" side-effect-free: the definition used at runtime may come from another translation unit, and the copy we see may have been refined from a version with side effects. The runtime definition could therefore be a "derefined" one, for which calling it only once would be non-conforming. However, if 'fib' is not inline, or if it's file-static, then we can transfer that information from 'fib' to its caller, because we know the version of 'fib' we can see is the same one that's actually going to be used at runtime.</div></div></div></div>
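<div>To compare the three cases discussed in this thread side by side, a minimal sketch along the following lines may help. The names fib_plain, fib_inline, fib_static and the sum_* wrappers are hypothetical and not from the original post; the sketch uses the corrected base case (i <= 1) from the P.S. and a smaller argument (20) so that it also evaluates quickly at compile time. The comments about what the optimizer is allowed to do restate the explanation above; the actual output will depend on the clang version and flags.</div>

```cpp
// Three linkage variants of the thread's "fib". All names are hypothetical.
// Each computes Fibonacci numbers modulo 100 with the corrected base case.

// External, non-inline definition: the body seen here is the one used at
// runtime, so facts deduced from it (such as having no side effects) can be
// transferred to callers.
int fib_plain(int i) {
    return i <= 1 ? i : (fib_plain(i - 1) + fib_plain(i - 2)) % 100;
}

// constexpr implies inline: another translation unit may provide the copy
// that is used at runtime, so deductions from this body are not transferred.
constexpr int fib_inline(int i) {
    return i <= 1 ? i : (fib_inline(i - 1) + fib_inline(i - 2)) % 100;
}

// static gives internal linkage: again only this definition can be called.
static constexpr int fib_static(int i) {
    return i <= 1 ? i : (fib_static(i - 1) + fib_static(i - 2)) % 100;
}

// The loop from the original post, once per variant.
int sum_plain() {
    int ret = 0;
    for (int i = 0; i < 10; ++i) ret += fib_plain(20);
    return ret;
}

int sum_inline() {
    int ret = 0;
    for (int i = 0; i < 10; ++i) ret += fib_inline(20);
    return ret;
}

int sum_static() {
    int ret = 0;
    for (int i = 0; i < 10; ++i) ret += fib_static(20);
    return ret;
}

// Requiring a constant expression forces compile-time evaluation, which
// sidesteps the runtime question for the constexpr variants entirely.
static_assert(fib_inline(20) == 65, "fib(20) mod 100");
static_assert(fib_static(20) == 65, "fib(20) mod 100");
```

<div>Compiling this file with something like clang++ -std=c++11 -O3 -S and comparing the bodies of the three sum_* functions should show whether the single-call-plus-multiply form appears only for the non-inline and static variants, as the explanation above suggests.</div>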