<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">On Tue, Aug 20, 2019 at 9:42 AM via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">





<div lang="EN-US">
<div class="gmail-m_3359278419428531052WordSection1">
<p class="MsoNormal">> In -Og mode, it seems that it would equally make sense to take "a very big<br>
> slice around system headers specifically to avoid" debug symbols for code<br>
> that users can't debug.<span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u><u></u></span></p>
<p class="MsoNormal"><a name="m_3359278419428531052__MailEndCompose"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></a></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">Our users seem to like to be able to dump their STL containers, which definitely requires debug symbols for "code they can't debug."</span></p></div></div></blockquote><div><br></div><div>Hmm, I may have muddled things up by mentioning "debug symbols" without fully understanding what people mean by that phrase precisely. I meant "line-by-line debugging information enabling single-step through a bunch of templates that the user doesn't care about and would prefer to see inlined away." Forget debug symbols and focus on inlining, if that'll help avoid my confusion. :)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div lang="EN-US"><div class="gmail-m_3359278419428531052WordSection1"><p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">OTOH being able to more aggressively optimize system-header code even in -Og mode seems reasonable.</span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">OTOOH most of the system-header code is templates or otherwise inlineable early, and after inlining the distinction between app and sys code really goes away.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u></span></p></div></div></blockquote><div><br></div><div>I believe we'd like to get "inlining early," but the problem is that `-Og` disables inlining. So there is no "after inlining" at the moment.</div><div>Here's a very concrete example: <a href="https://godbolt.org/z/5tTgO4">https://godbolt.org/z/5tTgO4</a></div><div><br></div><div><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">int foo(std::tuple<int, int> t) {</span></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">    return std::get<0>(t);</span></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">}</span></p></div><div><br></div><div>At `-Og` this produces the assembly code</div><div><br></div><div><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">_Z3fooSt5tupleIJiiEE:</span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">  pushq %rax<br></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  callq _ZSt3getILm0EJiiEERNSt13tuple_elementIXT_ESt5tupleIJDpT0_EEE4typeERS4_</span></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  movl (%rax), %eax</span></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  popq %rcx</span></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  retq</span></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">_ZSt3getILm0EJiiEERNSt13tuple_elementIXT_ESt5tupleIJDpT0_EEE4typeERS4_:</span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">  jmp _ZSt12__get_helperILm0EiJiEERT0_RSt11_Tuple_implIXT_EJS0_DpT1_EE<br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">_ZSt12__get_helperILm0EiJiEERT0_RSt11_Tuple_implIXT_EJS0_DpT1_EE:<br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">  jmp _ZNSt11_Tuple_implILm0EJiiEE7_M_headERS0_<br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">_ZNSt11_Tuple_implILm0EJiiEE7_M_headERS0_:<br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">  addq $4, %rdi<br></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  jmp _ZNSt10_Head_baseILm0EiLb0EE7_M_headERS0_</span></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">_ZNSt10_Head_baseILm0EiLb0EE7_M_headERS0_:</span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">  movq %rdi, %rax<br></p>
<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  retq</span></p></div><div><br></div><div>I believe that if John McFarlane's proposal were adopted by Clang, so that inlining-into-system-functions were allowed at `-Og`, then the resulting assembly code would look like this instead, for a much better experience in both debugging and runtime performance:</div><div><br></div><div><div><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">_Z3fooSt5tupleIJiiEE:</span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">  pushq %rax<br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  callq _ZSt3getILm0EJiiEERNSt13tuple_elementIXT_ESt5tupleIJDpT0_EEE4typeERS4_</span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  movl (%rax), %eax</span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  popq %rcx</span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  retq</span></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">_ZSt3getILm0EJiiEERNSt13tuple_elementIXT_ESt5tupleIJDpT0_EEE4typeERS4_:</span></p><p 
style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)">  leaq 4(%rdi), %rax<br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">  retq</span></p></div></div><div><span style="font-variant-ligatures:no-common-ligatures"><br></span></div><div>Notice that we still aren't inlining `std::get` into `foo`, because `foo` (as a user function) gets no inlining optimizations at `-Og`. But we do inline and collapse the whole chain of function-template helpers into `std::get` (because `std::get` is a function <i><b>defined</b></i> in a system header). This inlining creates new optimization opportunities, such as combining the `add` and `mov` into a single `lea`.</div><div><br></div><div>HTH,</div><div>–Arthur</div></div></div></div></div></div></div></div>
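<div><br></div><div>For anyone who wants to poke at this locally, here is a rough, self-contained sketch. `foo` is the function from the Godbolt link above with its include added. The `sketch` namespace is a made-up miniature of a forwarding-helper chain: the names `HeadBase`, `TupleImpl`, `get_helper`, and `Tuple` are hypothetical stand-ins chosen for illustration, not libstdc++'s real internals. Since it does not live in a system header, it would presumably have to be moved into a header found via `-isystem` before any system-header-specific inlining policy applied to it.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// The function from the example above, made self-contained.
int foo(std::tuple<int, int> t) {
    return std::get<0>(t);
}

// Hypothetical miniature of a forwarding-helper chain. All names here are
// invented for illustration; this is NOT libstdc++'s implementation.
namespace sketch {

// Innermost layer: holds one element and hands back a reference to it.
template <std::size_t I, typename T>
struct HeadBase {
    T value;
    static T& head(HeadBase& b) { return b.value; }
};

template <std::size_t I, typename... Ts>
struct TupleImpl;

// Recursion terminator: no elements left.
template <std::size_t I>
struct TupleImpl<I> {};

// Middle layer: one element per level, forwarding to HeadBase.
template <std::size_t I, typename T, typename... Rest>
struct TupleImpl<I, T, Rest...> : TupleImpl<I + 1, Rest...>, HeadBase<I, T> {
    static T& head(TupleImpl& t) { return HeadBase<I, T>::head(t); }
};

// Helper that selects the TupleImpl level for index I by deducing T and
// Rest from the matching base class, then forwards again.
template <std::size_t I, typename T, typename... Rest>
T& get_helper(TupleImpl<I, T, Rest...>& t) {
    return TupleImpl<I, T, Rest...>::head(t);
}

template <typename... Ts>
struct Tuple : TupleImpl<0, Ts...> {};

// Outermost layer: the only function a user would call directly.
template <std::size_t I, typename... Ts>
auto& get(Tuple<Ts...>& t) {
    return get_helper<I>(t);
}

}  // namespace sketch
```

<div>Each layer does nothing but forward to the next, which is the shape that shows up as a chain of `jmp` instructions in the first listing. Calling `sketch::get<0>(t)` from a user function and comparing the assembly at different optimization levels should make the effect of collapsing the chain visible, though the exact output will depend on the compiler and flags.</div>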