<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class="">The patch that removes __always_inline__ is <a href="https://reviews.llvm.org/D49240" class="">https://reviews.llvm.org/D49240</a> (ready to go, but waiting for sign-off by Eric).</div><div class=""><br class=""></div>In this specific case, it does not appear to be related to inlining, though. As has already been said, there seems to be unrolling going on, and vectorization as well (look for vmovups). I checked your example with my patch that removes always_inline and the result is roughly the same, so I don’t think it’s related to the fact that libc++ uses __always_inline__ for linkage purposes.<div class=""><br class=""></div><div class="">Louis<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Jul 25, 2018, at 12:12, David Blaikie <<a href="mailto:dblaikie@gmail.com" class="">dblaikie@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><br class=""><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Wed, Jul 25, 2018 at 8:57 AM <<a href="mailto:aw1621107@gmail.com" class="">aw1621107@gmail.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple" class=""><div class="m_-3748370816681265819WordSection1"><p class="MsoNormal" style="margin-right:0in;margin-bottom:12.0pt;margin-left:10.5pt">The number of lines of assembly isn't really a good proxy for the performance of some code - mostly due to inlining (one piece of code may be many more lines of assembly because it's not calling large/complicated external functions - or, even taken as a whole (including those external functions) it might still be more efficient to have longer code (because it's more 
specialized - i.e., two calls to one generic function were inlined into two places and each one simplified/optimized a bit for those situations))<u class=""></u><u class=""></u></p></div></div><div lang="EN-US" link="blue" vlink="purple" class=""><div class="m_-3748370816681265819WordSection1"><p class="MsoNormal">Yeah, you’re right; that was also pointed out to me by someone on one of the IRC channels I lurk on. A bit more investigation on Godbolt revealed that the difference could be due to unrolling. It was certainly a surprise to me, as I expected that libstdc++ and libc++ would have relatively similar implementations that would produce relatively similar outputs. Guess something about libc++’s implementation is a bit easier for Clang to inspect? In any case, let that be a lesson to me to be a bit more careful about drawing conclusions from code size.<u class=""></u><u class=""></u></p></div></div><div lang="EN-US" link="blue" vlink="purple" class=""><div class="m_-3748370816681265819WordSection1"><p class="MsoNormal" style="margin-bottom:12.0pt"><br class=""><br class="">That said, libc++ does have a bunch of forced inlining that's not for performance reasons, but for linkage reasons (to ensure that certain kinds of changes/updates to libc++ don't break existing compiled code/libraries). It's a tradeoff that not every user of libc++ needs to make & there are steps being taken to make that tradeoff more configurable/optional, so far as I understand it.<u class=""></u><u class=""></u></p></div></div><div lang="EN-US" link="blue" vlink="purple" class=""><div class="m_-3748370816681265819WordSection1"><p class="MsoNormal">Huh, that’s interesting. That isn’t what is happening here, though, right? 
I didn’t see anything that looks like that around the declarations/implementations of emplace() and friends.</p></div></div></blockquote><div class=""><br class="">Yeah, probably doesn't come up for the fully dependent template things in the standard library - but maybe some implementation details that are used in there like allocators, etc, might have some of these features. There's a lot of stuff in there - so hard for me to check at a glance. (though you can see it around other functions in the form of _LIBCPP_INLINE_VISIBILITY)<br class=""><br class="">- Dave<br class=""> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-US" link="blue" vlink="purple" class=""><div class="m_-3748370816681265819WordSection1"><p class="MsoNormal"><u class=""></u><u class=""></u></p></div></div><div lang="EN-US" link="blue" vlink="purple" class=""><div class="m_-3748370816681265819WordSection1"><p class="MsoNormal"><u class=""></u> <u class=""></u></p><div class=""><div class=""><p class="MsoNormal">On Mon, Jul 23, 2018 at 4:43 PM via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a>> wrote:<u class=""></u><u class=""></u></p></div><blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt" class=""><div class=""><div class=""><p class="MsoNormal">Hello all,<u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal">Just a quick question to make sure I’m not missing something.<u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal">This program:<u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-family:Consolas" class="">#include <vector></span><u class=""></u><u 
class=""></u></p><p class="MsoNormal"><span style="font-family:Consolas" class="">void f(std::vector<double>& vec, double val) {</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-family:Consolas" class=""> vec.emplace(std::cbegin(vec), val);</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-family:Consolas" class="">}</span><u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal">When compiled with trunk Clang on Godbolt with <span style="font-family:Consolas" class="">-O3 -march=haswell -std=c++17 -stdlib=libstdc++</span>, 132 lines of assembly are produced. If <span style="font-family:Consolas" class="">-stdlib=libc++</span> is used, though, 638 (!) lines of assembly are produced. A few of those lines are due to <span style="font-family:Consolas" class="">f()</span> itself, but it appears the vast majority are due to the implementation of <span style="font-family:Consolas" class="">emplace()</span>. As a partial comparison, GCC trunk produced 136 lines of assembly, and seems to have partially inlined <span style="font-family:Consolas" class="">emplace()</span>, leaving 94 lines of assembly for <span style="font-family:Consolas" class="">_M_realloc_insert</span>.<u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal">I can sort of duplicate this on Debian sid, with libc++-dev 6.0.1-1 and <span style="font-family:Consolas" class="">clang++-7</span> (<span style="font-family:Consolas" class="">--version</span> doesn’t appear to give a revision number, unfortunately?). Using libstdc++ results in 176 lines of assembly, and libc++ results in 803 lines of assembly (counted by <span style="font-family:Consolas" class="">wc -l</span>).<u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal">Is this something to be worried about? 
I’m still rather new to performance-related work, so I’m working from a relatively simplistic view of what could be affecting performance. A 4x difference in what could be a commonly-used function seems rather unusual to me, though.<u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal">Thanks,<u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal">Alex<u class=""></u><u class=""></u></p></div></div><p class="MsoNormal">_______________________________________________<br class="">cfe-dev mailing list<br class=""><a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><u class=""></u><u class=""></u></p></blockquote></div></div></div></blockquote></div></div>
</div></blockquote></div><br class=""></div></body></html>