[cfe-dev] Clang generates absurd amount of assembly for libc++ std::vector::emplace

David Blaikie via cfe-dev cfe-dev at lists.llvm.org
Tue Jul 24 12:35:51 PDT 2018


The number of lines of assembly isn't really a good proxy for the
performance of some code, mostly because of inlining. One piece of code may
be many more lines of assembly simply because it isn't calling
large/complicated external functions. And even taken as a whole (including
those external functions), the longer code can still be more efficient
because it's more specialized, e.g. two calls to one generic function were
inlined into two places and each copy was simplified/optimized a bit for its
particular situation.
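
As a rough illustration of that last point, here is a minimal sketch (the
names strided_sum, sum_all and sum_even are made up for this example, not
taken from any library). If the compiler inlines the generic helper into
both callers, each copy sees a constant stride and can be simplified
separately, which typically means more lines of assembly in total than one
shared out-of-line function, but not necessarily slower code:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical generic helper: sums every `stride`-th element of `data`.
inline double strided_sum(const double* data, std::size_t n, std::size_t stride) {
    assert(stride != 0);  // a zero stride would never advance the loop
    double total = 0.0;
    for (std::size_t i = 0; i < n; i += stride)
        total += data[i];
    return total;
}

// Two call sites with different constant strides. After inlining, each
// caller gets its own copy of the loop, specialized for its stride.
double sum_all(const double* data, std::size_t n)  { return strided_sum(data, n, 1); }
double sum_even(const double* data, std::size_t n) { return strided_sum(data, n, 2); }
```

Whether the inlined version actually wins depends on the optimizer and the
target, so measuring is the only reliable way to compare the two.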

That said, libc++ does have a bunch of forced inlining that's not for
performance reasons, but for linkage reasons (to ensure that certain kinds
of changes/updates to libc++ don't break existing compiled code/libraries).
It's a tradeoff that not every user of libc++ needs to make & there are
steps being taken to make that tradeoff more configurable/optional, so far
as I understand it.
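
A minimal sketch of what that kind of annotation can look like, assuming a
GCC/Clang-style compiler. The macro MYLIB_INLINE_VISIBILITY and the class
small_buffer are hypothetical, written only for this example, and are not
libc++'s actual definitions. The idea is that a function which is always
inlined and has hidden visibility is never referenced as an exported symbol,
so the library can change its body later without breaking code that was
compiled against the older header:

```cpp
// Hypothetical macro for illustration only: force inlining and keep the
// symbol out of the shared library's exported interface.
#define MYLIB_INLINE_VISIBILITY \
    __attribute__((__visibility__("hidden"), __always_inline__))

template <class T>
class small_buffer {
public:
    // Always inlined into the caller, so no out-of-line copy is relied upon.
    MYLIB_INLINE_VISIBILITY void push(const T& v) {
        if (size_ < 4)
            data_[size_++] = v;
    }

    MYLIB_INLINE_VISIBILITY unsigned size() const { return size_; }

private:
    T data_[4] {};
    unsigned size_ = 0;
};
```

The cost is the one discussed in this thread: every caller carries its own
copy of the function body, whether or not inlining helps performance there.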

On Mon, Jul 23, 2018 at 4:43 PM via cfe-dev <cfe-dev at lists.llvm.org> wrote:

> Hello all,
>
> Just a quick question to make sure I’m not missing something.
>
> This program:
>
> #include <vector>
>
> void f(std::vector<double>& vec, double val) {
>     vec.emplace(std::cbegin(vec), val);
> }
>
> When compiled with trunk Clang on Godbolt with -O3 -march=haswell
> -std=c++17 -stdlib=libstdc++, 132 lines of assembly are produced. If
> -stdlib=libc++ is used, though, 638 (!) lines of assembly are produced. A
> few of those lines are due to f() itself, but it appears the vast
> majority are due to the implementation of emplace(). As a partial
> comparison, GCC trunk produced 136 lines of assembly, and seems to have
> partially inlined emplace(), leaving 94 lines of assembly for
> _M_realloc_insert.
>
> I can sort of duplicate this on Debian sid, with libc++-dev 6.0.1-1 and
> clang++-7 (--version doesn’t appear to give a revision number,
> unfortunately?). Using libstdc++ results in 176 lines of assembly, and
> libc++ results in 803 lines of assembly (counted by wc -l).
>
> Is this something to be worried about? I’m still rather new to
> performance-related work, so I’m working from a relatively simplistic view
> of what could be affecting performance. A 4x difference in what could be a
> commonly-used function seems rather unusual to me, though.
>
> Thanks,
>
> Alex