[libcxx-commits] [libcxx] [libc++] Mark __{emplace, push}_back_slow_path as noinline (PR #94379)
via libcxx-commits
libcxx-commits at lists.llvm.org
Wed Jun 5 16:21:49 PDT 2024
gerben-stavenga wrote:
> > When someone optimizes some code one can equivalently ask "Code is 20% too slow, but how does that matter concretely to you?" Well, I can confidently say I'm not suffering terribly personally.
> > I'm not seeing it as you pasted it, but "return std::move(v);" does indeed do the trick. Never do RVO for vector, always std::move a vector.
> > https://godbolt.org/z/j4q5EcqK9
> > Anyway, I'm just pointing out what, to me at least, is obviously too much code generated for what push_back does. We do suffer from massive binaries, and often it's death by a thousand cuts: a lot of code that is just too bloated. The C++ standard library, due to its templated nature, often contributes significantly. Given its fundamental status in the ecosystem, I think the code generated for standard library functions deserves close inspection.
>
> The reason I asked for a concrete 1st order problem caused by the codegen is because I don't think it is "obvious" that there is "too much code generated". Why does the optimizer perform the inlining if it's obviously bad? Should we fix the inliner?
I don't think of this as "a problem of the inliner". The code, as written, gives the compiler no good option at a very fundamental level. It can either (1) inline everything and get good performance on the fast path, or (2) outline the slow path and get terrible performance.
With the slow path outlined, a simple loop such as
```
for (int i = 0; i < n; i++) x.push_back(i);
```
will be much slower, because the fast path itself suddenly becomes slower once the pointer to the vector escapes. The compiler then stores and reloads the size on every iteration, because it can't prove that the store of i doesn't overwrite the size variable.
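To make that shape concrete, here is a minimal sketch, not libc++'s actual code. `TinyVec`, `grow_slow` and `fill_and_sum` are hypothetical names, and `__attribute__((noinline))` is the GCC/Clang spelling. The slow path is an out-of-line member function, so the address of the vector is handed to a call the optimizer does not look into.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for std::vector<int>; not libc++'s implementation.
struct TinyVec {
    int* begin_ = nullptr;
    int* end_ = nullptr;
    int* cap_ = nullptr;

    TinyVec() = default;
    TinyVec(const TinyVec&) = delete;
    TinyVec& operator=(const TinyVec&) = delete;
    ~TinyVec() { std::free(begin_); }

    // Out-of-line slow path taking `this`: the object's address escapes here.
    __attribute__((noinline)) void grow_slow(int value) {
        assert(end_ == cap_);  // only called when the buffer is full
        std::size_t size = static_cast<std::size_t>(end_ - begin_);
        std::size_t new_cap = size ? 2 * size : 4;
        int* p = static_cast<int*>(std::realloc(begin_, new_cap * sizeof(int)));
        if (!p) std::abort();
        begin_ = p;
        end_ = p + size;
        cap_ = p + new_cap;
        *end_++ = value;
    }

    void push_back(int value) {
        if (end_ != cap_) {
            *end_++ = value;   // fast path: store and bump the end pointer
        } else {
            grow_slow(value);  // slow path: reallocates through `this`
        }
    }
};

// The loop from above, on a local vector whose address escapes via grow_slow.
long fill_and_sum(int n) {
    TinyVec x;
    for (int i = 0; i < n; i++) x.push_back(i);
    long sum = 0;
    for (int* p = x.begin_; p != x.end_; ++p) sum += *p;
    return sum;
}
```

Whether a given compiler actually keeps the pointers in memory across iterations here depends on version and flags, so the generated assembly is worth checking rather than assuming.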
All I'm suggesting is a redesign of the library such that the potentially non-inlined fallback functions are static member functions that take the data by value and return the result by value. Then the compiler can choose to inline or outline those functions at will, and the choice won't affect fast-path performance at all.
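A minimal sketch of that design, under stated assumptions: `Rep`, `ValueVec`, `grow_and_append` and `sum_first_n` are hypothetical names for illustration, not libc++ API, and `__attribute__((noinline))` (GCC/Clang) is there only to show that outlining the fallback is harmless in this shape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical by-value bundle of the three pointers a vector keeps.
struct Rep {
    int* begin;
    int* end;
    int* cap;
};

struct ValueVec {
    Rep rep_{nullptr, nullptr, nullptr};

    ValueVec() = default;
    ValueVec(const ValueVec&) = delete;
    ValueVec& operator=(const ValueVec&) = delete;
    ~ValueVec() { std::free(rep_.begin); }

    // Static slow path: receives a copy of the state and returns the new
    // state. It never sees the address of the ValueVec object itself.
    __attribute__((noinline)) static Rep grow_and_append(Rep r, int value) {
        assert(r.end == r.cap);  // only called when the buffer is full
        std::size_t size = static_cast<std::size_t>(r.end - r.begin);
        std::size_t new_cap = size ? 2 * size : 4;
        int* p = static_cast<int*>(std::realloc(r.begin, new_cap * sizeof(int)));
        if (!p) std::abort();
        p[size] = value;
        return Rep{p, p + size + 1, p + new_cap};
    }

    void push_back(int value) {
        Rep r = rep_;          // work on a local copy of the state
        if (r.end != r.cap) {
            *r.end++ = value;  // fast path
        } else {
            r = grow_and_append(r, value);  // fallback, by value in and out
        }
        rep_ = r;              // write the state back once
    }

    std::size_t size() const {
        return static_cast<std::size_t>(rep_.end - rep_.begin);
    }
};

// Same loop as before; the local vector's address is never passed anywhere.
long sum_first_n(int n) {
    ValueVec x;
    for (int i = 0; i < n; i++) x.push_back(i);
    long sum = 0;
    for (int* p = x.rep_.begin; p != x.rep_.end; ++p) sum += *p;
    return sum;
}
```

Because the fallback only ever receives copies, a local `ValueVec` stays a candidate for being kept in registers whether or not `grow_and_append` is inlined.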
>
> To take your example about code "being 20% too slow", I would have to refer to something like "It means my program handles 5% fewer queries a second" as my 1st order concern. It's _too slow_ because it _affects my programs throughput measurably_.
>
> Intuition has been a very poor tool in my experience benchmarking code, and so I'm asking for something more in order to understand and help resolve your concern.
>
> How did you first come to inspect the assembly for `push_back`?
See the analysis of push_back performance above for the reasons I inspect push_back.
I saw a 5x degradation (with different versions of clang and -O2 vs -O3) due to throw_bad_length_error not being inlined (hence the this-pointer escapes, hence the function is 5x too slow).
https://github.com/llvm/llvm-project/pull/94379