[libcxx-commits] [libcxx] [libc++] Mark __{emplace, push}_back_slow_path as noinline (PR #94379)

via libcxx-commits libcxx-commits at lists.llvm.org
Tue Jun 4 12:04:14 PDT 2024


EricWF wrote:

> > My tendency here is to trust the compiler. The example output on the code presented in the bug may seem silly, but why wouldn't the compiler inline there? It's the only function in a translation unit, and it contains a couple of lines. I wonder under what circumstances the compiler chooses to not inline the code.
> 
> My point is that the code is wrongheaded. Either the compiler inlines everything and enormous bloat is caused, and if it decides not to inline performance in useful cases drops like a rock. The code should be written that the performance of the fast path is superior independent of the inline/outline decisions.

I agree, it would be nice if we could have it all.

Could you talk about the concrete problem that led you here?

This conversation is lacking a grounding. It's easy to point at the "bloat", but why does "bloat" matter concretely to you?
Are you encountering production binaries where push_back is a problematically large portion of the final binary?



> 
> > However, when we prevent the compiler from inlining, we prevent it from optimizing away dead code.
> > Before your change, this silly example is optimized away.
> > ```c++
> > void foo() {
> >     std::vector<int> v;
> >     v.push_back(42);
> > }
> > ```
> 
> Yes due to the special nature of the new/delete functions the compiler could do this with full inlining. I don't think the side effect free no-ops are that important. I think for stdlib cases outline functions could be annotated with some kind of alloc/realloc/free attributes that similar transformations could still happen, but in the end not all that important for actual code.

I think code like this shows up in less contrived forms more often than one might expect. The point of the example is to show how little the optimizer can do with even such trivial code once inlining is prevented. If it can't optimize that, I expect it to do worse elsewhere.

Here's another less contrived example that compiles away:

```c++
#include <vector>

static std::vector<int> foo(std::vector<int> const& LHS, std::vector<int> const& RHS)
{
  // v already holds three value-initialized ints, so each push_back appends after them.
  std::vector<int> v(3);

  for (int i = 0; i < 3; ++i)
    v.push_back(LHS[i] + RHS[i]);

  return v;
}

int main() {
  int sum = 42;
  for (auto x : foo({1, 2, 3}, {1, 2, 3})) {
    sum += x;
  }
  return sum;
}
```
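
For readers without the PR open, the fast-path/slow-path split being debated can be sketched roughly as below. This is a simplified illustration, not libc++'s actual code: `tiny_vec` and `push_back_slow_path` are hypothetical names, the container only holds `int`, and `__attribute__((noinline))` assumes a GCC- or Clang-compatible compiler.

```c++
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical, simplified int-only container; not libc++'s std::vector.
class tiny_vec {
public:
  tiny_vec() = default;
  tiny_vec(const tiny_vec&) = delete;
  tiny_vec& operator=(const tiny_vec&) = delete;
  ~tiny_vec() { std::free(data_); }

  // Fast path: a bounds check and a store, cheap to inline at every call site.
  void push_back(int value) {
    if (size_ < cap_) {
      data_[size_++] = value;
      return;
    }
    push_back_slow_path(value);
  }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return cap_; }

  int operator[](std::size_t i) const {
    assert(i < size_);
    return data_[i];
  }

private:
  // Slow path: kept out of line so the growth logic is not copied into each caller.
  // The attribute spelling is a GCC/Clang extension (an assumption of this sketch).
  __attribute__((noinline)) void push_back_slow_path(int value) {
    std::size_t new_cap = cap_ == 0 ? 1 : cap_ * 2;
    int* new_data = static_cast<int*>(std::malloc(new_cap * sizeof(int)));
    if (new_data == nullptr) std::abort();
    if (size_ != 0) std::memcpy(new_data, data_, size_ * sizeof(int));
    std::free(data_);
    data_ = new_data;
    cap_ = new_cap;
    data_[size_++] = value;
  }

  int* data_ = nullptr;
  std::size_t size_ = 0;
  std::size_t cap_ = 0;
};
```

The trade-off discussed in this thread is visible in the sketch: keeping the slow path out of line limits code growth at call sites, but the optimizer then no longer sees the allocation when it looks at a caller, which is what allowed the examples above to compile away.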


https://github.com/llvm/llvm-project/pull/94379

