[libcxx-commits] [libcxx] [libcxx] Remove Redundant Reset in ~basic_string (PR #164718)
Aiden Grossman via libcxx-commits
libcxx-commits at lists.llvm.org
Fri Oct 24 09:57:53 PDT 2025
boomanaiden154 wrote:
> I'm not sure I follow. Adding dead_on_return seems to work just fine: https://godbolt.org/z/dv4oEq3eK. The IR is taken from this: https://godbolt.org/z/eo3jbco1s
_ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEED2Ev is the function where I added an extra dead_on_return to the this pointer.
It works when you do that, but you're not demonstrating the phase ordering problem that comes up. Inlining currently does not propagate `dead_on_return` information and DSE runs very late. So if you start with something like this:
```c++
void test() { some_string.~basic_string(); }
inline ~basic_string(/*implict this is marked dead on return*/) { __reset_internal_buffer(); }
void __reset_internal_buffer() {
__annotate_delete();
if (__is_long())
__alloc_traits::deallocate(__alloc_, __get_long_pointer(), __get_long_cap());
__rep_.__s = __short();
}
```
We do some inlining and we get the following:
```c++
void test() {
some_string.__annotate_delete();
if (some_string.__is_long())
__alloc_traits::deallocate(__alloc_, some_string.__get_long_pointer(), some_string.__get_long_cap());
some_string.__rep_.__s = __short();
}
```
Now the `dead_on_return` attribute is no more and we get no assistance from the annotation in optimization. It's of course still trivial to pick out here, but it's easy to imagine a complicated scenario where proving aliasing ends up being difficult. This is likely to happen in tons of hot places where string destruction occurs in hot code (which will be a lot). We only run dead store elimination long after inlining, so we cannot delete the stores when they are simple to pick out. That's unlikely to change given the compile time overhead of good AA that's needed for DSE.
Now I think (haven't looked at feasibility/how well it would work) we could theoretically fix this in two different ways:
1. Add a new intrinsic to IR similar to `@llvm.lifetime.end` but more general that the inliner can put in the terminator blocks of functions that have a `dead_on_return` parameter. That feels a bit specific to this case (although would be more general to anything adding `dead_on_return`), and I could forsee there being some design issues that would make implementation a lot more difficult.
2. We could mark `__reset_internal_buffer` as `always_inline` and add a new EarlyDSE pass that only picks out trivially dead stores soon after the AlwaysInliner pass. That would let us eliminate these stores when they are easy to prove dead. I think there would be a decent amount of compile time overhead from this option that might not be tenable.
> Yeah, I'm a bit more happy with the refactoring. Still not a huge fan though.
Yeah, I agree it's not an optimal solution. But neither the standard library or the compiler are magic, so we have to make a tradeoff somewhere. I think that this keeps everything together enough to preserve most of the benefit of the refactoring while preserving performance.
> I can certainly see that rolling such a change out might take some time. However, given that it at least seems to work, do you disagree that adding dead_on_return to destructors is a good idea in general, and once we've done that we can basically revert anything in this patch? IIUC the actual implementation should be fairly trivial. The actual problem is that people probably sometimes rely on the compiler not optimizing dead stores in destructors. I'd be quite happy to take this patch if this is a temporary solution until we've managed to land a patch for dead_on_return.
I don't think it's a bad idea in general. There are two things here:
1. Just adding `dead_on_return` to destructor's in clang's codegen does not actually solve the problem here without more work in the optimization pipeline. This optimization pipeline work would be pretty nontrivial.
2. The amount of work to clean up people relying on UB in our large internal codebase would probably be huge. Even just this patch zeroing destructed strings caused us to have to cleanup 20-30 tests (some tsan failures and some tests just failing). The code was of course all wrong to begin with, but it's still a lot of work to cleanup, or even just file bugs against other teams to cleanup. I would imagine the amount of work cleaning up people relying on destructors in general not zeroing everything would be huge. I think our time would be better spent working on the compiler.
If the leg work ends up getting done in the compiler to optimize these away, I'm perfectly happy reverting this patch. I just think it's going to be a huge amount of work (quarters to a year like Reid mentioned).
> I'm not sure I follow. Are you saying we're unnecessarily initializing stuff multiple times in the string constructors, or was that more of a general comment?
This was a general comment related to us trying to deploy `-ftrivial-auto-var-init` and trying to reduce performance overhead from it.
https://github.com/llvm/llvm-project/pull/164718
More information about the libcxx-commits
mailing list