[libcxx-commits] [libcxx] [llvm] [libcxx] Remove ASan container overflow checks for SSO strings (PR #194208)
Vitaly Buka via libcxx-commits
libcxx-commits at lists.llvm.org
Sun May 3 16:56:39 PDT 2026
vitalybuka wrote:
@boomanaiden154 @philnik777
For context on why I still want to pursue this eventually, I ran some variations of the ASan checks on our internal codebase to see how much value the SSO checks actually provide. Here is the relative count of `container-overflow` reports across three different runs:
1. **Long strings only (Baseline - 100%)**
* Produces no false positives. Some reports look a little pedantic, but they still seem worth fixing:
```
std::string scratch;
scratch.reserve(10000); // flagged: reserve() only changes capacity; using resize(10000) makes it correct
DoStuff(scratch.data(), scratch.capacity());
```
2. **Long strings only + entirely suppressing SSO (190%)**
```
// Initialize the internal buffer to hold __size elements
// The elements and null terminator have to be set by the caller
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 pointer __init_internal_buffer(size_type __size) {
  if (__libcpp_is_constant_evaluated())
    __rep_ = __rep();
  if (__size > max_size())
    __throw_length_error();
  size_type __min_cap_val = __size;
# if _LIBCPP_ENABLE_ASAN_CONTAINER_CHECKS_FOR_STRING
  if (!__libcpp_is_constant_evaluated())
    __min_cap_val = std::max(__min_cap_val, size_type(32));
# endif
  if (__fits_in_sso(__min_cap_val)) {
    __set_short_size(__size);
    __annotate_new(__size);
    return __get_short_pointer();
  } else {
    __rep_.__l = __allocate_long_buffer(__alloc_, __size, __min_cap_val);
    __annotate_new(__size);
    return __get_long_pointer();
  }
}
```
* Essentially forcing `reserve(32)` everywhere so strings are always out-of-line.
* This works without issues, doesn't seem to break ABI, and produces nearly 2x the reports of No.1. This clearly shows that checking short strings makes a massive difference in catching bugs.
3. **Annotations for both long and short strings (HEAD as-is - 400%)**
* Produces 4x the reports of the baseline. I haven't investigated the breakdown yet, but the spike is likely a mix of:
  * Possibly a few speculative-load false positives, though I don't think there are many (they need a fix in the compiler or in `std::string`).
  * Pedantic checks breaking code that takes illegal shortcuts by assuming SSO behavior (not sure what to do about these yet).
  * True positives from pointers and iterators invalidated by a `std::move` (though overriding move might help us detect these even in approach No.2).
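To make the last bullet concrete, here is a minimal sketch of why a pointer taken before a `std::move` behaves differently for short and long strings. The helper name `buffer_survives_move` is hypothetical and not part of libc++. The sketch only compares pointer values and never dereferences the stale pointer. It also assumes the usual behavior of mainstream implementations (short strings live inside the object, long strings hand over their heap buffer on move), which the standard does not strictly guarantee.

```cpp
#include <string>
#include <utility>

// Hypothetical helper (illustration only): reports whether the character
// buffer observed before a move is the same buffer the destination string
// owns afterwards.
bool buffer_survives_move(std::string src) {
  const char* before = src.data();   // pointer taken before the move
  std::string dst = std::move(src);  // src is now valid but unspecified
  // Only pointer values are compared; `before` is never dereferenced.
  return dst.data() == before;
}
```

For a short string such as `"abc"` the result is expected to be `false`, because the characters are copied into the destination's own inline storage and the old pointer still refers to the moved-from object. For a string well past the SSO limit, for example `std::string(100, 'x')`, the heap buffer is normally transferred, so the result is expected to be `true`. Code that keeps using the old pointer therefore only misbehaves in the short case, which is the kind of report that annotating the short buffer can surface.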
Given that checking short strings catches significantly more bugs (as seen in run No.2), I think it's worth taking a little time to find a proper solution rather than completely abandoning the feature.
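As a rough illustration of what run No.2 changes, the sketch below checks whether a string's characters live inside the string object itself and shows that an explicit `reserve(32)` moves a short string out-of-line. The helper `is_stored_inline` is hypothetical. It relies on the assumption that the SSO capacity is below 32 characters, which holds for the common 64-bit layouts of libc++, libstdc++, and the MSVC STL, and it compares addresses as integers, which is implementation-defined but works on mainstream platforms.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (illustration only): true if the character buffer
// lies within the footprint of the std::string object itself, i.e. the
// string is currently using its short (SSO) representation.
bool is_stored_inline(const std::string& s) {
  const auto obj = reinterpret_cast<std::uintptr_t>(&s);
  const auto buf = reinterpret_cast<std::uintptr_t>(s.data());
  return buf >= obj && buf < obj + sizeof(std::string);
}

// Mimics, from user code, the effect described for run No.2: requesting a
// capacity of 32 pushes the buffer to the heap.
bool is_inline_after_reserve32(std::string s) {
  s.reserve(32);
  return is_stored_inline(s);
}
```

Under those assumptions, `is_stored_inline(std::string("abc"))` is expected to be `true` and `is_inline_after_reserve32("abc")` is expected to be `false`. Once the buffer is a separate heap allocation, the long-string annotations cover it.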
> The basic idea is that we can look at the object representation
Regarding this idea, I suspect the implementation will need to use some `no_sanitize` attributes anyway. The issue is that `std::variant<>` can contain other user-defined types with their own ASan annotations, and inspecting their raw object representation will likely trigger the sanitizer regardless of what we do with `std::string`.
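A minimal sketch of what such an attribute could look like on a function that reads a raw object representation. The macro `SKETCH_NO_ASAN` and the helper `count_zero_bytes` are hypothetical names for illustration, not the proposed implementation. The sketch assumes Clang or a recent GCC, both of which accept `__attribute__((no_sanitize("address")))`. It uses a plain byte loop rather than `memcpy`, since library calls may still be checked by ASan's runtime interceptors regardless of the attribute.

```cpp
#include <cstddef>

// Assumption: Clang and recent GCC accept this spelling; on other
// compilers the macro expands to nothing and the function is instrumented.
#if defined(__clang__) || defined(__GNUC__)
#  define SKETCH_NO_ASAN __attribute__((no_sanitize("address")))
#else
#  define SKETCH_NO_ASAN
#endif

// Hypothetical helper (illustration only): walks the object representation
// of `obj` byte by byte and counts the zero bytes. The loads in this
// function are not instrumented by ASan when the attribute is honored.
template <class T>
SKETCH_NO_ASAN std::size_t count_zero_bytes(const T& obj) {
  const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
  std::size_t zeros = 0;
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    if (p[i] == 0)
      ++zeros;
  }
  return zeros;
}
```

The attribute applies per function, so any code that inspects the bytes of a `std::variant<>` alternative would need it on every function doing the raw reads, independent of how `std::string` itself is annotated.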
https://github.com/llvm/llvm-project/pull/194208