[libcxx-commits] [libcxx] [llvm] [libcxx] Remove ASan container overflow checks for SSO strings (PR #194208)

Vitaly Buka via libcxx-commits libcxx-commits at lists.llvm.org
Sun May 3 16:56:39 PDT 2026


vitalybuka wrote:

@boomanaiden154 @philnik777 

For context on why I still want to pursue this eventually, I ran some variations of the ASan checks on our internal codebase to see how much value the SSO checks actually provide. Here is the relative count of `container-overflow` reports across three different runs:

1. **Long strings only (Baseline - 100%)**
   * Produces no false positives. Some reports look a little pedantic, but they still seem worth fixing:
   ```
   std::string scratch;
   scratch.reserve(10000);  // using resize() here instead makes it correct
   DoStuff(scratch.data(), scratch.capacity());
   ```

2. **Long strings only + entirely suppressing SSO (190%)**
```
  // Initialize the internal buffer to hold __size elements
  // The elements and null terminator have to be set by the caller
  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 pointer __init_internal_buffer(size_type __size) {
    if (__libcpp_is_constant_evaluated())
      __rep_ = __rep();

    if (__size > max_size())
      __throw_length_error();

    size_type __min_cap_val = __size;
#  if _LIBCPP_ENABLE_ASAN_CONTAINER_CHECKS_FOR_STRING
    if (!__libcpp_is_constant_evaluated())
      __min_cap_val = std::max(__min_cap_val, size_type(32));
#  endif

    if (__fits_in_sso(__min_cap_val)) {
      __set_short_size(__size);
      __annotate_new(__size);
      return __get_short_pointer();
    } else {
      __rep_.__l = __allocate_long_buffer(__alloc_, __size, __min_cap_val);
      __annotate_new(__size);
      return __get_long_pointer();
    }
  }
```
   * Essentially forces `reserve(32)` everywhere, so strings are always out-of-line.
   * This works without issues, doesn't seem to break ABI, and produces nearly 2x the reports of run No. 1. This clearly shows that checking short strings makes a massive difference in catching bugs.

3. **Annotations for both long and short strings (HEAD as-is - 400%)**
   * Produces 4x the reports of the baseline. I haven't investigated the breakdown yet, but the spike is likely a mix of:
      * Maybe a few speculative-load false positives, though I don't think there are many (they need a fix in the compiler or in `std::string`).
      * Pedantic checks breaking code that takes illegal shortcuts by assuming SSO behavior (not sure what to do about these yet).
      * True positives from invalidated pointers and iterators after a `std::move` (though overriding move might help us detect these even in approach No.2).
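
To make the "pedantic" report from run No. 1 concrete, here is a minimal sketch of the corrected pattern. `FillBuffer` and `MakeScratch` are hypothetical names standing in for `DoStuff` and its caller; they are not from any real codebase. The point is that `reserve()` only changes `capacity()`, so the bytes in `[size(), capacity())` are outside the string's valid range and are what the container annotations flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for DoStuff(): writes exactly n bytes into out.
inline void FillBuffer(char* out, std::size_t n) {
  std::memset(out, 'x', n);
}

// Builds a scratch string of n characters without touching storage past size().
inline std::string MakeScratch(std::size_t n) {
  std::string scratch;
  // resize() makes [0, n) part of the string, so writing into it is valid.
  // reserve(n) would leave size() at 0, and the same write would land in
  // storage that the container annotations treat as out of bounds.
  scratch.resize(n);
  // Pass size(), not capacity(): capacity() may be larger than size().
  FillBuffer(&scratch[0], scratch.size());
  return scratch;
}
```

The only behavioral difference from the flagged snippet is that the buffer is value-initialized first, which costs one extra pass over the memory.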

Given that checking short strings catches significantly more bugs (as seen in run No.2), I think it's worth taking a little time to find a proper solution rather than completely abandoning the feature.

> The basic idea is that we can look at the object representation

Regarding this idea, I suspect the implementation will need to use some `no_sanitize` attributes anyway. The issue is that `std::variant<>` can contain other user-defined types with their own ASan annotations, and inspecting their raw object representation will likely trigger the sanitizer regardless of what we do with `std::string`.
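
As a rough illustration of what using `no_sanitize` for this could look like, here is a sketch of a helper that reads the raw bytes of an object with ASan instrumentation disabled for that one function. The macro `SKETCH_NO_SANITIZE_ADDRESS` and the helper `AllBytesZero` are hypothetical and only for illustration; this is not the implementation discussed in the PR. It assumes Clang or a GCC version that accepts `__attribute__((no_sanitize("address")))`.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__clang__) || defined(__GNUC__)
#  define SKETCH_NO_SANITIZE_ADDRESS __attribute__((no_sanitize("address")))
#else
#  define SKETCH_NO_SANITIZE_ADDRESS
#endif

// Hypothetical helper: returns true if every byte of value's object
// representation is zero. Loads inside this function are not instrumented
// by ASan, so bytes that a container has marked as poisoned do not
// produce a report here.
template <class T>
SKETCH_NO_SANITIZE_ADDRESS bool AllBytesZero(const T& value) {
  const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
  // A plain byte loop is used on purpose: calls into library routines may
  // still be checked by the sanitizer runtime, which would defeat the
  // purpose of the attribute.
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    if (bytes[i] != 0)
      return false;
  }
  return true;
}
```

The attribute applies per function, so any helper that inspects the representation would need it, and inlining into an instrumented caller is something to verify on the compilers in use.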

https://github.com/llvm/llvm-project/pull/194208

