[libcxx-commits] [PATCH] D132769: [2b/3][ASan][libcxx] std::basic_string annotations
Nikolas Klauser via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Thu Feb 23 11:43:32 PST 2023
philnik added inline comments.
================
Comment at: libcxx/include/string:867
# endif
- : __r_(std::move(__str.__r_)) {
+ : __r_( (__str.__annotate_delete(), std::move(__str.__r_)) ) {
__str.__default_init();
----------------
The `__annotate_delete()` shouldn't be necessary here, right? The function is already marked `_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS`, so it should be enough to call `__annotate_delete()` right before `__default_init()` if it's needed at all.
================
Comment at: libcxx/include/string:869
__str.__default_init();
+ __str.__annotate_new(0);
std::__debug_db_insert_c(this);
----------------
This seems to be redundant. `__default_init()` already calls `__annotate_new(0)`.
================
Comment at: libcxx/include/string:1767-1770
+ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS
+ allocator_type& __alloc() _NOEXCEPT { return __r_.second(); }
+ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS
+ const allocator_type& __alloc() const _NOEXCEPT { return __r_.second(); }
----------------
What exactly is the problem here? The memory should never actually be accessed, right? `__r_.second()` should just return a reference to either
1. memory outside the poisoned area, or
2. a reference inside the poisoned area that will never actually be used in a meaningful way (because of EBO).
================
Comment at: libcxx/include/string:1832
+ const void* __begin, const void* __end, const void* __old_mid, const void* __new_mid) const {
+ if (!__libcpp_is_constant_evaluated() && __begin)
+# if _LIBCPP_CLANG_VER < 1600
----------------
================
Comment at: libcxx/include/string:1853-1865
+ if (__is_long())
+ __annotate_contiguous_container(
+ std::__to_address(__get_long_pointer()),
+ std::__to_address(__get_long_pointer() + __get_long_cap() + 1),
+ std::__to_address(__get_long_pointer() + __get_long_cap() + 1),
+ std::__to_address(__get_long_pointer() + __current_size + 1));
+ else if (__annotate_short_string_check())
----------------
This should be equivalent, or did I miss anything? Also, I think the `+ 1` for the old version is too much. `__get_long_cap()` returns the capacity //including// the null terminator. The `__current_size` argument should also be unnecessary if we assume the string meets the invariants. Assuming I didn't miss anything, this (in part) also applies to the other helpers.
================
Comment at: libcxx/include/string:2070
inline basic_string& __assign_short(const value_type* __s, size_type __n) {
+ size_type __old_sz = size();
+ if (__n > __old_sz)
----------------
Nit: I think `__old_size` would be nicer. I know that's not how it's done elsewhere, but at least I'm not Hessian.
================
Comment at: libcxx/include/string:2647
+ // after __grow_by and correct state is necessery when changing annotations.
+ // __null_terminate_at is changing annotations.
value_type* __p = std::__to_address(__get_pointer());
----------------
What invariant doesn't hold after calling `__grow_by`?
================
Comment at: libcxx/include/string:2704-2706
+ // ASan annotations: that operation also copies annotated
+ // (poisoned in shadow memory) bytes, so normally it should crash but
+ // we disable its instrumentation for correctness reasons
----------------
I think this comments makes more sense above the function, since it's about it in general.
================
Comment at: libcxx/include/string:3353
+ // after __grow_by and correct state is necessery when changing annotations.
+ // __null_terminate_at is changing annotations.
+ traits_type::assign(__p + __pos, __n2, __c);
----------------
Same question here: Which invariants are not met after calling `__grow_by`? (And maybe add a comment to `__grow_by` to mention that it fails to keep the invariants)
================
Comment at: libcxx/include/string:4491
_String::__alloc_traits::select_on_container_copy_construction(__lhs.get_allocator()));
+ __r.__annotate_delete();
auto __ptr = std::__to_address(__r.__get_pointer());
----------------
These annotations shouldn't be necessary. `string(__uninitialized_size_tag(), ...)` already sets the correct annotations.
================
Comment at: libcxx/test/std/strings/basic.string/string.asan/append.pass.cpp:1
+//===----------------------------------------------------------------------===//
+//
----------------
These tests are very libc++-specific, so we should just put them in `test/libcxx/string/basic.string` and then have one test per overload. I know this is quite a bit of refactoring work, but it makes verifying the tests and figuring out what is broken a lot easier when it's necessary. You can just mirror the naming inside `test/std` and add a `.asan` in-between, i.e. `std/.../string_append/initalizer_list.pass.cpp` would become `libcxx/.../string_append/initializer_list.asan.pass.cpp`.
I didn't take a super close look at the tests yet, but other than splitting them up they look quite good.
================
Comment at: libcxx/test/std/strings/basic.string/string.asan/append.pass.cpp:18
+
+template <typename CharT>
+size_t strlen(const CharT* str);
----------------
We normally use `class` here.
================
Comment at: libcxx/test/std/strings/basic.string/string.asan/append.pass.cpp:26-64
+#ifndef TEST_HAS_NO_WIDE_CHARACTERS
+template <>
+size_t strlen<wchar_t>(const wchar_t* str) {
+ return std::wcslen(str);
+}
+#endif
+
----------------
Maybe just use `std::char_traits<T>::length(str)`? This is currently technically UB, since you are adding overloads for a function from the standard without a user-defined type. Also, function template specializations are quite brittle. Just use overloads instead. You never call the `char` version, since the C version is a better match. (https://godbolt.org/z/W8qdeGqPd)
================
Comment at: libcxx/test/std/strings/basic.string/string.asan/append.pass.cpp:68
+void test_simple_append(const CharT* s_init) {
+ typedef std::basic_string<CharT> S;
+ S empty_s;
----------------
Let's use `using` here instead. (Also works in C++03)
================
Comment at: libcxx/test/std/strings/basic.string/string.asan/swap.pass.cpp:74-99
+ {
+ typedef char CharT;
+ test<CharT>('x');
+ }
+#ifndef TEST_HAS_NO_WIDE_CHARACTERS
+ {
+ typedef wchar_t CharT;
----------------
It should be possible to simplify these instantiations to `meta::for_each(meta::character_types(), TestClass());` (probably modulo most vexing parse).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132769/new/
https://reviews.llvm.org/D132769
More information about the libcxx-commits
mailing list