[libcxx-commits] [PATCH] D132769: [2b/3][ASan][libcxx] std::basic_string annotations

Tacet via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Feb 23 16:40:13 PST 2023


AdvenamTacet added inline comments.


================
Comment at: libcxx/include/__string/extern_template_lists.h:30
 // must never be removed from the stable list.
+#if !defined(_LIBCPP_HAS_NO_ASAN) && (_LIBCPP_CLANG_VER >= 1600)
+// TODO LLVM18: Remove the special-casing (_LIBCPP_CLANG_VER)
----------------
philnik wrote:
> Why is this required?
We don't really want to use external templates here. Those we cannot control.

Without that casing, non-instrumented functions may be used. This leads to false positives. With that change, during compilation, a version with instrumentation is created. I never had a problem of missing instrumentation after that change.

Now when I look at it, `&& (_LIBCPP_CLANG_VER >= 1600)` should be removed, same problem may happen with older LLVM.


================
Comment at: libcxx/include/string:867
 #  endif
-      : __r_(std::move(__str.__r_)) {
+      : __r_( (__str.__annotate_delete(), std::move(__str.__r_)) ) {
     __str.__default_init();
----------------
philnik wrote:
> The `__annotate_delete()` shouldn't be necessary here, right? The function is already marked `_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS`, so it should be enough to call `__annotate_delete()` right before `__default_init()` if it's needed at all.
Unfortunately not. Sometimes instrumentation is turned on here. I tried that kind of solution first.

During las year I saw a lot of inconsistent behavior of ASan with variable initialization in constructors (`... : var_name(value)`) and turning off instrumentations (`_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS`).

In that case, it's necessary  with `test_allocator` (and probably some other of non-zero size, but I didn't confirm). Sometimes that kind of initialization is instrumented, even when function is not.

If we move it to the first line of the functions body (`__r_ = std::move(__str.__r_);`), then you are right.
And then, `__annotate_delete()` does not have to be called every time.

But with unpoisoning every time, `_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS` seems to be not necessary.


================
Comment at: libcxx/include/string:1853-1865
+        if (__is_long())
+          __annotate_contiguous_container(
+              std::__to_address(__get_long_pointer()),
+              std::__to_address(__get_long_pointer() + __get_long_cap() + 1),
+              std::__to_address(__get_long_pointer() + __get_long_cap() + 1),
+              std::__to_address(__get_long_pointer() + __current_size + 1));
+        else if (__annotate_short_string_check())
----------------
philnik wrote:
> This should be equivalent, or did I miss anything? Also, I think the `+ 1` for the old version is too much. `__get_long_cap()` returns the capacity //including// the null terminator. The `__current_size` argument should also be unnecessary if we assume the string meets the invariants. Assuming I didn't miss anything, this (in part) also applies to the other helpers.
It's not same, but with new `__sanitizer_annotate_contiguous_container` should work.
`data()` returns address of content, but with old implementation address of buffer to annotate had to be aligned, therefore a byte with metadata before content had to be treated as char-in-use all the time.
Now it should work.

At the same time, thanks to `_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS` byte after content (Alternate String Layout) may be poisoned all the time (when string is short).
It has a nice feature of always having a poisoned separator between buffer and a next object, even when red zone is not possible (eg. arrays).

That's a nice suggestion. Thx!


================
Comment at: libcxx/include/string:2704-2706
+        // ASan annotations: that operation also copies annotated
+        // (poisoned in shadow memory) bytes, so normally it should crash but
+        // we disable its instrumentation for correctness reasons
----------------
philnik wrote:
> I think this comments makes more sense above the function, since it's about it in general.
Actually, it probably makes more sense to have it next to `#define _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS ...`, as a few functions should have that comment. I will move it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132769/new/

https://reviews.llvm.org/D132769



More information about the libcxx-commits mailing list