[libcxx-commits] [PATCH] D132769: [2b/3][ASan][libcxx] std::basic_string annotations

Tacet via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Sun Jan 22 19:30:47 PST 2023


AdvenamTacet marked an inline comment as done.
AdvenamTacet added inline comments.


================
Comment at: libcxx/include/string:1596
 
-    _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20
-    size_type __get_short_size() const _NOEXCEPT {
-        _LIBCPP_ASSERT(!__r_.first().__s.__is_long_, "String has to be short when trying to get the short size");
-        return __r_.first().__s.__size_;
+    _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS size_type
+    __get_short_size() const _NOEXCEPT {
----------------
philnik wrote:
> I don't think this should be marked `_LIBCPP_INTERNAL_MEMORY_ACCESS`. Or is the size/is_long byte also poisoned for some reason?
Short //size / is_long//is poisoned when it's at the end of the buffer. That's after the content and unaligned, so it has to be poisoned. It may help with detecting BO when short string has maximal size.

So `__get_short_size` should be marked `_LIBCPP_INTERNAL_MEMORY_ACCESS` when metadata byte is at the end.


================
Comment at: libcxx/include/string:1657-1666
+    _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS void __zero() _NOEXCEPT {
+      // ASan annotations:
+      // __zero() may be called when object fields contain random data,
+      // therefore we force caller to handle whole object memory annotations,
+      // as only caller knows if object is already initialized and
+      // __annotate_new() can be safely called.
+      //
----------------
philnik wrote:
> This is now unnecessary.
I'm not sure if I understand. Could you explain?


================
Comment at: libcxx/include/string:1670-1672
+    // The following functions are no-ops outside of AddressSanitizer mode.
+    // We call annotatations only for the default Allocator because other allocators
+    // may not meet the AddressSanitizer alignment constraints.
----------------
philnik wrote:
> AdvenamTacet wrote:
> > philnik wrote:
> > > Couldn't we check whether the allocation satisfies the asan requirements? It looks like the overhead of checking pointer alignment seems quite manageable: https://godbolt.org/z/WeE7Wxx73.
> > We can and now (due to changes in D132522) we do. I hope the check should be safe to remove. I can do it, but then it will be different in std::vector. Do we want to do it now and not (for example) wait for another revision to remove it here and in std::vector?
> I think we should fix it here now and update `vector` in a later patch.
D136765 updates the vector.


================
Comment at: libcxx/include/string:1681
+    _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __annotate_contiguous_container(
+        const void* __beg, const void* __end, const void* __old_mid, const void* __new_mid) const {
+      if (!__libcpp_is_constant_evaluated())
----------------
philnik wrote:
> AdvenamTacet wrote:
> > philnik wrote:
> > > The two characters really don't hurt anybody and makes it a lot nicer to read.
> > I agree, but everywhere else those arguments are called beg, I'm happy to use begin, if inconsistency is not a problem. Is it?
> I guess you mean in `vector`? IMO we should just use proper names here and update `vector` in another patch. That should remove the inconsistency. We also never use `beg` anywhere else in the code base AFAIK, so `vector` is actually inconsistent IMO.
Not only in vector, but also in ASan API/implementation, but I agree that it's more readable in longer form, so I changed it.


================
Comment at: libcxx/include/string:1682-1683
+        const void* __beg, const void* __end, const void* __old_mid, const void* __new_mid) const {
+      if (!__libcpp_is_constant_evaluated())
+        if (__beg && is_same<allocator_type, __default_allocator_type>::value)
+          __sanitizer_annotate_contiguous_container(__beg, __end, __old_mid, __new_mid);
----------------
philnik wrote:
> Could you make this a single `if`?
After changes from D132522 we have to check if `is_same<allocator_type, __default_allocator_type>::value` only for compilers before that patch,, for that only I made a second `if`.


================
Comment at: libcxx/include/string:1777-1780
+      if (__is_long())
+        __annotate_delete<false>();
+      else
+        __annotate_delete<true>();
----------------
philnik wrote:
> Why do you do this special-casing here?
In some places, it is unknown if object is in long or short state, there it has to be runtime check. In other situations, it is known and we can pass that information in template. It's not only about (insignificant) optimization, but it may be necessary if bytes in object contain trash (eg. object in the middle of modification) and `__is_long` may return incorrect value.


================
Comment at: libcxx/include/string:2120-2125
+    // __init may be called after setting shadow memory, then we have to
+    // unpoison object memory.
+    // Chosen solution is to force caller to unpoison memory before call,
+    // as caller knows if memory is already poisoned or not.
+    // It is different than in __zero/__default_init, as there we turned off instrumentation,
+    // but there also caller has to take care of annotations.
----------------
philnik wrote:
> When is `__init` called after poisoning the memory? `__init` should only ever be called by a constructor.
Situations like `__str.__default_init();` in move constructor.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132769/new/

https://reviews.llvm.org/D132769



More information about the libcxx-commits mailing list