[cfe-dev] annotating libc++ to catch buffer overflows in vector/string with asan

Tue Feb 4 09:51:03 PST 2014

Hi Marshall,

On Tue, Feb 4, 2014 at 9:00 PM, Marshall Clow <mclow.lists at gmail.com> wrote:

>
> On Jan 15, 2014, at 1:53 AM, Kostya Serebryany <kcc at google.com> wrote:
>
> Marshall,
> Any chance to start the review this month?
> Slightly updated patch attached.
> FYI: we applied the same approach to libstdc++, ran various tests and
> found dozens of bugs.
> These annotations also help to achieve more precise leak detection with
> LeakSanitizer.
> https://code.google.com/p/address-sanitizer/wiki/ContainerOverflow
>
>
> Kostya —
>
> First off, I’m sorry for the slow response.
>

np, I was having some other fun :)

> Second, I’m very much in favor of adding these capabilities to the
> standard library.
>
> Third, looking over this code, it seems to me to be a lot of changes.
>

This is sadly so...

> You seem to tracking all the changes for the storage in vector,
> discriminating between insertions, deletions, growth and shrinkage. That
> seems overly complicated to me.
>
> Thinking about invariants; we know that std::vector maintains a block of
> memory
> bounded by  [data(), data()+(sizeof(value_type)*capacity())),
> of which the block [data(), data()+(sizeof(value_type)*size))
> is actually in use.
>

Correct.

>
> Why not just turn off “inside the block” ASAN checking upon entry to
> routines that modify the vector, and reestablish it upon return?
>
Because asan-off and asan-on have complexity O(capacity()).
At every point of time the vector looks like GGGGGGPBBB,
where G is good 8-bytes, B is bad 8-bytes and P is a single partially good
8-byte word.
The status of every 8 bytes in application is encoded in a single byte in
shadow.
turn-off or turn-on operation will require to write capacity()/8 bytes to
the shadow.

>
> So, for push_back() (as an example), we could write:
>
> template <class _Tp, class _Allocator>
> inline _LIBCPP_INLINE_VISIBILITY
> void
> vector<_Tp, _Allocator>::push_back(const_reference __x)
> {
>     __turn_off_ASAN_internal_for(data());
>     if (this->__end_ != this->__end_cap())
>     {
>         __alloc_traits::construct(this->__alloc(),
>                                   _VSTD::__to_raw_pointer(this->__end_),
> __x);
>         ++this->__end_;
>     }
>     else
>         __push_back_slow_path(__x);
>     __turn_on_ASAN_internal_for(data(), data()+size());  // or whatever
> parameters ASAN needs
> }
>
> [ In actuality, I wouldn’t do it this way, I’d put the calls in the
> constructor/destructor of an object; that way fast returns and exceptions
> would not interfere, but you get the idea. ]
>
>
> If this patch has any chance to get committed eventually, I'd like to
> understand your preference regarding the testing.
> One approach is to test it with asan: then all you need is to run all
> existing vector tests built with
> clang -fsanitize=address -D_LIBCPP_ADDRESS_SANITIZER_ANNOTATIONS
> But this will tie the testing process to clang, I am not sure if that is
> acceptable for you (totally ok for me).
> Another approach is to provide a fake standalone implementation of
> __sanitizer_annotate_contiguous_container;
> it will be a slightly less accurate testing, but will not depend on clang.
>
>
> I think to test this we’ll have to augment the libc++ test process.
> It currently has two kinds of tests - ones that fail to compile, and ones
> that compile and run successfully.
> We’ll either need a third kind of test, one that is supposed to fail at
> runtime, -or-, an ASAN hook to catch ASAN errors and record them.
> I’m leaning towards the first one - I don’t think we want to deliberately
> invoke undefined behavior (which is what this is), and expect anything
> further from the test.
>

Some tests could be "ones that compile and run successfully"
We can validate the shadow of the vector using
__asan_memory_region_is_poisoned.
I have a test that does various operations on vector and then calls this:

extern "C" bool __asan_address_is_poisoned(void *addr);

template <class T> void TestVector(std::vector<T> &v) {
  if (v.capacity() == 0) return;
  char *beg = (char*)v.data();
  char *end = beg + v.capacity() * sizeof(T);
  char *mid = beg + v.size() * sizeof(T);
  for (char *p = beg; p < mid; p++)
    assert(!__asan_address_is_poisoned(p));
  for (char *p = mid ; p < end; p++)
    assert(__asan_address_is_poisoned(p));
}

I'd also like to have at least a couple of tests that are supposed to fail
at runtime,
just like the lit-style tests we have for the rest of LLVM.

In both cases we will need to test libc++ with the fresh clang compiler
(which supports asan and asan's annotations)
Will this work?

--kcc

>
> What do you think?
>
> — Marshall
>
>
> Thanks,
>
> --kcc
>
>
>
> On Tue, Dec 10, 2013 at 6:07 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>> Here is another version; this time I've been able to build and run
>> Chromium with libc++ and these annotations.
>>
>> --kcc
>>
>>
>> On Mon, Dec 9, 2013 at 6:58 PM, Kostya Serebryany <kcc at google.com> wrote:
>>
>>> Marshall,
>>> Here is a more complete patch for asan vector annotations.
>>> It passes all tests in libcxx/test/containers/sequences/vector, but
>>> doesn't yet work on larger programs (I tried Chrome).
>>> Two questions:
>>>    1. Does it look like something potentially committable?
>>>    2. Would you suggest some bigger test suite specifically targeted to
>>> test std::vector?
>>>
>>> Thanks,
>>>
>>> --kcc
>>>
>>>
>>>
>>> On Wed, Nov 20, 2013 at 6:35 PM, Kostya Serebryany <kcc at google.com>wrote:
>>>
>>>> Attached is an intermediate patch that passes my tests (also attached).
>>>> Few questions:
>>>>
>>>> The patch gets a bit too verbose and it has a few repetitions, e.g. we
>>>> have this snippet repeated multiple times:
>>>> +            __annotate_contiguous_container(data(), data() +
>>>> capacity(),
>>>> +                                            data() + size(), data() +
>>>> size() + 1);
>>>>
>>>> WDYT about adding a private method to vector like this (and a couple
>>>> more)?
>>>>
>>>> void __annotate_size_increase(size_t __n) {
>>>>     __annotate_contiguous_container(data(), data() + capacity(),
>>>>                                             data() + size(), data() +
>>>> size() + __n);
>>>> }
>>>>
>>>> In order to test the annotation properly I need to do something like
>>>> this as often as possible:
>>>> template <class T> void TestVector(std::vector<T> &v) {
>>>>
>>>>   if (v.capacity() == 0) return;
>>>>
>>>>   char *beg = (char*)&v.front();
>>>>
>>>>   char *end = beg + v.capacity() * sizeof(T);
>>>>
>>>>   char *mid = beg + v.size() * sizeof(T);
>>>>
>>>>   for (char *p = beg; p < mid; p++)
>>>>
>>>>     assert(!__asan_address_is_poisoned(p));
>>>>
>>>>   for (char *p = mid ; p < end; p++)
>>>>
>>>>     assert(__asan_address_is_poisoned(p));
>>>>
>>>> }
>>>>
>>>> Is it possible to add code like this as another private method of
>>>> vector (under some ifdef)?
>>>>
>>>> --kcc
>>>>
>>>>
>>>> On Tue, Nov 19, 2013 at 1:57 PM, Kostya Serebryany <kcc at google.com>wrote:
>>>>
>>>>>
>>>>>> > Do you like the idea?
>>>>>>
>>>>>> Yes - very much so.
>>>>>>
>>>>>> > Any comments about the prototype patch?
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>> 1) I would implement it with inline functions other than macros.
>>>>>>
>>>>>> Like this:
>>>>>>
>>>>>>
>>>>>> extern "C" void __sanitizer_annotate_contiguous_container(void *,
>>>>>> void *, void *, void *);
>>>>>> void inline __annotate_contiguous container
>>>>>>         ( const void *__beg, const void *__end, const void
>>>>>> *__old_mid, const void *new_mid )
>>>>>> #if __has_feature(address_sanitizer)
>>>>>>         {  __sanitizer_annotate_contiguous_container(beg, end,
>>>>>> old_mid, new_mid); }
>>>>>> #else
>>>>>>         {}
>>>>>> #endif
>>>>>>
>>>>>
>>>>> Makes sense, see below.
>>>>> We''l probably want to move __annotate_contiguous container into a
>>>>> separate file so that both string and vector use it.
>>>>> But let's do the whole thing with vector before moving to string.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> 2) I'd like more information about what
>>>>>> __sanitizer_annotate_contiguous_container does ;-)
>>>>>>
>>>>>
>>>>> In short, it [un]poisons the parts of shadow memory that correspond to
>>>>> the container's memory.
>>>>> More details in compiler-rt files:
>>>>>  include/sanitizer/common_interface_defs.h and lib/asan/asan_poisoning.cc
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> > Any suggestion how to test these annotations in libc++ (we need
>>>>>> some kind of DEATH tests)?
>>>>>>
>>>>>> We already have failing tests in libc++.
>>>>>> they are named XXXX.fail.cpp.
>>>>>>
>>>>>
>>>>> These are tests that fail to compile.
>>>>> for asan annotations we need tests that build and run, and then either
>>>>> fail at run-time or check the state of shadow.
>>>>> Like these:
>>>>> compiler-rt/lib/asan/lit_tests/TestCases/contiguous_container.cc
>>>>> compiler-rt/lib/asan/lit_tests/TestCases/contiguous_container_crash.cc
>>>>>
>>>>> Suggestions?
>>>>>
>>>>> --kcc
>>>>>
>>>>> --- include/vector      (revision 195116)
>>>>> +++ include/vector      (working copy)
>>>>> @@ -288,6 +288,16 @@
>>>>>
>>>>>  _LIBCPP_BEGIN_NAMESPACE_STD
>>>>>
>>>>> +extern "C" void __sanitizer_annotate_contiguous_container(
>>>>> +  const void *, const void *, const void *, const void *);
>>>>> +void inline __annotate_contiguous_container
>>>>> +    (const void *__beg, const void *__end, const void *__old_mid,
>>>>> const void *__new_mid)
>>>>> +#if __has_feature(address_sanitizer)
>>>>> +    {  __sanitizer_annotate_contiguous_container(__beg, __end,
>>>>> __old_mid, __new_mid); }
>>>>> +#else
>>>>> +    {}
>>>>> +#endif
>>>>> +
>>>>>  template <bool>
>>>>>  class __vector_base_common
>>>>>  {
>>>>> @@ -1525,7 +1535,12 @@
>>>>>      // __v.push_back(_VSTD::forward<_Up>(__x));
>>>>>      __alloc_traits::construct(__a,
>>>>> _VSTD::__to_raw_pointer(__v.__end_), _VSTD::forward<_Up>(__x));
>>>>>      __v.__end_++;
>>>>> +    if (data())
>>>>> +      __annotate_contiguous_container(
>>>>>  +          data(), data() + capacity(), data() + size(), data() +
>>>>> capacity());
>>>>>      __swap_out_circular_buffer(__v);
>>>>> +    __annotate_contiguous_container(data(), data() + capacity(),
>>>>> +                                          data() + capacity(), data()
>>>>> + size());
>>>>>  }
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
> <vector_asan.patch>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140204/0c19847d/attachment.html>