[cfe-dev] annotating libc++ to catch buffer overflows in vector/string with asan

Kostya Serebryany kcc at google.com
Fri Feb 21 03:01:20 PST 2014


FTR: we have similar changes in our branch of libstdc++:
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=207517
These annotations helped us detect ~15 bugs, and counting.

--kcc


On Tue, Feb 4, 2014 at 9:51 PM, Kostya Serebryany <kcc at google.com> wrote:

>
> Hi Marshall,
>
> On Tue, Feb 4, 2014 at 9:00 PM, Marshall Clow <mclow.lists at gmail.com>wrote:
>
>>
>> On Jan 15, 2014, at 1:53 AM, Kostya Serebryany <kcc at google.com> wrote:
>>
>> Marshall,
>> Any chance to start the review this month?
>> Slightly updated patch attached.
>> FYI: we applied the same approach to libstdc++, ran various tests and
>> found dozens of bugs.
>> These annotations also help to achieve more precise leak detection with
>> LeakSanitizer.
>> https://code.google.com/p/address-sanitizer/wiki/ContainerOverflow
>>
>>
>> Kostya —
>>
>> First off, I’m sorry for the slow response.
>>
>
> np, I was having some other fun :)
>
>
>> Second, I’m very much in favor of adding these capabilities to the
>> standard library.
>>
>> Third, looking over this code, it seems to me to be a lot of changes.
>>
>
> This is sadly so...
>
>
>> You seem to tracking all the changes for the storage in vector,
>> discriminating between insertions, deletions, growth and shrinkage. That
>> seems overly complicated to me.
>>
>> Thinking about invariants; we know that std::vector maintains a block of
>> memory
>> bounded by  [data(), data()+(sizeof(value_type)*capacity())),
>> of which the block [data(), data()+(sizeof(value_type)*size))
>> is actually in use.
>>
>
> Correct.
>
>
>>
>> Why not just turn off “inside the block” ASAN checking upon entry to
>> routines that modify the vector, and reestablish it upon return?
>>
> Because asan-off and asan-on have complexity O(capacity()).
> At every point of time the vector looks like GGGGGGPBBB,
> where G is good 8-bytes, B is bad 8-bytes and P is a single partially good
> 8-byte word.
> The status of every 8 bytes in application is encoded in a single byte in
> shadow.
> turn-off or turn-on operation will require to write capacity()/8 bytes to
> the shadow.
>
>
>
>>
>> So, for push_back() (as an example), we could write:
>>
>> template <class _Tp, class _Allocator>
>> inline _LIBCPP_INLINE_VISIBILITY
>> void
>> vector<_Tp, _Allocator>::push_back(const_reference __x)
>> {
>>     __turn_off_ASAN_internal_for(data());
>>     if (this->__end_ != this->__end_cap())
>>     {
>>         __alloc_traits::construct(this->__alloc(),
>>                                   _VSTD::__to_raw_pointer(this->__end_),
>> __x);
>>         ++this->__end_;
>>     }
>>     else
>>         __push_back_slow_path(__x);
>>     __turn_on_ASAN_internal_for(data(), data()+size());  // or whatever
>> parameters ASAN needs
>> }
>>
>> [ In actuality, I wouldn’t do it this way, I’d put the calls in the
>> constructor/destructor of an object; that way fast returns and exceptions
>> would not interfere, but you get the idea. ]
>>
>>
>> If this patch has any chance to get committed eventually, I'd like to
>> understand your preference regarding the testing.
>> One approach is to test it with asan: then all you need is to run all
>> existing vector tests built with
>> clang -fsanitize=address -D_LIBCPP_ADDRESS_SANITIZER_ANNOTATIONS
>> But this will tie the testing process to clang, I am not sure if that is
>> acceptable for you (totally ok for me).
>> Another approach is to provide a fake standalone implementation of
>> __sanitizer_annotate_contiguous_container;
>> it will be a slightly less accurate testing, but will not depend on
>> clang.
>>
>>
>> I think to test this we’ll have to augment the libc++ test process.
>> It currently has two kinds of tests - ones that fail to compile, and ones
>> that compile and run successfully.
>> We’ll either need a third kind of test, one that is supposed to fail at
>> runtime, -or-, an ASAN hook to catch ASAN errors and record them.
>> I’m leaning towards the first one - I don’t think we want to deliberately
>> invoke undefined behavior (which is what this is), and expect anything
>> further from the test.
>>
>
> Some tests could be "ones that compile and run successfully"
> We can validate the shadow of the vector using
> __asan_memory_region_is_poisoned.
> I have a test that does various operations on vector and then calls this:
>
> extern "C" bool __asan_address_is_poisoned(void *addr);
>
> template <class T> void TestVector(std::vector<T> &v) {
>   if (v.capacity() == 0) return;
>   char *beg = (char*)v.data();
>   char *end = beg + v.capacity() * sizeof(T);
>   char *mid = beg + v.size() * sizeof(T);
>   for (char *p = beg; p < mid; p++)
>     assert(!__asan_address_is_poisoned(p));
>   for (char *p = mid ; p < end; p++)
>     assert(__asan_address_is_poisoned(p));
> }
>
> I'd also like to have at least a couple of tests that are supposed to fail
> at runtime,
> just like the lit-style tests we have for the rest of LLVM.
>
> In both cases we will need to test libc++ with the fresh clang compiler
> (which supports asan and asan's annotations)
> Will this work?
>
> --kcc
>
>
>
>>
>> What do you think?
>>
>> — Marshall
>>
>>
>> Thanks,
>>
>> --kcc
>>
>>
>>
>> On Tue, Dec 10, 2013 at 6:07 PM, Kostya Serebryany <kcc at google.com>wrote:
>>
>>> Here is another version; this time I've been able to build and run
>>> Chromium with libc++ and these annotations.
>>>
>>> --kcc
>>>
>>>
>>> On Mon, Dec 9, 2013 at 6:58 PM, Kostya Serebryany <kcc at google.com>wrote:
>>>
>>>> Marshall,
>>>> Here is a more complete patch for asan vector annotations.
>>>> It passes all tests in libcxx/test/containers/sequences/vector, but
>>>> doesn't yet work on larger programs (I tried Chrome).
>>>> Two questions:
>>>>    1. Does it look like something potentially committable?
>>>>    2. Would you suggest some bigger test suite specifically targeted to
>>>> test std::vector?
>>>>
>>>> Thanks,
>>>>
>>>> --kcc
>>>>
>>>>
>>>>
>>>> On Wed, Nov 20, 2013 at 6:35 PM, Kostya Serebryany <kcc at google.com>wrote:
>>>>
>>>>> Attached is an intermediate patch that passes my tests (also
>>>>> attached). Few questions:
>>>>>
>>>>> The patch gets a bit too verbose and it has a few repetitions, e.g. we
>>>>> have this snippet repeated multiple times:
>>>>> +            __annotate_contiguous_container(data(), data() +
>>>>> capacity(),
>>>>> +                                            data() + size(), data() +
>>>>> size() + 1);
>>>>>
>>>>> WDYT about adding a private method to vector like this (and a couple
>>>>> more)?
>>>>>
>>>>> void __annotate_size_increase(size_t __n) {
>>>>>     __annotate_contiguous_container(data(), data() + capacity(),
>>>>>                                             data() + size(), data() +
>>>>> size() + __n);
>>>>> }
>>>>>
>>>>> In order to test the annotation properly I need to do something like
>>>>> this as often as possible:
>>>>> template <class T> void TestVector(std::vector<T> &v) {
>>>>>
>>>>>   if (v.capacity() == 0) return;
>>>>>
>>>>>   char *beg = (char*)&v.front();
>>>>>
>>>>>   char *end = beg + v.capacity() * sizeof(T);
>>>>>
>>>>>   char *mid = beg + v.size() * sizeof(T);
>>>>>
>>>>>   for (char *p = beg; p < mid; p++)
>>>>>
>>>>>     assert(!__asan_address_is_poisoned(p));
>>>>>
>>>>>   for (char *p = mid ; p < end; p++)
>>>>>
>>>>>     assert(__asan_address_is_poisoned(p));
>>>>>
>>>>> }
>>>>>
>>>>> Is it possible to add code like this as another private method of
>>>>> vector (under some ifdef)?
>>>>>
>>>>> --kcc
>>>>>
>>>>>
>>>>> On Tue, Nov 19, 2013 at 1:57 PM, Kostya Serebryany <kcc at google.com>wrote:
>>>>>
>>>>>>
>>>>>>> > Do you like the idea?
>>>>>>>
>>>>>>> Yes - very much so.
>>>>>>>
>>>>>>> > Any comments about the prototype patch?
>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>> 1) I would implement it with inline functions other than macros.
>>>>>>>
>>>>>>> Like this:
>>>>>>>
>>>>>>>
>>>>>>> extern "C" void __sanitizer_annotate_contiguous_container(void *,
>>>>>>> void *, void *, void *);
>>>>>>> void inline __annotate_contiguous container
>>>>>>>         ( const void *__beg, const void *__end, const void
>>>>>>> *__old_mid, const void *new_mid )
>>>>>>> #if __has_feature(address_sanitizer)
>>>>>>>         {  __sanitizer_annotate_contiguous_container(beg, end,
>>>>>>> old_mid, new_mid); }
>>>>>>> #else
>>>>>>>         {}
>>>>>>> #endif
>>>>>>>
>>>>>>
>>>>>> Makes sense, see below.
>>>>>> We''l probably want to move __annotate_contiguous container into a
>>>>>> separate file so that both string and vector use it.
>>>>>> But let's do the whole thing with vector before moving to string.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> 2) I'd like more information about what
>>>>>>> __sanitizer_annotate_contiguous_container does ;-)
>>>>>>>
>>>>>>
>>>>>> In short, it [un]poisons the parts of shadow memory that correspond
>>>>>> to the container's memory.
>>>>>> More details in compiler-rt files:
>>>>>>  include/sanitizer/common_interface_defs.h and lib/asan/asan_poisoning.cc
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> > Any suggestion how to test these annotations in libc++ (we need
>>>>>>> some kind of DEATH tests)?
>>>>>>>
>>>>>>> We already have failing tests in libc++.
>>>>>>> they are named XXXX.fail.cpp.
>>>>>>>
>>>>>>
>>>>>> These are tests that fail to compile.
>>>>>> for asan annotations we need tests that build and run, and then
>>>>>> either fail at run-time or check the state of shadow.
>>>>>> Like these:
>>>>>> compiler-rt/lib/asan/lit_tests/TestCases/contiguous_container.cc
>>>>>> compiler-rt/lib/asan/lit_tests/TestCases/contiguous_container_crash.cc
>>>>>>
>>>>>> Suggestions?
>>>>>>
>>>>>> --kcc
>>>>>>
>>>>>> --- include/vector      (revision 195116)
>>>>>> +++ include/vector      (working copy)
>>>>>> @@ -288,6 +288,16 @@
>>>>>>
>>>>>>  _LIBCPP_BEGIN_NAMESPACE_STD
>>>>>>
>>>>>> +extern "C" void __sanitizer_annotate_contiguous_container(
>>>>>> +  const void *, const void *, const void *, const void *);
>>>>>> +void inline __annotate_contiguous_container
>>>>>> +    (const void *__beg, const void *__end, const void *__old_mid,
>>>>>> const void *__new_mid)
>>>>>> +#if __has_feature(address_sanitizer)
>>>>>> +    {  __sanitizer_annotate_contiguous_container(__beg, __end,
>>>>>> __old_mid, __new_mid); }
>>>>>> +#else
>>>>>> +    {}
>>>>>> +#endif
>>>>>> +
>>>>>>  template <bool>
>>>>>>  class __vector_base_common
>>>>>>  {
>>>>>> @@ -1525,7 +1535,12 @@
>>>>>>      // __v.push_back(_VSTD::forward<_Up>(__x));
>>>>>>      __alloc_traits::construct(__a,
>>>>>> _VSTD::__to_raw_pointer(__v.__end_), _VSTD::forward<_Up>(__x));
>>>>>>      __v.__end_++;
>>>>>> +    if (data())
>>>>>> +      __annotate_contiguous_container(
>>>>>>  +          data(), data() + capacity(), data() + size(), data() +
>>>>>> capacity());
>>>>>>      __swap_out_circular_buffer(__v);
>>>>>> +    __annotate_contiguous_container(data(), data() + capacity(),
>>>>>> +                                          data() + capacity(),
>>>>>> data() + size());
>>>>>>  }
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>> <vector_asan.patch>
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140221/91a1e825/attachment.html>


More information about the cfe-dev mailing list