[llvm-dev] RFC: Extend UBSan with qsort checks

Paul Pluzhnikov via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 13 16:43:21 PST 2016


On Wed, Jan 13, 2016 at 4:17 PM, Kostya Serebryany <kcc at google.com> wrote:
> Inviting Paul to the party (he wrote the libstdc++ sort checker).
>
> On Tue, Jan 12, 2016 at 11:09 PM, Yury Gribov <y.gribov at samsung.com> wrote:
>>
>> On 01/13/2016 10:08 AM, Yury Gribov wrote:
>>>
>>> On 01/13/2016 09:57 AM, Kostya Serebryany wrote:
>>>>
>>>> On Tue, Jan 12, 2016 at 10:28 PM, Yury Gribov <y.gribov at samsung.com>
>>>> wrote:
>>>>
>>>>> On 01/13/2016 03:10 AM, Kostya Serebryany wrote:
>>>>>
>>>>>> FTR, here is one way to implement this in the library:
>>>>>>
>>>>>>
>>>>>> https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h
>>>>>>
>>>>>> Search for "check sort predicate for strict weak ordering"
>>>>>>
>>>>>
>>>>> Nice, although this wouldn't catch violations of transitivity (which is
>>>>> probably the most important type of bug).
>>>>>
>>>>>
>>>> I thought it would...
>>>
>>>
>>> But debug operator() only knows about 2 inputs whereas for transitivity
>>> you'll need to consider at least 3.

As I privately replied to Kostya, I was only willing to add constant overhead.

SortChecker adds N*N overhead:
https://github.com/yugr/sortcheck/blob/master/src/sortchecker.c#L427
except he caps it at N=32 on line 413 (and thus will not detect all
violations either).
Adding up to 1024 comparisons to large sort()s maybe isn't so bad.

>>>> Can you give a test?
>>
>>
>> Here's an example of transitivity violation in GCC:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68988

Here is one from GDB:
https://sourceware.org/bugzilla/show_bug.cgi?id=19361


>>
>>>>
>>>>>
>>>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov via llvm-dev <
>>>>>>
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>> (+correct cfe-dev list)
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Yuri,
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> UndefinedBehaviorSanitizer currently does not check for undefined
>>>>>>>>> behaviors which result from improper usage of standard library
>>>>>>>>> functions.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> It's not really an undefined behavior per se: you just violate the
>>>>>>>> contract provided by library functions (at least I haven't found a
>>>>>>>> clause
>>>>>>>> in C++ standard
>>>>>>>> that specifies that the behavior of smth. like std::sort() with
>>>>>>>> invalid
>>>>>>>> operator< is invalid). The actual behavior of these functions is
>>>>>>>> determined by library authors, and can be anything from UB to
>>>>>>>> unspecified
>>>>>>>> result, to nicely formatted error report (if you're using a debug
>>>>>>>> version
>>>>>>>> of system library):
>>>>>>>> I think I actually encountered all these cases.
>>>>>>>>
>>>>>>>> That said, the benefit of checking the sorting routines is clear:
>>>>>>>> I'm
>>>>>>>> not
>>>>>>>> surprised you were able to catch numerous bugs with your tool.
>>>>>>>>
>>>>>>>> UBSan currently doesn't have interceptors, and all of its checks
>>>>>>>> [1] are
>>>>>>>> designed to work without runtime library support, so that it can be
>>>>>>>> used as
>>>>>>>> a mitigation
>>>>>>>> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan
>>>>>>>> interceptors at this point: this would limit the tool portability
>>>>>>>> (at
>>>>>>>> least
>>>>>>>> portability of some functionality),
>>>>>>>> and complicate interoperation of UBSan with another tools. There
>>>>>>>> is an
>>>>>>>> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and
>>>>>>>> check
>>>>>>>> arguments there.
>>>>>>>> This functionality can be controlled by a runtime flag.
>>>>>>>>
>>>>>>>> Why do you need compiler instrumentation? Do you want to
>>>>>>>> automatically
>>>>>>>> inject hooks to SortChecker into std::sort() functions, so that you
>>>>>>>> don't
>>>>>>>> need to annotate C++ standard library code?
>>>>>>>> We submitted some annotations to libc++ code (e.g. to report
>>>>>>>> containter
>>>>>>>> overflow bugs in sanitizers).
>>>>>>>>
>>>>>>>> [1] -fsanitize=vptr is an only notable exception
>>>>>>>>
>>>>>>>> One notorious instance of such errors is invalid usage of qsort or
>>>>>>>>
>>>>>>>>> bsearch routines (or std::sort and friends in case of C++):
>>>>>>>>> * using comparison function that violates ordering axioms
>>>>>>>>> (reflexivity,
>>>>>>>>> symmetry, transitivity)
>>>>>>>>> * returning unstable results from comparison function
>>>>>>>>> * passing unsorted array to bsearch
>>>>>>>>> * etc.
>>>>>>>>>
>>>>>>>>> Errors like this will usually result in slightly invalid output
>>>>>>>>> (not
>>>>>>>>> fully sorted arrays, rare failed bsearches, etc.) but may as well
>>>>>>>>> cause
>>>>>>>>> aborts on some systems (
>>>>>>>>> https://bugzilla.samba.org/show_bug.cgi?id=3959
>>>>>>>>> ).
>>>>>>>>>
>>>>>>>>> I've recently developed a simple proof-of-concept tool SortChecker
>>>>>>>>> (
>>>>>>>>> https://github.com/yugr/sortcheck) which intercepts calls to
>>>>>>>>> qsort and
>>>>>>>>> friends (via LD_PRELOAD) and performs various sanity checks before
>>>>>>>>> passing
>>>>>>>>> control to libc e.g.
>>>>>>>>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements
>>>>>>>>> * etc.
>>>>>>>>>
>>>>>>>>> Results were quite inspiring: I've found several errors in popular
>>>>>>>>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also
>>>>>>>>> pretty
>>>>>>>>> sure most C++ developers have failed to write correct comparison
>>>>>>>>> function
>>>>>>>>> at least once.
>>>>>>>>>
>>>>>>>>> Could SortChecker functionality be considered for integration into
>>>>>>>>> UBSan? Extending SortChecker to C++ land would require
>>>>>>>>> significant work
>>>>>>>>> which has already been done in sanitizers (compiler
>>>>>>>>> instrumentation,
>>>>>>>>> portable parallel code, etc.). On the other hand, ability to
>>>>>>>>> detect new
>>>>>>>>> classes of bugs should benefit UBSan and community.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -Y
>>>>>>>>> _______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Alexey Samsonov
>>>>>>>> vonosmas at gmail.com
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Alexey Samsonov
>>>>>>> vonosmas at gmail.com
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



-- 
Paul Pluzhnikov


More information about the llvm-dev mailing list