[llvm-dev] RFC: Extend UBSan with qsort checks

Yury Gribov via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 12 23:09:38 PST 2016


On 01/13/2016 10:08 AM, Yury Gribov wrote:
> On 01/13/2016 09:57 AM, Kostya Serebryany wrote:
>> On Tue, Jan 12, 2016 at 10:28 PM, Yury Gribov <y.gribov at samsung.com>
>> wrote:
>>
>>> On 01/13/2016 03:10 AM, Kostya Serebryany wrote:
>>>
>>>> FTR, here is one way to implement this in the library:
>>>>
>>>> https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h
>>>>
>>>> Search for "check sort predicate for strict weak ordering"
>>>>
>>>
>>> Nice, although this wouldn't catch violations of transitivity (which is
>>> probably the most important type of bug).
>>>
>>>
>> I thought it would...
>
> But debug operator() only knows about 2 inputs whereas for transitivity
> you'll need to consider at least 3.
>
>> Can you give a test?

Here's an example of transitivity violation in GCC: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68988

>>
>>>
>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> (+correct cfe-dev list)
>>>>>
>>>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi Yuri,
>>>>>>
>>>>>> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>>
>>>>>>> UndefinedBehaviorSanitizer currently does not check for undefined
>>>>>>> behaviors which result from improper usage of standard library
>>>>>>> functions.
>>>>>>>
>>>>>>>
>>>>>> It's not really an undefined behavior per se: you just violate the
>>>>>> contract provided by library functions (at least I haven't found a
>>>>>> clause
>>>>>> in C++ standard
>>>>>> that specifies that the behavior of smth. like std::sort() with
>>>>>> invalid
>>>>>> operator< is invalid). The actual behavior of these functions is
>>>>>> determined by library authors, and can be anything from UB to
>>>>>> unspecified
>>>>>> result, to nicely formatted error report (if you're using a debug
>>>>>> version
>>>>>> of system library):
>>>>>> I think I actually encountered all these cases.
>>>>>>
>>>>>> That said, the benefit of checking the sorting routines is clear: I'm
>>>>>> not
>>>>>> surprised you were able to catch numerous bugs with your tool.
>>>>>>
>>>>>> UBSan currently doesn't have interceptors, and all of its checks
>>>>>> [1] are
>>>>>> designed to work without runtime library support, so that it can be
>>>>>> used as
>>>>>> a mitigation
>>>>>> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan
>>>>>> interceptors at this point: this would limit the tool portability (at
>>>>>> least
>>>>>> portability of some functionality),
>>>>>> and complicate interoperation of UBSan with another tools. There
>>>>>> is an
>>>>>> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and
>>>>>> check
>>>>>> arguments there.
>>>>>> This functionality can be controlled by a runtime flag.
>>>>>>
>>>>>> Why do you need compiler instrumentation? Do you want to
>>>>>> automatically
>>>>>> inject hooks to SortChecker into std::sort() functions, so that you
>>>>>> don't
>>>>>> need to annotate C++ standard library code?
>>>>>> We submitted some annotations to libc++ code (e.g. to report
>>>>>> containter
>>>>>> overflow bugs in sanitizers).
>>>>>>
>>>>>> [1] -fsanitize=vptr is an only notable exception
>>>>>>
>>>>>> One notorious instance of such errors is invalid usage of qsort or
>>>>>>
>>>>>>> bsearch routines (or std::sort and friends in case of C++):
>>>>>>> * using comparison function that violates ordering axioms
>>>>>>> (reflexivity,
>>>>>>> symmetry, transitivity)
>>>>>>> * returning unstable results from comparison function
>>>>>>> * passing unsorted array to bsearch
>>>>>>> * etc.
>>>>>>>
>>>>>>> Errors like this will usually result in slightly invalid output (not
>>>>>>> fully sorted arrays, rare failed bsearches, etc.) but may as well
>>>>>>> cause
>>>>>>> aborts on some systems (
>>>>>>> https://bugzilla.samba.org/show_bug.cgi?id=3959
>>>>>>> ).
>>>>>>>
>>>>>>> I've recently developed a simple proof-of-concept tool SortChecker (
>>>>>>> https://github.com/yugr/sortcheck) which intercepts calls to
>>>>>>> qsort and
>>>>>>> friends (via LD_PRELOAD) and performs various sanity checks before
>>>>>>> passing
>>>>>>> control to libc e.g.
>>>>>>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements
>>>>>>> * etc.
>>>>>>>
>>>>>>> Results were quite inspiring: I've found several errors in popular
>>>>>>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also
>>>>>>> pretty
>>>>>>> sure most C++ developers have failed to write correct comparison
>>>>>>> function
>>>>>>> at least once.
>>>>>>>
>>>>>>> Could SortChecker functionality be considered for integration into
>>>>>>> UBSan? Extending SortChecker to C++ land would require
>>>>>>> significant work
>>>>>>> which has already been done in sanitizers (compiler instrumentation,
>>>>>>> portable parallel code, etc.). On the other hand, ability to
>>>>>>> detect new
>>>>>>> classes of bugs should benefit UBSan and community.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -Y
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Alexey Samsonov
>>>>>> vonosmas at gmail.com
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Alexey Samsonov
>>>>> vonosmas at gmail.com
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the llvm-dev mailing list