[llvm-dev] RFC: Extend UBSan with qsort checks
    Yuri Gribov via llvm-dev 
    llvm-dev at lists.llvm.org
       
    Wed Jan 13 22:37:49 PST 2016
    
    
  
On Thu, Jan 14, 2016 at 3:43 AM, Paul Pluzhnikov <ppluzhnikov at google.com> wrote:
> On Wed, Jan 13, 2016 at 4:17 PM, Kostya Serebryany <kcc at google.com> wrote:
>> Inviting Paul to the party (he wrote the libstdc++ sort checker).
>>
>> On Tue, Jan 12, 2016 at 11:09 PM, Yury Gribov <y.gribov at samsung.com> wrote:
>>>
>>> On 01/13/2016 10:08 AM, Yury Gribov wrote:
>>>>
>>>> On 01/13/2016 09:57 AM, Kostya Serebryany wrote:
>>>>>
>>>>> On Tue, Jan 12, 2016 at 10:28 PM, Yury Gribov <y.gribov at samsung.com>
>>>>> wrote:
>>>>>
>>>>>> On 01/13/2016 03:10 AM, Kostya Serebryany wrote:
>>>>>>
>>>>>>> FTR, here is one way to implement this in the library:
>>>>>>>
>>>>>>>
>>>>>>> https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h
>>>>>>>
>>>>>>> Search for "check sort predicate for strict weak ordering"
>>>>>>>
>>>>>>
>>>>>> Nice, although this wouldn't catch violations of transitivity (which is
>>>>>> probably the most important type of bug).
>>>>>>
>>>>>>
>>>>> I thought it would...
>>>>
>>>>
>>>> But debug operator() only knows about 2 inputs whereas for transitivity
>>>> you'll need to consider at least 3.
>
> As I privately replied to Kostya, I was only willing to add constant overhead.
Sure, there are various conflicting tradeoffs. SortChecker started as
an experiment so I was as agressive as possible.
> SortChecker adds N*N overhead:
> https://github.com/yugr/sortcheck/blob/master/src/sortchecker.c#L427
> except he caps it at N=32 on line 413
I'm afraid it's even N^3 (with N=32 cap). This indeed sounds scary but
I have not seen significant slowdowns when running instrumented
distro.
> (and thus will not detect all
> violations either).
Right, I thought about improving this with testing random 32 elements
instead of the first ones.
> Adding up to 1024 comparisons to large sort()s maybe isn't so bad.
>
>>>>> Can you give a test?
>>>
>>>
>>> Here's an example of transitivity violation in GCC:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68988
>
> Here is one from GDB:
> https://sourceware.org/bugzilla/show_bug.cgi?id=19361
Yeah but this one is still in discussion so I didn't bother to mention it.
>>>>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov via llvm-dev <
>>>>>>>
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>> (+correct cfe-dev list)
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Yuri,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev <
>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> UndefinedBehaviorSanitizer currently does not check for undefined
>>>>>>>>>> behaviors which result from improper usage of standard library
>>>>>>>>>> functions.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> It's not really an undefined behavior per se: you just violate the
>>>>>>>>> contract provided by library functions (at least I haven't found a
>>>>>>>>> clause
>>>>>>>>> in C++ standard
>>>>>>>>> that specifies that the behavior of smth. like std::sort() with
>>>>>>>>> invalid
>>>>>>>>> operator< is invalid). The actual behavior of these functions is
>>>>>>>>> determined by library authors, and can be anything from UB to
>>>>>>>>> unspecified
>>>>>>>>> result, to nicely formatted error report (if you're using a debug
>>>>>>>>> version
>>>>>>>>> of system library):
>>>>>>>>> I think I actually encountered all these cases.
>>>>>>>>>
>>>>>>>>> That said, the benefit of checking the sorting routines is clear:
>>>>>>>>> I'm
>>>>>>>>> not
>>>>>>>>> surprised you were able to catch numerous bugs with your tool.
>>>>>>>>>
>>>>>>>>> UBSan currently doesn't have interceptors, and all of its checks
>>>>>>>>> [1] are
>>>>>>>>> designed to work without runtime library support, so that it can be
>>>>>>>>> used as
>>>>>>>>> a mitigation
>>>>>>>>> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan
>>>>>>>>> interceptors at this point: this would limit the tool portability
>>>>>>>>> (at
>>>>>>>>> least
>>>>>>>>> portability of some functionality),
>>>>>>>>> and complicate interoperation of UBSan with another tools. There
>>>>>>>>> is an
>>>>>>>>> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and
>>>>>>>>> check
>>>>>>>>> arguments there.
>>>>>>>>> This functionality can be controlled by a runtime flag.
>>>>>>>>>
>>>>>>>>> Why do you need compiler instrumentation? Do you want to
>>>>>>>>> automatically
>>>>>>>>> inject hooks to SortChecker into std::sort() functions, so that you
>>>>>>>>> don't
>>>>>>>>> need to annotate C++ standard library code?
>>>>>>>>> We submitted some annotations to libc++ code (e.g. to report
>>>>>>>>> containter
>>>>>>>>> overflow bugs in sanitizers).
>>>>>>>>>
>>>>>>>>> [1] -fsanitize=vptr is an only notable exception
>>>>>>>>>
>>>>>>>>> One notorious instance of such errors is invalid usage of qsort or
>>>>>>>>>
>>>>>>>>>> bsearch routines (or std::sort and friends in case of C++):
>>>>>>>>>> * using comparison function that violates ordering axioms
>>>>>>>>>> (reflexivity,
>>>>>>>>>> symmetry, transitivity)
>>>>>>>>>> * returning unstable results from comparison function
>>>>>>>>>> * passing unsorted array to bsearch
>>>>>>>>>> * etc.
>>>>>>>>>>
>>>>>>>>>> Errors like this will usually result in slightly invalid output
>>>>>>>>>> (not
>>>>>>>>>> fully sorted arrays, rare failed bsearches, etc.) but may as well
>>>>>>>>>> cause
>>>>>>>>>> aborts on some systems (
>>>>>>>>>> https://bugzilla.samba.org/show_bug.cgi?id=3959
>>>>>>>>>> ).
>>>>>>>>>>
>>>>>>>>>> I've recently developed a simple proof-of-concept tool SortChecker
>>>>>>>>>> (
>>>>>>>>>> https://github.com/yugr/sortcheck) which intercepts calls to
>>>>>>>>>> qsort and
>>>>>>>>>> friends (via LD_PRELOAD) and performs various sanity checks before
>>>>>>>>>> passing
>>>>>>>>>> control to libc e.g.
>>>>>>>>>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements
>>>>>>>>>> * etc.
>>>>>>>>>>
>>>>>>>>>> Results were quite inspiring: I've found several errors in popular
>>>>>>>>>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also
>>>>>>>>>> pretty
>>>>>>>>>> sure most C++ developers have failed to write correct comparison
>>>>>>>>>> function
>>>>>>>>>> at least once.
>>>>>>>>>>
>>>>>>>>>> Could SortChecker functionality be considered for integration into
>>>>>>>>>> UBSan? Extending SortChecker to C++ land would require
>>>>>>>>>> significant work
>>>>>>>>>> which has already been done in sanitizers (compiler
>>>>>>>>>> instrumentation,
>>>>>>>>>> portable parallel code, etc.). On the other hand, ability to
>>>>>>>>>> detect new
>>>>>>>>>> classes of bugs should benefit UBSan and community.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -Y
>>>>>>>>>> _______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Alexey Samsonov
>>>>>>>>> vonosmas at gmail.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Alexey Samsonov
>>>>>>>> vonosmas at gmail.com
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
> --
> Paul Pluzhnikov
    
    
More information about the llvm-dev
mailing list