[UBSan] Proposed Sphinx documentation page for UndefinedBehaviorSanitizer

Sean Silva chisophugis at gmail.com
Thu Dec 18 20:44:36 PST 2014


On Wed, Dec 17, 2014 at 6:05 PM, Morrison, Michael <
Michael_Morrison at playstation.sony.com> wrote:
>
>  Hi All –
>
>
>
> I’ve created a first draft of a standalone Sphinx documentation page for
> the UndefinedBehaviorSanitizer (modelled after the ASan page [
> http://clang.llvm.org/docs/AddressSanitizer.html]).  I envision that the
> information below would largely replace the many links to the
> http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
> section of the Clang User’s Manual.
>
>
>
> Please review the page below; you can send comments to me directly.  In
> particular, I’m not sure if the information in the “Supported Platforms”,
> “Limitations”, and “Current Status” sections is correct.  Thanks in advance
> for your review.
>
>
>
>
>
> ==========================
>
> UndefinedBehaviorSanitizer
>
> ==========================
>
>
>
> .. contents::
>
>    :local:
>
>
>
> Introduction
>
> ============
>
>
>
> "Undefined behavior" is a concept known especially in the C and C++
> languages.
>
> Undefined behavior means that the semantics of a certain operation is
> undefined
>
> by the language. For example, using a non-static variable before it has
> been
>
> initialized leads to undefined behavior.
>
>
>
> A program that can execute undefined behavior is not well-formed; thus, a
>
> compiler assumes that undefined behavior cannot happen. It uses this
> assumption
>
> to drive certain kinds of optimizations. When it encounters undefined
> behavior,
>
> a compiler is free to act in any manner it chooses, either as an explicit
>
> choice or as natural fallout from some algorithm.
>
>
>
> Note that undefined behavior is different from “implementation-defined
> behavior.”
>
> For implementation-defined behavior, the program is well-formed and the
> compiler
>
> must select a specific behavior and document the choices that it makes.
>
>
>
> UndefinedBehaviorSanitizer (UBSan) is a fast undefined-behavior detector
>
> implemented in Clang and Compiler-rt. It consists of a compiler
>
> instrumentation module and a run-time library. The tool can detect a
>
> number of types of bugs, for example:
>
>
>
> * Use of a misaligned pointer or a null pointer.
>
> * Load of a ``bool`` value which is neither ``true`` nor ``false``.
>
> * Conversion to, from, or between floating-point types which would
>
>   overflow the destination.
>
> * Floating point or integer division by zero.
>
> * Signed or unsigned integer overflow.
>
>
>
> The UndefinedBehaviorSanitizer has a small runtime cost and no impact on
>
> address-space layout or ABI.
>

Currently this whole introduction section has a totally different vibe from
the rest of the sanitizer docs, and hence seems sort of out of place. I'd
like to handle the addition of this sort of "marketing"/"here's the
user-facing version of why you should be interested in this" content in a
separate patch. A short, succinct introduction like the ones in the other
sanitizer docs is what would fit in best right now.

Currently the root clang docs page is filling up with various sanitizer
docs with no clear indication of how they're related or why you should use
them. I have been wanting for a while now to break out a Sanitizers.rst
page targeted at compiler end-users which gives an overview of the
sanitizers. That page would be a natural place for this "marketing"/"here's
the user-facing version of why you should be interested in this" content
(feel free to submit a patch adding it! It's low on my priority list right
now). One thing to be resolved with this sort of content is that a lot of
good content already exists, like
https://code.google.com/p/thread-sanitizer/wiki/CppManual, and we will need
to decide how to integrate/link to it (thankfully we have easy contact with
the people responsible for those pages to help figure out the best
solution).



>
>
> How to build
>
> ============
>
>
>
> Follow the `clang build instructions <../get_started.html>`_. CMake build
> is
>
> supported.
>
>
>
> Usage
>
> =====
>
>
>
> Compile and link your program with the
> ``-f[no-]sanitize=check1,check2,...``
>
> flag. To link to the appropriate runtime library, you must also provide
> the
>
> ``-fsanitize=`` argument. When using ``-fsanitize=vptr`` (or a group that
>
> includes it, such as ``-fsanitize=undefined``) with a C++ program, the
> link
>
> must be performed by ``clang++``, not ``clang``, in order to link against
>
> the C++-specific parts of the runtime library.
>

I would mention the basic usage of -fsanitize=undefined first and foremost.
That you can fine-tune it is a "detail" that should occupy very little of
the reader's time; it is enough to make them aware that it is possible and
point them to more info.
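
For what it's worth, here is a rough sketch of the kind of minimal
basic-usage example I have in mind. The file name is hypothetical; the
function and the flags are the ones from your draft, shown in the
compile-then-link form.

```cpp
// example.cc (hypothetical file name)
//
// Compile with the basic flag, then link the rest of the program with the
// same flag so the runtime library is pulled in:
//   % clang++ -O1 -g -fsanitize=undefined -c example.cc
#include <cstdint>

// Well-defined for most inputs; undefined if b == 0.
int32_t unsafe_div_int32_t(int32_t a, int32_t b) {
    return a / b;
}
```

Calling this with a zero divisor in an instrumented build is what the
division check is meant to diagnose at runtime.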


>
>
> You cannot combine more than one of the ``-fsanitize=address``,
>
> ``-fsanitize=thread``, and ``-fsanitize=memory`` checkers in the same
> program.
>

This is not directly ubsan related so I would omit it.


>  The ``-fsanitize=undefined`` checks can be combined with other
> sanitizers.
>
>
>
> .. code-block:: console
>
>
>
>     % cat example_UnsafeDivision.cc
>
>     #include <cstdint>
>
>     int32_t unsafe_div_int32_t (int32_t a, int32_t b) {
>
>         return a / b;     // undefined if b == 0 (or a == INT32_MIN and b == -1)
>
>     }
>
>
>
>     # Compile and link
>
>     % clang -O1 -g -fsanitize=undefined example_UnsafeDivision.cc
>
>
>
> or:
>

Having both these examples seems redundant. Probably just the second one is
fine since it is likely to match the user's workflow better.


>
>
> .. code-block:: console
>
>
>
>     # Compile
>
>     % clang -O1 -g -fsanitize=undefined -c example_UnsafeDivision.cc
>
>     # Link
>
>     % clang++ -g -fsanitize=undefined example_UnsafeDivision.o
>
>
>
> By default, after a sanitizer diagnoses an issue, it will attempt to
> continue
>
> executing the program if there is a reasonable behavior it can give to the
>
> faulting operation. You can include the ``-fno-sanitize-recover`` flag to
> cause
>
> the program to abort instead.
>
>
>
> You can also provide the ``-fsanitize-undefined-trap-on-error`` flag to
> cause
>
> traps to be emitted rather than calls to runtime libraries when a problem
> is
>
> detected. This option is intended for use in cases where the sanitizer
> runtime
>
> cannot be used (for example, when building libc or a kernel module). This
> option
>
> is only compatible with the sanitizers in the ``undefined-trap`` group.
>
>
>
> Undefined Behavior Checks
>
> -------------------------
>
> Clang provides several ways to check for undefined behavior.
>

This sentence seems incredibly redundant. Also, the title of this section
should be something like "command line options".


>
>
> **-f[no-]sanitize=check1,check2,...**
>
>    Turn on runtime checks for various forms of undefined or suspicious
>
>    behavior.
>
>
>
>    This option controls whether Clang adds runtime checks for various
>
>    forms of undefined or suspicious behavior, and is disabled by
>
>    default. If a check fails, a diagnostic message is produced at
>
>    runtime explaining the problem. The main checks are:
>
>
>
>    -  .. _opt_fsanitize_undefined:
>
>
>
>       ``-fsanitize=undefined``: Fast and compatible undefined behavior
>
>       checker. Enables the undefined behavior checks that have small
>
>       runtime cost and no impact on address space layout or ABI. This
>
>       includes all of the checks listed below other than
>
>       ``unsigned-integer-overflow``.
>
>
>
>    -  ``-fsanitize=undefined-trap``: This includes all sanitizers
>
>       included by ``-fsanitize=undefined``, except those that require
>
>       runtime support. This group of sanitizers is intended to be
>
>       used in conjunction with the ``-fsanitize-undefined-trap-on-error``
>
>       flag. This includes all of the checks listed below other than
>
>       ``unsigned-integer-overflow`` and ``vptr``.
>
>
>
>    The following more fine-grained checks are also available:
>
>
>
>    -  ``-fsanitize=alignment``: Use of a misaligned pointer or creation
>
>       of a misaligned reference.
>
>    -  ``-fsanitize=bool``: Load of a ``bool`` value which is neither
>
>       ``true`` nor ``false``.
>
>    -  ``-fsanitize=bounds``: Out of bounds array indexing, in cases
>
>       where the array bound can be statically determined.
>
>    -  ``-fsanitize=enum``: Load of a value of an enumerated type which
>
>       is not in the range of representable values for that enumerated
>
>       type.
>
>    -  ``-fsanitize=float-cast-overflow``: Conversion to, from, or
>
>       between floating-point types which would overflow the
>
>       destination.
>
>    -  ``-fsanitize=float-divide-by-zero``: Floating point division by
>
>       zero.
>
>    -  ``-fsanitize=function``: Indirect call of a function through a
>
>       function pointer of the wrong type (Linux, C++ and x86/x86_64 only).
>
>    -  ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
>
>    -  ``-fsanitize=nonnull-attribute``: Passing null pointer as a function
>
>       parameter which is declared to never be null.
>
>    -  ``-fsanitize=null``: Use of a null pointer or creation of a null
>
>       reference.
>
>    -  ``-fsanitize=object-size``: An attempt to use bytes which the
>
>       optimizer can determine are not part of the object being
>
>       accessed. The sizes of objects are determined using
>
>       ``__builtin_object_size``, and consequently may be able to detect
>
>       more problems at higher optimization levels.
>
>    -  ``-fsanitize=return``: In C++, reaching the end of a
>
>       value-returning function without returning a value.
>
>    -  ``-fsanitize=returns-nonnull-attribute``: Returning null pointer
>
>       from a function which is declared to never return null.
>
>    -  ``-fsanitize=shift``: Shift operators where the amount shifted is
>
>       greater or equal to the promoted bit-width of the left-hand side
>
>       or less than zero, or where the left-hand side is negative. For a
>
>       signed left shift, also checks for signed overflow in C, and for
>
>       unsigned overflow in C++.
>
>    -  ``-fsanitize=signed-integer-overflow``: Signed integer overflow,
>
>       including all the checks added by ``-ftrapv``, and checking for
>
>       overflow in signed division (``INT_MIN / -1``).
>
>    -  ``-fsanitize=unreachable``: If control flow reaches
>
>       ``__builtin_unreachable``.
>
>    -  ``-fsanitize=unsigned-integer-overflow``: Unsigned integer
>
>       overflows.
>
>    -  ``-fsanitize=vla-bound``: A variable-length array whose bound
>
>       does not evaluate to a positive value.
>
>    -  ``-fsanitize=vptr``: Use of an object whose vptr indicates that
>
>       it is of the wrong dynamic type, or that its lifetime has not
>
>       begun or has ended. Incompatible with ``-fno-rtti``.
>

Maybe it would be better to add an anchor on the list in UsersManual.rst
and link to it.
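
Either way, a concrete illustration might make the list easier to digest.
As a rough sketch (the helper names are hypothetical, and this is not how
UBSan implements the checks), the conditions that
``integer-divide-by-zero``, the division case of
``signed-integer-overflow``, and the shift-amount part of ``shift`` look
for can be written as plain predicates:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helpers that mirror the conditions described in the list.
// They are illustrations only, not UBSan's implementation.

// True when a / b is undefined: division by zero, or the signed-division
// overflow case INT_MIN / -1.
constexpr bool div_is_undefined(int32_t a, int32_t b) {
    return b == 0 ||
           (a == std::numeric_limits<int32_t>::min() && b == -1);
}

// True when the shift amount is less than zero, or greater than or equal
// to the promoted bit-width of the left-hand side (assumed to be 32 here).
constexpr bool shift_amount_is_invalid(int amount, int promoted_bits = 32) {
    return amount < 0 || amount >= promoted_bits;
}

// Performs the division only when it is well-defined; returns false and
// leaves `out` untouched otherwise.
bool checked_div(int32_t a, int32_t b, int32_t &out) {
    if (div_is_undefined(a, b)) {
        return false;
    }
    out = a / b;
    return true;
}
```

The sanitizer reports these cases at runtime without source changes; the
predicates only spell out what is being checked.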


>
>
> Blacklist
>
> ---------
>
>
>
> UndefinedBehaviorSanitizer supports entity types defined in
>
> :doc:`SanitizerSpecialCaseList` that can be used to suppress error reports
> in the
>
> specified source files or functions. Use
> ``-fsanitize-blacklist=/path/to/blacklist/file``
>
> to disable or modify sanitizer checks for objects listed in the file. You
> can also use
>
> ``-fno-sanitize-blacklist`` to not use a blacklist file if it was
> specified earlier
>
> in the command line.
>

Everything but the first sentence here seems redundant.


>
>
> Supported Platforms
>
> ===================
>
>
>
> UndefinedBehaviorSanitizer is supported on
>
>
>
> * Linux i386/x86\_64 (tested on Ubuntu 12.04)
>
> * MacOS 10.6 - 10.9 (i386/x86\_64)
>
> * Android ARM
>
>
>
> Ports to various other platforms are in progress.
>
>
>
> Limitations
>
> ===========
>
>
>
> * Static linking is not supported.
>
>
>
> Current Status
>
> ==============
>
>
>
> UndefinedBehaviorSanitizer is fully functional on supported platforms
> starting from LLVM
>
> 3.3. The test suite is integrated into the CMake build and can be run with
> the ``make check-ubsan`` command.
>
>
>
> More Information
>
> ================
>
>
>
> See `What Every C Programmer Should Know About Undefined Behavior
>
> <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_
> from
>
> the LLVM Project Blog and `A Guide to Undefined Behavior in C and C++
>
> <http://blog.regehr.org/archives/213>`_ from John Regehr's *Embedded in
>
> Academia* blog for an introduction to Undefined Behavior in C and C++.
>
>
>
> `http://www.chromium.org/developers/testing/undefinedbehaviorsanitizer <
> http://www.chromium.org/developers/testing/undefinedbehaviorsanitizer>`_
>
>
>
> Cheers,
>
> Michael
>
> △○×□    お疲れ様です
>
>
>