r188403 - Add support for -fsanitize-blacklist and default blacklists for DFSan.
Peter Collingbourne
peter at pcc.me.uk
Wed Aug 14 11:54:18 PDT 2013
Author: pcc
Date: Wed Aug 14 13:54:18 2013
New Revision: 188403
URL: http://llvm.org/viewvc/llvm-project?rev=188403&view=rev
Log:
Add support for -fsanitize-blacklist and default blacklists for DFSan.
Also add some documentation.
Differential Revision: http://llvm-reviews.chandlerc.com/D1346
Modified:
cfe/trunk/docs/DataFlowSanitizer.rst
cfe/trunk/docs/DataFlowSanitizerDesign.rst
cfe/trunk/docs/index.rst
cfe/trunk/lib/CodeGen/BackendUtil.cpp
cfe/trunk/lib/Driver/SanitizerArgs.cpp
Modified: cfe/trunk/docs/DataFlowSanitizer.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/DataFlowSanitizer.rst?rev=188403&r1=188402&r2=188403&view=diff
==============================================================================
--- cfe/trunk/docs/DataFlowSanitizer.rst (original)
+++ cfe/trunk/docs/DataFlowSanitizer.rst Wed Aug 14 13:54:18 2013
@@ -2,6 +2,11 @@
DataFlowSanitizer
=================
+.. toctree::
+ :hidden:
+
+ DataFlowSanitizerDesign
+
.. contents::
:local:
@@ -28,6 +33,82 @@ The APIs are defined in the header file
For further information about each function, please refer to the header
file.
+ABI List
+--------
+
+DataFlowSanitizer uses a list of functions known as an ABI list to decide
+whether a call to a specific function should use the operating system's native
+ABI or whether it should use a variant of this ABI that also propagates labels
+through function parameters and return values. The ABI list file also controls
+how labels are propagated in the former case. DataFlowSanitizer comes with a
+default ABI list, which is intended to eventually cover the glibc library on
+Linux. Users may need to extend the ABI list when a particular library or
+function cannot be instrumented (e.g. because it is implemented in assembly
+or another language which DataFlowSanitizer does not support), or when a
+function is called from a library or function which cannot be
+instrumented.
+
+DataFlowSanitizer's ABI list file is a :doc:`SanitizerSpecialCaseList`.
+The pass treats every function in the ``uninstrumented`` category in the
+ABI list file as conforming to the native ABI. Unless the ABI list contains
+additional categories for those functions, a call to one of those functions
+will produce a warning message, as the labelling behavior of the function
+is unknown. The other supported categories are ``discard``, ``functional``
+and ``custom``.
+
+* ``discard`` -- To the extent that this function writes to (user-accessible)
+ memory, it also updates labels in shadow memory (this condition is trivially
+ satisfied for functions which do not write to user-accessible memory). Its
+ return value is unlabelled.
+* ``functional`` -- Like ``discard``, except that the label of its return value
+ is the union of the labels of its arguments.
+* ``custom`` -- Instead of calling the function, a custom wrapper ``__dfsw_F``
+ is called, where ``F`` is the name of the function. This function may wrap
+ the original function or provide its own implementation. This category is
+ generally used for uninstrumentable functions which write to user-accessible
+ memory or which have more complex label propagation behavior. The signature
+ of ``__dfsw_F`` is based on that of ``F``, with one label of type
+ ``dfsan_label`` per argument appended to the argument list. If ``F``
+ has a non-void return type, a final argument of type ``dfsan_label *``
+ is appended, to which the custom function can store the label for the
+ return value. For example:
+
+.. code-block:: c++
+
+ void f(int x);
+ void __dfsw_f(int x, dfsan_label x_label);
+
+ void *memcpy(void *dest, const void *src, size_t n);
+ void *__dfsw_memcpy(void *dest, const void *src, size_t n,
+ dfsan_label dest_label, dfsan_label src_label,
+ dfsan_label n_label, dfsan_label *ret_label);
+
+If a function defined in the translation unit being compiled belongs to the
+``uninstrumented`` category, it will be compiled so as to conform to the
+native ABI. Its arguments will be assumed to be unlabelled, but it will
+propagate labels in shadow memory.
+
+For example:
+
+.. code-block:: none
+
+ # main is called by the C runtime using the native ABI.
+ fun:main=uninstrumented
+ fun:main=discard
+
+ # malloc only writes to its internal data structures, not user-accessible memory.
+ fun:malloc=uninstrumented
+ fun:malloc=discard
+
+ # tolower is a pure function.
+ fun:tolower=uninstrumented
+ fun:tolower=functional
+
+ # memcpy needs to copy the shadow from the source to the destination region.
+ # This is done in a custom function.
+ fun:memcpy=uninstrumented
+ fun:memcpy=custom
+
Example
=======
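
To make the ``custom`` category in the hunk above concrete, here is a minimal sketch of a wrapper that follows the ``__dfsw_F`` signature convention. It is an illustration under stated assumptions, not code from this commit: ``pick_first`` is a hypothetical function, and the ``uint16_t`` typedef is a stand-in for the real ``dfsan_label`` type, which a real build would take from the DFSan interface header.

```cpp
#include <cstdint>

// Stand-in for the real dfsan_label type (an assumption for this sketch).
// 16 bits matches the i16 labels in the IR shown in the design document.
typedef uint16_t dfsan_label;

// Hypothetical uninstrumented function: returns its first argument unchanged.
extern "C" int pick_first(int a, int b) {
  (void)b;
  return a;
}

// Custom wrapper in the __dfsw_F shape: the original parameters, then one
// label per parameter, then a dfsan_label * for the return value label,
// because pick_first has a non-void return type.
extern "C" int __dfsw_pick_first(int a, int b, dfsan_label a_label,
                                 dfsan_label b_label, dfsan_label *ret_label) {
  (void)b_label;         // b never reaches the result, so its label is dropped
  *ret_label = a_label;  // the result carries exactly the label of a
  return pick_first(a, b);
}
```

Neither ``discard`` (return value unlabelled) nor ``functional`` (union of all argument labels) describes this propagation, which is the kind of case the ``custom`` category is meant for. The matching ABI list entries would presumably be ``fun:pick_first=uninstrumented`` and ``fun:pick_first=custom``.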
Modified: cfe/trunk/docs/DataFlowSanitizerDesign.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/DataFlowSanitizerDesign.rst?rev=188403&r1=188402&r2=188403&view=diff
==============================================================================
--- cfe/trunk/docs/DataFlowSanitizerDesign.rst (original)
+++ cfe/trunk/docs/DataFlowSanitizerDesign.rst Wed Aug 14 13:54:18 2013
@@ -140,3 +140,68 @@ associated directly with registers. Loa
all shadow labels corresponding to bytes loaded (which most of the
time will be short circuited by the initial comparison) and stores will
result in a copy of the label to the shadow of all bytes stored to.
+
+Propagating labels through arguments
+------------------------------------
+
+In order to propagate labels through function arguments and return values,
+DataFlowSanitizer changes the ABI of each function in the translation unit.
+There are currently two supported ABIs:
+
+* Args -- Argument and return value labels are passed through additional
+ arguments and by modifying the return type.
+
+* TLS -- Argument and return value labels are passed through TLS variables
+ ``__dfsan_arg_tls`` and ``__dfsan_retval_tls``.
+
+The main advantage of the TLS ABI is that it is more tolerant of ABI mismatches
+(TLS storage is not shared with any other form of storage, whereas extra
+arguments may be stored in registers which under the native ABI are not used
+for parameter passing and thus could contain arbitrary values). On the other
+hand, the args ABI is more efficient and allows ABI mismatches to be more easily
+identified by checking for nonzero labels in nominally unlabelled programs.
+
+Implementing the ABI list
+-------------------------
+
+The `ABI list <DataFlowSanitizer.html#abi-list>`_ provides a list of functions
+which conform to the native ABI, each of which is callable from an instrumented
+program. This is implemented by replacing each reference to a native ABI
+function with a reference to a function which uses the instrumented ABI.
+Such functions are automatically generated wrappers for the native functions.
+For example, given the ABI list example provided in the user manual, the
+following wrappers will be generated under the args ABI:
+
+.. code-block:: llvm
+
+ define linkonce_odr { i8*, i16 } @"dfsw$malloc"(i64 %0, i16 %1) {
+ entry:
+ %2 = call i8* @malloc(i64 %0)
+ %3 = insertvalue { i8*, i16 } undef, i8* %2, 0
+ %4 = insertvalue { i8*, i16 } %3, i16 0, 1
+ ret { i8*, i16 } %4
+ }
+
+ define linkonce_odr { i32, i16 } @"dfsw$tolower"(i32 %0, i16 %1) {
+ entry:
+ %2 = call i32 @tolower(i32 %0)
+ %3 = insertvalue { i32, i16 } undef, i32 %2, 0
+ %4 = insertvalue { i32, i16 } %3, i16 %1, 1
+ ret { i32, i16 } %4
+ }
+
+ define linkonce_odr { i8*, i16 } @"dfsw$memcpy"(i8* %0, i8* %1, i64 %2, i16 %3, i16 %4, i16 %5) {
+ entry:
+ %labelreturn = alloca i16
+ %6 = call i8* @__dfsw_memcpy(i8* %0, i8* %1, i64 %2, i16 %3, i16 %4, i16 %5, i16* %labelreturn)
+ %7 = load i16* %labelreturn
+ %8 = insertvalue { i8*, i16 } undef, i8* %6, 0
+ %9 = insertvalue { i8*, i16 } %8, i16 %7, 1
+ ret { i8*, i16 } %9
+ }
+
+As an optimization, direct calls to native ABI functions will call the
+native ABI function directly and the pass will compute the appropriate label
+internally. This has the advantage of reducing the number of union operations
+required when the return value label is known to be zero (i.e. ``discard``
+functions, or ``functional`` functions with known unlabelled arguments).
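
The generated IR wrappers above can be read as ordinary functions that return a value paired with a label. The following C++ sketch models the ``malloc`` (``discard``) and ``tolower`` (``functional``) wrappers under the args ABI. The names ``Labelled``, ``wrap_malloc`` and ``wrap_tolower`` are hypothetical, and the ``uint16_t`` typedef is a stand-in for ``dfsan_label``; the pass itself emits IR, not C++.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdlib>

typedef uint16_t dfsan_label;  // stand-in type; matches i16 in the IR above

// Models the { T, i16 } aggregate that a wrapper returns under the args ABI.
template <typename T>
struct Labelled {
  T value;
  dfsan_label label;
};

// discard: call the native function and return a zero label, whatever the
// argument label was.
Labelled<void *> wrap_malloc(std::size_t n, dfsan_label n_label) {
  (void)n_label;
  return {std::malloc(n), 0};
}

// functional: the return label is the union of the argument labels. With a
// single argument the union is just that argument's label, as in the IR.
Labelled<int> wrap_tolower(int c, dfsan_label c_label) {
  return {std::tolower(c), c_label};
}
```

A ``functional`` function with more than one argument would need a real label union, which this sketch deliberately leaves out.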
Modified: cfe/trunk/docs/index.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/index.rst?rev=188403&r1=188402&r2=188403&view=diff
==============================================================================
--- cfe/trunk/docs/index.rst (original)
+++ cfe/trunk/docs/index.rst Wed Aug 14 13:54:18 2013
@@ -21,6 +21,7 @@ Using Clang as a Compiler
AddressSanitizer
ThreadSanitizer
MemorySanitizer
+ DataFlowSanitizer
SanitizerSpecialCaseList
Modules
FAQ
Modified: cfe/trunk/lib/CodeGen/BackendUtil.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/BackendUtil.cpp?rev=188403&r1=188402&r2=188403&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/BackendUtil.cpp (original)
+++ cfe/trunk/lib/CodeGen/BackendUtil.cpp Wed Aug 14 13:54:18 2013
@@ -208,7 +208,10 @@ static void addThreadSanitizerPass(const
 static void addDataFlowSanitizerPass(const PassManagerBuilder &Builder,
                                      PassManagerBase &PM) {
-  PM.add(createDataFlowSanitizerPass());
+  const PassManagerBuilderWrapper &BuilderWrapper =
+      static_cast<const PassManagerBuilderWrapper&>(Builder);
+  const CodeGenOptions &CGOpts = BuilderWrapper.getCGOpts();
+  PM.add(createDataFlowSanitizerPass(CGOpts.SanitizerBlacklistFile));
 }
void EmitAssemblyHelper::CreatePasses(TargetMachine *TM) {
Modified: cfe/trunk/lib/Driver/SanitizerArgs.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/SanitizerArgs.cpp?rev=188403&r1=188402&r2=188403&view=diff
==============================================================================
--- cfe/trunk/lib/Driver/SanitizerArgs.cpp (original)
+++ cfe/trunk/lib/Driver/SanitizerArgs.cpp Wed Aug 14 13:54:18 2013
@@ -307,6 +307,9 @@ bool SanitizerArgs::getDefaultBlacklistF
     BlacklistFile = "msan_blacklist.txt";
   else if (Kind & NeedsTsanRt)
     BlacklistFile = "tsan_blacklist.txt";
+  else if (Kind & NeedsDfsanRt)
+    BlacklistFile = "dfsan_abilist.txt";
+
   if (BlacklistFile) {
     SmallString<64> Path(D.ResourceDir);
     llvm::sys::path::append(Path, BlacklistFile);
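
The selection logic in this hunk can be sketched in isolation as follows. This is a simplified model, not the driver code: the flag values and the function name ``defaultListPath`` are hypothetical, only the three branches visible in the hunk are modelled, the path is joined with a plain ``/`` rather than ``llvm::sys::path::append``, and whatever the real function does after building the path is omitted.

```cpp
#include <string>

// Hypothetical bit flags standing in for the driver's runtime-kind bits.
enum : unsigned {
  NeedsMsanRt = 1u << 0,
  NeedsTsanRt = 1u << 1,
  NeedsDfsanRt = 1u << 2
};

// Returns the default list file path under ResourceDir, or an empty string
// when none of the three modelled sanitizers is requested.
std::string defaultListPath(unsigned Kind, const std::string &ResourceDir) {
  const char *File = nullptr;
  if (Kind & NeedsMsanRt)
    File = "msan_blacklist.txt";
  else if (Kind & NeedsTsanRt)
    File = "tsan_blacklist.txt";
  else if (Kind & NeedsDfsanRt)
    File = "dfsan_abilist.txt";  // the branch added by this revision
  if (!File)
    return std::string();
  return ResourceDir + "/" + File;
}
```

Because the checks form an ``else if`` chain, the first matching kind wins, so DFSan's ABI list is chosen only when neither earlier branch applies.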