[clang] 3dd1d88 - [Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs (#107332)

via cfe-commits cfe-commits at lists.llvm.org
Sun Nov 3 23:57:51 PST 2024


Author: Justin Stitt
Date: 2024-11-03T23:57:47-08:00
New Revision: 3dd1d888fb18b40859504e207e57abdc6c43d838

URL: https://github.com/llvm/llvm-project/commit/3dd1d888fb18b40859504e207e57abdc6c43d838
DIFF: https://github.com/llvm/llvm-project/commit/3dd1d888fb18b40859504e207e57abdc6c43d838.diff

LOG: [Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs (#107332)

[Related
RFC](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683/5?u=justinstitt)

### Summary

Implement type-based filtering via [Sanitizer Special Case
Lists](https://clang.llvm.org/docs/SanitizerSpecialCaseList.html) for
the arithmetic overflow and truncation sanitizers.

Currently, using the `type:` prefix with these sanitizers does nothing.
I've hooked up the SSCL parsing with Clang codegen so that we don't emit
the overflow/truncation checks if the arithmetic contains an ignored
type.

### Usefulness

You can craft ignorelists that ignore specific types that are expected
to overflow or wrap-around. For example, to ignore `my_type` from
`unsigned-integer-overflow` instrumentation:
```bash
$ cat ignorelist.txt
[unsigned-integer-overflow]
type:my_type=no_sanitize

$ cat foo.c
typedef unsigned long my_type;
void foo() {
  my_type a = ULONG_MAX;
  ++a;
}

$ clang foo.c -fsanitize=unsigned-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out
// --> no sanitizer error
```

If a type is functionally intended to overflow, like
[refcount_t](https://kernsec.org/wiki/index.php/Kernel_Protections/refcount_t)
and its associated APIs in the Linux kernel, then this type filtering
would prove useful for reducing sanitizer noise. Currently, the Linux
kernel dealt with this by
[littering](https://elixir.bootlin.com/linux/v6.10.8/source/include/linux/refcount.h#L139
) `__attribute__((no_sanitize("signed-integer-overflow")))` annotations
on all the `refcount_t` APIs. I think this serves as an example of how a
codebase could be made cleaner. We could make custom types that are
filtered out in an ignorelist, allowing for types to be more expressive
-- without the need for annotations. This accomplishes a similar goal to
https://github.com/llvm/llvm-project/pull/86618.


Yet another use case for this type filtering is whitelisting. We could
ignore _all_ types, save a few.

```bash
$ cat ignorelist.txt
[implicit-signed-integer-truncation]
type:*=no_sanitize # ignore literally all types
type:short=sanitize # except `short`

$ cat bar.c
// compile with -fsanitize=implicit-signed-integer-truncation
void bar(int toobig) {
  char a = toobig;  // not instrumented
  short b = toobig; // instrumented
}
```

### Other ways to accomplish the goal of sanitizer
allowlisting/whitelisting
* ignore list SSCL type support (this PR that you're reading)
* [my sanitize-allowlist
branch](https://github.com/llvm/llvm-project/compare/main...JustinStitt:llvm-project:sanitize-allowlist)
- this just implements a sibling flag `-fsanitize-allowlist=`, removing
some of the double negative logic present with `skip`/`ignore` when
trying to whitelist something.
* [Glob
Negation](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683)
- Implement a negation operator to the GlobPattern class so the
ignorelist query can use them to simulate allowlisting


Please let me know which of the three options we like best. They are not
necessarily mutually exclusive.

Here's [another related
PR](https://github.com/llvm/llvm-project/pull/86618) which implements a
`wraps` attribute. This can accomplish a similar goal to this PR but
requires in-source changes to codebases and also covers a wider variety
of integer definedness problems.

### CCs
@kees @vitalybuka @bwendling

---------

Signed-off-by: Justin Stitt <justinstitt at google.com>

Added: 
    clang/test/CodeGen/ubsan-type-ignorelist-category-2.test
    clang/test/CodeGen/ubsan-type-ignorelist-category.test

Modified: 
    clang/docs/ReleaseNotes.rst
    clang/docs/SanitizerSpecialCaseList.rst
    clang/include/clang/AST/ASTContext.h
    clang/lib/AST/ASTContext.cpp
    clang/lib/CodeGen/CGExprScalar.cpp

Removed: 
    


################################################################################
diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1372e49dfac03c..a8e2e5c1e045ce 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -871,6 +871,14 @@ Sanitizers
   This new flag should allow those projects to enable integer sanitizers with
   less noise.
 
+- Arithmetic overflow sanitizers ``-fsanitize=signed-integer-overflow`` and
+  ``-fsanitize=unsigned-integer-overflow`` as well as the implicit integer
+  truncation sanitizers ``-fsanitize=implicit-signed-integer-truncation`` and
+  ``-fsanitize=implicit-unsigned-integer-truncation`` now properly support the
+  "type" prefix within `Sanitizer Special Case Lists (SSCL)
+  <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html>`_. See that link
+  for examples.
+
 Python Binding Changes
 ----------------------
 - Fixed an issue that led to crashes when calling ``Type.get_exception_specification_kind``.

diff  --git a/clang/docs/SanitizerSpecialCaseList.rst b/clang/docs/SanitizerSpecialCaseList.rst
index c7fb0fa3f8a828..96a7b2fba4ae43 100644
--- a/clang/docs/SanitizerSpecialCaseList.rst
+++ b/clang/docs/SanitizerSpecialCaseList.rst
@@ -15,9 +15,9 @@ file at compile-time.
 Goal and usage
 ==============
 
-Users of sanitizer tools, such as :doc:`AddressSanitizer`, :doc:`ThreadSanitizer`
-or :doc:`MemorySanitizer` may want to disable or alter some checks for
-certain source-level entities to:
+Users of sanitizer tools, such as :doc:`AddressSanitizer`,
+:doc:`ThreadSanitizer`, :doc:`MemorySanitizer` or :doc:`UndefinedBehaviorSanitizer`
+may want to disable or alter some checks for certain source-level entities to:
 
 * speedup hot function, which is known to be correct;
 * ignore a function that does some low-level magic (e.g. walks through the
@@ -48,6 +48,61 @@ Example
   $ clang -fsanitize=address -fsanitize-ignorelist=ignorelist.txt foo.c ; ./a.out
   # No error report here.
 
+Usage with UndefinedBehaviorSanitizer
+=====================================
+
+The arithmetic overflow sanitizers ``unsigned-integer-overflow`` and
+``signed-integer-overflow`` as well as the implicit integer truncation
+sanitizers ``implicit-signed-integer-truncation`` and
+``implicit-unsigned-integer-truncation`` support the ability to adjust
+instrumentation based on type.
+
+By default, supported sanitizers will have their instrumentation disabled for
+types specified within an ignorelist.
+
+.. code-block:: bash
+
+  $ cat foo.c
+  void foo() {
+    int a = 2147483647; // INT_MAX
+    ++a;                // Normally, an overflow with -fsanitize=signed-integer-overflow
+  }
+  $ cat ignorelist.txt
+  [signed-integer-overflow]
+  type:int
+  $ clang -fsanitize=signed-integer-overflow -fsanitize-ignorelist=ignorelist.txt foo.c ; ./a.out
+  # no signed-integer-overflow error
+
+For example, supplying the above ``ignorelist.txt`` to
+``-fsanitize-ignorelist=ignorelist.txt`` disables overflow sanitizer
+instrumentation for arithmetic operations containing values of type ``int``.
+
+The ``=sanitize`` category is also supported. Any types assigned to the
+``sanitize`` category will have their sanitizer instrumentation remain. If the
+same type appears within or across ignorelists with 
diff erent categories the
+``sanitize`` category takes precedence -- regardless of order.
+
+With this, one may disable instrumentation for some or all types and
+specifically allow instrumentation for one or many types -- including types
+created via ``typedef``. This is a way to achieve a sort of "allowlist" for
+supported sanitizers.
+
+.. code-block:: bash
+
+  $ cat ignorelist.txt
+  [implicit-signed-integer-truncation]
+  type:*
+  type:T=sanitize
+
+  $ cat foo.c
+  typedef char T;
+  typedef char U;
+  void foo(int toobig) {
+    T a = toobig;    // instrumented
+    U b = toobig;    // not instrumented
+    char c = toobig; // also not instrumented
+  }
+
 Format
 ======
 

diff  --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 4c1455a3e1bbf2..fb0c051dc19182 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -838,6 +838,9 @@ class ASTContext : public RefCountedBase<ASTContext> {
 
   const NoSanitizeList &getNoSanitizeList() const { return *NoSanitizeL; }
 
+  bool isTypeIgnoredBySanitizer(const SanitizerMask &Mask,
+                                const QualType &Ty) const;
+
   const XRayFunctionFilter &getXRayFilter() const {
     return *XRayFilter;
   }

diff  --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index d248084666d1b0..11e79d296cbec3 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -831,6 +831,15 @@ ASTContext::getCanonicalTemplateTemplateParmDecl(
   return CanonTTP;
 }
 
+/// Check if a type can have its sanitizer instrumentation elided based on its
+/// presence within an ignorelist.
+bool ASTContext::isTypeIgnoredBySanitizer(const SanitizerMask &Mask,
+                                          const QualType &Ty) const {
+  std::string TyName = Ty.getUnqualifiedType().getAsString(getPrintingPolicy());
+  return NoSanitizeL->containsType(Mask, TyName) &&
+         !NoSanitizeL->containsType(Mask, TyName, "sanitize");
+}
+
 TargetCXXABI::Kind ASTContext::getCXXABIKind() const {
   auto Kind = getTargetInfo().getCXXABI().getKind();
   return getLangOpts().CXXABI.value_or(Kind);

diff  --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 9e2c2ad5e0250e..287d911e10ba58 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -197,6 +197,18 @@ static bool CanElideOverflowCheck(const ASTContext &Ctx, const BinOpInfo &Op) {
   if (!Op.mayHaveIntegerOverflow())
     return true;
 
+  if (Op.Ty->isSignedIntegerType() &&
+      Ctx.isTypeIgnoredBySanitizer(SanitizerKind::SignedIntegerOverflow,
+                                   Op.Ty)) {
+    return true;
+  }
+
+  if (Op.Ty->isUnsignedIntegerType() &&
+      Ctx.isTypeIgnoredBySanitizer(SanitizerKind::UnsignedIntegerOverflow,
+                                   Op.Ty)) {
+    return true;
+  }
+
   const UnaryOperator *UO = dyn_cast<UnaryOperator>(Op.E);
 
   if (UO && UO->getOpcode() == UO_Minus &&
@@ -1125,6 +1137,10 @@ void ScalarExprEmitter::EmitIntegerTruncationCheck(Value *Src, QualType SrcType,
   if (!CGF.SanOpts.has(Check.second.second))
     return;
 
+  // Does some SSCL ignore this type?
+  if (CGF.getContext().isTypeIgnoredBySanitizer(Check.second.second, DstType))
+    return;
+
   llvm::Constant *StaticArgs[] = {
       CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
       CGF.EmitCheckTypeDescriptor(DstType),
@@ -1235,6 +1251,15 @@ void ScalarExprEmitter::EmitIntegerSignChangeCheck(Value *Src, QualType SrcType,
     // Because here sign change check is interchangeable with truncation check.
     return;
   }
+  // Does an SSCL have an entry for the DstType under its respective sanitizer
+  // section?
+  if (DstSigned && CGF.getContext().isTypeIgnoredBySanitizer(
+                       SanitizerKind::ImplicitSignedIntegerTruncation, DstType))
+    return;
+  if (!DstSigned &&
+      CGF.getContext().isTypeIgnoredBySanitizer(
+          SanitizerKind::ImplicitUnsignedIntegerTruncation, DstType))
+    return;
   // That's it. We can't rule out any more cases with the data we have.
 
   CodeGenFunction::SanitizerScope SanScope(&CGF);
@@ -2784,10 +2809,11 @@ llvm::Value *ScalarExprEmitter::EmitIncDecConsiderOverflowBehavior(
       return Builder.CreateNSWAdd(InVal, Amount, Name);
     [[fallthrough]];
   case LangOptions::SOB_Trapping:
-    if (!E->canOverflow())
+    BinOpInfo Info = createBinOpInfoFromIncDec(
+        E, InVal, IsInc, E->getFPFeaturesInEffect(CGF.getLangOpts()));
+    if (!E->canOverflow() || CanElideOverflowCheck(CGF.getContext(), Info))
       return Builder.CreateNSWAdd(InVal, Amount, Name);
-    return EmitOverflowCheckedBinOp(createBinOpInfoFromIncDec(
-        E, InVal, IsInc, E->getFPFeaturesInEffect(CGF.getLangOpts())));
+    return EmitOverflowCheckedBinOp(Info);
   }
   llvm_unreachable("Unknown SignedOverflowBehaviorTy");
 }
@@ -2990,7 +3016,9 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
       value = EmitIncDecConsiderOverflowBehavior(E, value, isInc);
     } else if (E->canOverflow() && type->isUnsignedIntegerType() &&
                CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow) &&
-               !excludeOverflowPattern) {
+               !excludeOverflowPattern &&
+               !CGF.getContext().isTypeIgnoredBySanitizer(
+                   SanitizerKind::UnsignedIntegerOverflow, E->getType())) {
       value = EmitOverflowCheckedBinOp(createBinOpInfoFromIncDec(
           E, value, isInc, E->getFPFeaturesInEffect(CGF.getLangOpts())));
     } else {

diff  --git a/clang/test/CodeGen/ubsan-type-ignorelist-category-2.test b/clang/test/CodeGen/ubsan-type-ignorelist-category-2.test
new file mode 100644
index 00000000000000..4b4f87326dbe59
--- /dev/null
+++ b/clang/test/CodeGen/ubsan-type-ignorelist-category-2.test
@@ -0,0 +1,58 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/order-0.ignorelist -emit-llvm %t/test.c -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/order-1.ignorelist -emit-llvm %t/test.c -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/order-2.ignorelist -emit-llvm %t/test.c -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/order-3.ignorelist -emit-llvm %t/test.c -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/order-4.ignorelist -emit-llvm %t/test.c -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/order-5.ignorelist -emit-llvm %t/test.c -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/order-6.ignorelist -emit-llvm %t/test.c -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/order-7.ignorelist -emit-llvm %t/test.c -o - | FileCheck %s
+
+// The same type can appear multiple times within an ignorelist. This is a test
+// to make sure "=sanitize" has priority regardless of the order in which
+// duplicate type entries appear. This is a precautionary measure; we would
+// much rather eagerly sanitize than silently forgo sanitization.
+
+//--- order-0.ignorelist
+type:int
+type:int=sanitize
+
+//--- order-1.ignorelist
+type:int=sanitize
+type:int
+
+//--- order-2.ignorelist
+type:in*
+type:int=sanitize
+
+//--- order-3.ignorelist
+type:in*=sanitize
+type:int
+
+//--- order-4.ignorelist
+type:int
+type:in*=sanitize
+
+//--- order-5.ignorelist
+type:int=sanitize
+type:in*
+
+//--- order-6.ignorelist
+type:int=sanitize
+type:in*
+
+//--- order-7.ignorelist
+type:int
+type:int=sanitize
+
+
+
+
+//--- test.c
+// CHECK-LABEL: @test
+void test(int A) {
+// CHECK: @llvm.sadd.with.overflow.i32
+  ++A;
+}

diff  --git a/clang/test/CodeGen/ubsan-type-ignorelist-category.test b/clang/test/CodeGen/ubsan-type-ignorelist-category.test
new file mode 100644
index 00000000000000..500c58f2165c43
--- /dev/null
+++ b/clang/test/CodeGen/ubsan-type-ignorelist-category.test
@@ -0,0 +1,98 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// Verify ubsan doesn't emit checks for ignorelisted types
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/int.ignorelist -emit-llvm %t/test.cpp -o - | FileCheck %s --check-prefix=INT
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/nosection.ignorelist -emit-llvm %t/test.cpp -o - | FileCheck %s --check-prefix=INT
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/nosan-same-as-no-category.ignorelist -emit-llvm %t/test.cpp -o - | FileCheck %s --check-prefix=INT
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t/myty.ignorelist -emit-llvm %t/test.cpp -o - | FileCheck %s --check-prefix=MYTY
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=implicit-signed-integer-truncation,implicit-unsigned-integer-truncation -fsanitize-ignorelist=%t/trunc.ignorelist -emit-llvm %t/test.cpp -o - | FileCheck %s --check-prefix=TRUNC
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=implicit-signed-integer-truncation,implicit-unsigned-integer-truncation -fsanitize-ignorelist=%t/docs.ignorelist -emit-llvm %t/test.cpp -o - | FileCheck %s --check-prefix=TRUNC2
+
+//--- int.ignorelist
+[{unsigned-integer-overflow,signed-integer-overflow}]
+type:int
+
+//--- nosection.ignorelist
+type:int
+
+//--- nosan-same-as-no-category.ignorelist
+type:int
+
+//--- myty.ignorelist
+[{unsigned-integer-overflow,signed-integer-overflow}]
+type:*
+type:myty=sanitize
+
+//--- trunc.ignorelist
+[{implicit-signed-integer-truncation,implicit-unsigned-integer-truncation}]
+type:char
+type:unsigned char
+
+//--- docs.ignorelist
+[implicit-signed-integer-truncation]
+type:*
+type:T=sanitize
+
+//--- test.cpp
+// INT-LABEL: ignore_int
+void ignore_int(int A, int B, unsigned C, unsigned D, long E) {
+// INT: llvm.uadd.with.overflow.i32
+  (void)(C+D);
+// INT-NOT: llvm.sadd.with.overflow.i32
+  (void)(A+B);
+// INT: llvm.sadd.with.overflow.i64
+  (void)(++E);
+}
+
+
+typedef unsigned long myty;
+typedef myty derivative;
+// INT-LABEL: ignore_all_except_myty
+// MYTY-LABEL: ignore_all_except_myty
+void ignore_all_except_myty(myty A, myty B, int C, unsigned D, derivative E) {
+// MYTY-NOT: llvm.sadd.with.overflow.i32
+  (void)(++C);
+
+// MYTY-NOT: llvm.uadd.with.overflow.i32
+  (void)(D+D);
+
+// MYTY-NOT: llvm.umul.with.overflow.i64
+  (void)(E*2);
+
+// MYTY: llvm.uadd.with.overflow.i64
+  (void)(A+B);
+}
+
+// INT-LABEL: truncation
+// MYTY-LABEL: truncation
+// TRUNC-LABEL: truncation
+void truncation(char A, int B, unsigned char C, short D) {
+// TRUNC-NOT: %handler.implicit_conversion
+  A = B;
+// TRUNC-NOT: %handler.implicit_conversion
+  A = C;
+// TRUNC-NOT: %handler.implicit_conversion
+  C = B;
+
+// TRUNC: %handler.implicit_conversion
+  D = B;
+
+  (void)A;
+  (void)D;
+}
+
+
+// Matches the example from clang/docs/SanitizerSpecialCaseList.rst
+typedef char T;
+typedef char U;
+// TRUNC2-LABEL: docs_example
+void docs_example(int toobig) {
+// TRUNC2: %handler.implicit_conversion
+  T a = toobig;
+// TRUNC2-NOT: %handler.implicit_conversion
+  U b = toobig;
+// TRUNC2-NOT: %handler.implicit_conversion
+  char c = toobig;
+}
+


        


More information about the cfe-commits mailing list