[cfe-dev] Reporting UBSan issues at compile-time

Wed Mar 22 18:52:48 PDT 2017

Hi,

I've performed some experiments with reporting UBSan diagnostics at
compile-time and think that this is a useful thing to do. I'd like to discuss
the motivation, the approach I took, and some results.

=== Motivation ===

We're interested in fixing UB in our projects and use UBSan to do this.
However, we have lots of software that is easy to build but hard to run, or
hard to test with adequate code coverage (e.g firmware). This limits the amount
of bugs we can catch with UBSan.

It would be nice if we could report UB at compile-time without false positives.
We wouldn't be able to report everything a runtime tool could, but we would be
able to report a large number of real bugs very quickly, just by rebuilding all
our software with a flag enabled.

=== Approach ===

I wrote a simple analysis which detects UB statically by piggybacking off UBSan.
It's actually able to issue decent diagnostics. It only issues a diagnostic if
it finds a call to a UBSan diagnostic handler which post-dominates the function
entry block.

The idea is: if a function unconditionally exhibits UB when called, it's worth
reporting the UB at compile-time.

Here is a full example. This C program has UB because it returns a null pointer
when it shouldn't:

  ```
  __attribute__((returns_nonnull)) int *returns_nonnull(int *p) {
    return p; // Bug: null pointer returned here.
  }

  int main() {
    returns_nonnull((int *)0LL);
    return 0;
  }
  ```

With UBSan enabled, here's the IR we get:

  ```
  define nonnull i32* @returns_nonnull(i32* %p) #0 {
  entry:
    ...
    %1 = icmp ne i32* %p, null, !nosanitize !2
    br i1 %1, label %cont, label %handler.nonnull_return

  handler.nonnull_return:
    call void @__ubsan_handle_nonnull_return(...), !nosanitize !2
    br label %cont, !nosanitize !2

  cont:
    ret i32* %p
  }

  define i32 @main() #0 {
  entry:
    ...
    %call = call nonnull i32* @returns_nonnull(i32* null)
    ret i32 0
  }
  ```

At -O2, LLVM inlines @returns_nonnull and throws away the null check:

  ```
  define i32 @main() local_unnamed_addr #0 {
  entry:
    tail call void @__ubsan_handle_nonnull_return(...), !nosanitize !2
    ret i32 0
  }
  ```

The call to UBSan's diagnostic handler post-dominates the function entry block,
so we report it right away:

  $ clang -fsanitize=undefined -O2 -Xclang -enable-llvm-linter buggy.c
  Undefined behavior: invalid null return value (buggy.c:3:1)

=== Results ===

I packaged up my analysis into LLVM's Lint pass and added a clang option to
enable linting. The initial patch is up for review:

  https://reviews.llvm.org/D30949 - Add an option to enable LLVM IR linting

I built a few internal projects with UBSan, optimizations, and linting enabled.
This exposed real bugs. The only problem was that I got reports about UB in
dead code. Maybe this can be addressed by setting up sanitizer blacklists?

=== Alternatives? ===

We could try implementing something like the STACK UB checker:

  https://people.csail.mit.edu/nickolai/papers/wang-stack-tocs.pdf

I haven't compared my approach vs. STACK in terms of bug-finding efficacy. The
latter does seem harder to implement.

I'm interested in hearing what others think.

thanks,
vedant