[cfe-dev] Reporting UBSan issues at compile-time

Thu Mar 23 13:02:58 PDT 2017

Responses inline --

> Justin:
> 
> So does this only emit the warning if -fsanitize=undefined is set? How
> much overhead would be implied by doing an analysis like this without
> actually emitting the ubsan diagnostic handler?

Yes, you'd need to have -fsanitize=something set. IMO this is nice because it generalizes to other/future sanitizers, and it's totally opt-in. You wouldn't generate diagnostics for issues you aren't interested in.

> Even if the answer to that is "a lot" or "enough to be contentious",
> what about enabling this warning by default when building with
> -fsanitize=undefined? It sounds like the compile time overhead when
> already building with sanitizers would be fairly insignificant.

I think it'd at least be contentious to the analysis without -fsanitize=undefined set. We'd need to change a lot of code in llvm to do the reporting, and it could have compile-time costs.

I'd be in favor of enabling the warning when building with -fsanitize=undefined.

> Assuming turning this on by default with ubsan is too much, it seems to
> me that this overlaps quite a bit with the static analyzer and/or
> clang-tidy - integrating it into one of those might make more sense than
> adding yet another option to -cc1.

Do you mean that the checks should operate on the clang CFG, or that the driver option to enable diagnostics from the middle-end should be a static analyzer driver option?

> dblaikie:
> 
> FWIW - Clang is fairly allergic to emitting diagnostics based on optimization because they tend to present usability problems. They can appear/disappear due to seemingly unrelated changes in the code (that trigger or hinder optimizations that cause the diagnostic path to be hit).

What if we advertise this as a new clang warning: "-Woptimizer". We'd be up front about it only reporting issues the optimizer actually runs in to, and that it would be dependent on the optzn level.

> Usually the idea is to implement these sort of bug finding techniques in Clang's static analyzer. So perhaps there would be a way to feed UBSan's facts/checks into the static analyzer in a more consistent way (I'm sure some of the same checks are implemented there already - but generalizing/unifying UBSan's checks to feed into the static analyzer could be handy).

One roadblock with doing the checks on the clang CFG is that we'd miss issues that occur after LTO. This is an important use-case.

I think an approach where we have one simple LLVM analysis passing diagnostics up to clang is actually a more unified solution. We wouldn't have to reimplement the bug checking on the clang CFG, we'd get feature parity between the static and runtime checks without writing extra code, and the diagnostics would work in all scenarios the optimizer is used in.

> Anna:
> 
> We already have several tools that detect errors, mainly UB, statically. Depending on the type of analysis you choose to write and false positive tolerance I’d suggest extending one of these:
>  - compiler warnings
>  - clang-tidy
>  - clang static analyzer
> 
> I believe it will be trivial to teach the clang static analyzer to catch the bug in the example you provide below.

I have the same concerns about this. I'm not sure how we'd get the clang static analyzer to "see" issues exposed by LTO. I'm also concerned about this being a much larger undertaking, with potentially higher false-positive rates, and without bug-checking parity with UBSan.

> Reid:
> 
> Checking for ubsan handlers that post-dominate a function entry block seems like a weak heuristic. If you put that code inside an if, or if main is inlined into a block that doesn't post-dominate the entry of the caller, we won't warn on it.
> 
> What if we had a way of tagging certain branches as "warn if this branch is optimized to true", and then we had hooks from branch simplification utilities to emit warnings? That seems like it might find a lot more bugs, but it could have false positives in dead code. Do you think that could work?

You're right, this is a weak heuristic. UBSan-inserted branches are marked "!nosanitize", we could probably create diagnostics when we throw such branches away.

I don't have a good solution yet for the dead code issue except to: maintain blacklists, or special-case patterns which lead to dead code (in my case, NSRequestConcreteImplementation).

Thank you all for your comments.

vedant