[llvm-dev] Clang -O0 performs optimizations that undermine dynamic bug-finding tools

Manuel Rigger via llvm-dev llvm-dev at lists.llvm.org
Tue Mar 21 05:46:50 PDT 2017


Hi everyone,

I found that Clang -O0 performs optimizations that undermine dynamic
bug-finding tools.

First, most bug-finding tools such as ASan, Valgrind, Dr. Memory, Mudflap,
Purify and Safe Sulong (on which I am working) rely on detecting errors
during the execution of the program. They insert additional checks, either
at compile time or at run time, which are executed when the program
is running. To find errors with these tools, it is necessary that these
errors stay in the program and are not optimized away.

I think it is widely known that bugs are sometimes optimized away when
compiling with optimizations turned on (-O1, -O2, -O3), and that there is a
consensus that this is legitimate. However, I found that bugs are also
optimized away when compiling with -O0. For example, I recently opened a bug
report on the LLVM sanitizers GitHub space [1] to describe a case where ASan
did not find an out-of-bounds access (see below).

int count[7] = {0, 0, 0, 0, 0, 0, 0};

int main(int argc, char** args) {
    return count[7];
}

Note that Clang printed a warning and then optimized the invalid access
away (which is legitimate since it is UB). However, cases exist
where no warning is printed. For example, consider the following program:

#include <ctype.h>

int main() {
    isalnum(1000000);
    isalpha(1000000);
    iscntrl(1000000);
    isdigit(1000000);
    isgraph(1000000);
    islower(1000000);
    isprint(1000000);
    ispunct(1000000);
    isspace(1000000);
    isupper(1000000);
    isxdigit(1000000);
}

The glibc (on my system) implements these macros by calling
__ctype_b_loc() [2], which gives access to a lookup array that can be
indexed by values between -128 and 255. Thus, I expected that, when
compiling with -O0, the calls above would result in out-of-bounds accesses
that (at least in theory) could be detected by bug-finding tools. However,
Clang optimizes the calls away, so bug-finding tools have no chance to find
the out-of-bounds accesses. Note that in this example no warning is printed.
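To make the expected out-of-bounds access concrete, here is a minimal
model of such a table lookup. It is only a sketch: table_storage,
init_table, and checked_lookup are hypothetical names, the table contents
are made up, and this is not glibc's actual implementation. It only
mirrors the index range of -128 to 255 described above.

```c
/* Hypothetical model of a ctype-style lookup table; not glibc's code. */
enum { TABLE_MIN = -128, TABLE_MAX = 255, IS_DIGIT_BIT = 1 };

/* Backing storage for all 384 valid indices (-128..255). */
static unsigned short table_storage[TABLE_MAX - TABLE_MIN + 1];

/* Points at the element for index 0, so table[-128]..table[255] are valid. */
static const unsigned short *const table = table_storage + 128;

/* Fill in made-up classification data: mark only '0'..'9' as digits. */
void init_table(void) {
    for (int c = '0'; c <= '9'; c++) {
        table_storage[c - TABLE_MIN] = IS_DIGIT_BIT;
    }
}

/* Returns 1 if c can index the table without leaving its bounds. */
int index_in_range(int c) {
    return c >= TABLE_MIN && c <= TABLE_MAX;
}

/* Checked lookup: returns -1 instead of reading out of bounds. */
int checked_lookup(int c) {
    if (!index_in_range(c)) {
        return -1;
    }
    return table[c];
}
```

In this model, an argument such as 1000000 is far outside the valid index
range, so an unchecked table[c] would read well past the end of the
storage. This is the kind of access a dynamic tool could only report if
the load is actually emitted.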

I think the calls are removed since __ctype_b_loc() has an __attribute__
((__const__)). When the attribute is used, Clang -O0 also removes calls in
other instances, for example in the function below. Using pure instead of
const as an attribute has the same effect.

#include <stdio.h>

int arr[10];

void test() __attribute__ ((__const__));

void test(int index) {
    printf("%d\n", arr[index]);
}

int main() {
    test(10000);
}
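As a possible way to keep such a lookup in the generated code, a test case
could store the result in a volatile object, since a volatile store is a
side effect that the compiler has to emit. This is a sketch under the
assumption that the calls disappear because their results are unused; I
have not verified it against every Clang version. The names sink and
classify_alnum are hypothetical.

```c
#include <ctype.h>

/* Hypothetical sink: a volatile store has to be performed, so the value
 * being stored has to be computed first. */
volatile int sink;

/* Stores the result of isalnum() so that it is no longer an unused
 * expression. Returns 1 if c is alphanumeric, 0 otherwise. */
int classify_alnum(int c) {
    sink = isalnum(c);
    return sink != 0;
}
```

Calling classify_alnum with a valid argument such as 'a' behaves like a
plain isalnum check. Calling it with 1000000 remains undefined behavior,
so there is still no guarantee about what the compiler emits.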

I have not yet found further cases, but it unsettles me that even
when compiling with -O0, Clang optimizes away bugs that then can no longer
be found by dynamic bug-finding tools. The cases that I presented exhibit
undefined behavior. However, following the "principle of least
astonishment", I think that such errors should be compiled in a way that
still lets bug-finding tools detect them.

Based on this, I have the following questions/suggestions:
- Is it known that Clang performs optimizations that hide program bugs,
even when compiling with -O0?
- Are there command-line options to specify that no optimizations should be
performed? Until recently, I thought that -O0 had this effect.
- In any case, I would propose not performing optimizations at -O0, to
allow dynamic bug-finding tools to find such bugs, or at least offering a
flag to turn off optimizations altogether.


[1] https://github.com/google/sanitizers/issues/773
[2] https://refspecs.linuxfoundation.org/LSB_2.0.1/LSB-Core/LSB-Core/baselib---ctype-b-loc.html