[cfe-dev] Clang's -Wsometimes-uninitialized flag implementation

Ali Shuja Siddiqui (alissidd) via cfe-dev cfe-dev at lists.llvm.org
Wed Jan 20 13:56:31 PST 2021


Thanks for the detailed response!

Best regards,
Ali

From: Richard Smith <richard at metafoo.co.uk>
Date: Wednesday, January 20, 2021 at 2:22 PM
To: Ali Shuja Siddiqui (alissidd) <alissidd at cisco.com>
Cc: cfe-dev at lists.llvm.org <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] Clang's -Wsometimes-uninitialized flag implementation
On Wed, 20 Jan 2021 at 09:14, Ali Shuja Siddiqui (alissidd) via cfe-dev <cfe-dev at lists.llvm.org> wrote:
Hello,
I am interested in learning about clang’s -Wsometimes-uninitialized implementation and how it differs from gcc’s -Wmaybe-uninitialized. For example, consider this code:

#include <stdlib.h>
int test(int random, int cond1, int cond2) {
  volatile int *ptr;
  if (cond1) { //<<<< cond1
    ptr = (int *)calloc(1, sizeof(int));
  }
  if (cond2) { //<<<< cond2 , independent of cond1
    *ptr = 1;
  }
  return 0;
}

With gcc 8.3.1, I get the following output:
$ gcc -g -c func.c -O1 -Wuninitialized
func.c: In function ‘test’:
func.c:8:10: warning: ‘ptr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     *ptr = 1;
(debug-llvm-11) /nobackup/alissidd/CSCvv46785$ gcc -O1 -Wuninitialized -Wmaybe-uninitialized -c func.c
func.c: In function ‘test’:
func.c:8:10: warning: ‘ptr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     *ptr = 1;
     ~~~~~^~~

However, with clang 11.0.0 I don’t get any warnings:
$ clang -O1 -Wuninitialized -Wsometimes-uninitialized -c func.c
$

It was detected by the static analyzer though:

$ scan-build clang -O1 -Wuninitialized -Wsometimes-uninitialized -c func.c
scan-build: Using 'clang-11' for static analysis
func.c:8:10: warning: Dereference of undefined pointer value [core.NullDereference]
    *ptr = 1;
    ~~~~~^~~
func.c:10:10: warning: Potential leak of memory pointed to by 'ptr' [unix.Malloc]
  return 0;
         ^
2 warnings generated.
scan-build: Analysis run complete.
scan-build: 2 bugs found.

I did see -Wsometimes-uninitialized fire when, for example, I added a use of the ptr variable outside of the condition:
#include <stdlib.h>
int test(int random, int cond1, int cond2) {
  volatile int *ptr;
  if (cond1) { //<<<< cond1
    ptr = (int *)calloc(1, sizeof(int));
  }
  if (cond2) { //<<<< cond2 , independent of cond1
    *ptr = 1;
  }
  return ptr;
}

$ clang -O1 -Wuninitialized -Wsometimes-uninitialized -c func.c
func.c:10:10: warning: incompatible pointer to integer conversion returning 'volatile int *' from a function with result type 'int' [-Wint-conversion]
  return ptr;
         ^~~
func.c:4:7: warning: variable 'ptr' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
  if (cond1) { //<<<< cond1
      ^~~~~
func.c:10:10: note: uninitialized use occurs here
  return ptr;
         ^~~
func.c:4:3: note: remove the 'if' if its condition is always true
  if (cond1) { //<<<< cond1
  ^~~~~~~~~~~
func.c:3:20: note: initialize the variable 'ptr' to silence this warning
  volatile int *ptr;
                   ^
                    = NULL
2 warnings generated.

I’m new to clang development and would appreciate help with the following questions:

  *   Are clang’s -Wsometimes-uninitialized and gcc’s -Wmaybe-uninitialized comparable?
No. GCC's -Wmaybe-uninitialized is analogous to Clang's -Wconditional-uninitialized. Both warnings diagnose cases where the compiler cannot prove that the variable is initialized at each use. (That is: both warnings aim to avoid false negatives, at the expense of having false positives in cases where the compiler couldn't prove whether the variable was initialized or not.) GCC's -Wmaybe-uninitialized is a check performed after some optimizations (so depends on optimization level) whereas Clang's -Wconditional-uninitialized is a check performed by a simplistic static analysis on the original unoptimized program, so they have different strengths and weaknesses -- Clang's warning is stable in the face of changes to the optimizer and optimization level, whereas GCC's warning can see through the things the optimizer can see through and can eliminate more false positives as a result. (Clang's warning also only works on scalar variables, whereas GCC's can reason about struct members.)


  *   As shown above, a use of an uninitialized variable inside a condition is not detected by -Wuninitialized/-Wsometimes-uninitialized. Is this behavior by design?
Yes. The -Wsometimes-uninitialized flag is intended to diagnose cases where there is definitely a bug -- where we can prove the variable can be used while uninitialized -- on the assumption that the program contains no dead code. In your first test case, we cannot be sure the program contains a bug: it could be the case that cond2 is only nonzero when cond1 is also nonzero, in which case the program is correct. But in the second test case, the program definitely contains a bug, assuming the `if (cond1)` is not redundant: if it's possible that that `if` branch is not taken (if there is no dead code), then there is a live execution path through the function that results in the variable being used uninitialized.
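To make the "cond2 is only nonzero when cond1 is also nonzero" case concrete, here is a minimal sketch. The names guarded_store and checked_store are hypothetical and not part of the original test case; a local int stands in for the calloc'd memory so the example has no allocation to clean up.

```c
/* Hypothetical helper with the same shape as the first test case:
   'p' is initialized only if cond1 is nonzero and used only if cond2 is. */
static int guarded_store(int cond1, int cond2) {
  int slot = 0;
  int *p;
  if (cond1) {
    p = &slot;
  }
  if (cond2) {
    *p = 1; /* correct only if cond2 implies cond1 */
  }
  return slot;
}

/* Hypothetical caller: cond2 is derived from cond1, so the path
   "cond1 false, cond2 true" is never executed and the program is correct. */
int checked_store(int want, int ready) {
  int cond1 = want;
  int cond2 = want && ready;
  return guarded_store(cond1, cond2);
}
```

Looking at guarded_store alone, an analysis cannot tell whether every caller behaves like checked_store, which is why a warning there could be a false positive.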

Hence the naming: "conditional-uninitialized" / "maybe-uninitialized" warnings appear where the compiler is not sure whether there is an uninitialized use. "sometimes-uninitialized" warnings appear when the compiler is sure that there's an uninitialized use for some real executions of the function. And "uninitialized" warnings appear when the compiler is sure that there's an uninitialized use on every execution of the function. (-Wuninitialized also enables -Wsometimes-uninitialized because in practice people use -Wuninitialized when they want no false positives, and -Wsometimes-uninitialized is intended to have no false positives.)
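As an illustration of the three tiers, here is a small sketch with one function per category. The function names are made up for this example, and the comments state which warning group each shape would be expected to fall under according to the description above, not captured compiler output.

```c
/* Expected under -Wuninitialized: 'x' is read uninitialized on every
   execution of the function. Never call this; the read is undefined. */
int always_uninit(void) {
  int x;
  return x;
}

/* Expected under -Wsometimes-uninitialized: if the 'if' is not redundant,
   the false branch is a live path that reaches the read of 'x'. */
int sometimes_uninit(int c) {
  int x;
  if (c) {
    x = 1;
  }
  return x;
}

/* Expected only under -Wconditional-uninitialized (or GCC's
   -Wmaybe-uninitialized): the read is a bug only if c2 can be nonzero
   while c1 is zero, which the compiler cannot determine. */
int maybe_uninit(int c1, int c2) {
  int x;
  if (c1) {
    x = 1;
  }
  if (c2) {
    return x;
  }
  return 0;
}
```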


  *   I’m trying to understand the source code of -Wuninitialized, and -Wsometimes-uninitialized in particular, and I was able to single out the files that may be used to implement it:

     *   clang/lib/Analysis/UninitializedValues.cpp
     *   clang/include/clang/Analysis/Analyses/UninitializedValues.h
     *   clang/lib/Sema/AnalysisBasedWarnings.cpp
Any pointers on how I should go about reading and understanding it would be very helpful.

The implementation is written following the traditional structure of a data-flow analysis (as an iterated application of a transfer function along a control flow graph), so reading up about data-flow analyses should give you a better idea of what it's doing and how.
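For readers unfamiliar with that structure, here is a toy sketch of an iterated transfer function over a control flow graph. It does not mirror the code in UninitializedValues.cpp: the hard-coded CFG, the two-bit state, and all names are assumptions made for illustration, and it tracks a single variable from the first test case.

```c
#include <stdbool.h>

/* Possible states of the variable; states are combined with | at joins. */
enum { UNINIT = 1, INIT = 2 };

#define NBLOCKS 5

/* Hypothetical CFG for the first test case:
   0: entry, declares ptr   -> 1, 2
   1: ptr = calloc(...)     -> 2
   2: if (cond2)            -> 3, 4
   3: *ptr = 1   (the use)  -> 4
   4: return                        */
static const bool edge[NBLOCKS][NBLOCKS] = {
  [0] = { [1] = true, [2] = true },
  [1] = { [2] = true },
  [2] = { [3] = true, [4] = true },
  [3] = { [4] = true },
};
static const bool assigns[NBLOCKS] = { [1] = true };

/* Transfer function: a block that assigns leaves the variable initialized;
   any other block passes its input state through unchanged. */
static int transfer(int block, int in) {
  return assigns[block] ? INIT : in;
}

/* Iterate to a fixed point and return the state on entry to 'block'. */
int state_at(int block) {
  int in[NBLOCKS] = { 0 };
  int out[NBLOCKS] = { 0 };
  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 0; b < NBLOCKS; b++) {
      /* The entry block starts uninitialized; others join predecessors. */
      int new_in = (b == 0) ? UNINIT : 0;
      for (int p = 0; p < NBLOCKS; p++) {
        if (edge[p][b]) new_in |= out[p];
      }
      int new_out = transfer(b, new_in);
      if (new_in != in[b] || new_out != out[b]) {
        in[b] = new_in;
        out[b] = new_out;
        changed = true;
      }
    }
  }
  return in[block];
}
```

In this sketch, state_at(3) is UNINIT | INIT: both states reach the use, which is the "maybe uninitialized" situation. Deciding whether that merits a "sometimes" warning needs more information than this toy tracks, such as which branch forces the uninitialized path.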

Thank you,
Ali
_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev