[PATCH] D15636: Reduce false positives in printf/scanf format checker

Richard Trieu via cfe-commits cfe-commits at lists.llvm.org
Sat Feb 20 00:59:35 PST 2016


rtrieu added a comment.

In http://reviews.llvm.org/D15636#352532, @AndyG wrote:

> Richard, are you happy with this latest version?  Can I proceed to commit it?
>
> Thanks.


No, this is not ready yet.  This code is old and needs a bit of clean up, and because your patch touches it, you will be one that needs to clean it up before committing your patch.  Mainly, the changes are needed so that you can pass your arg handler properly through the function calls.


================
Comment at: include/clang/Sema/Sema.h:9100-9106
@@ -9099,8 +9099,9 @@
 
   void CheckFormatString(const StringLiteral *FExpr, const Expr *OrigFormatExpr,
                          ArrayRef<const Expr *> Args, bool HasVAListArg,
                          unsigned format_idx, unsigned firstDataArg,
                          FormatStringType Type, bool inFunctionCall,
                          VariadicCallType CallType,
-                         llvm::SmallBitVector &CheckedVarArgs);
+                         llvm::SmallBitVector &CheckedVarArgs,
+                         signed &FirstUncoveredArg);
   
----------------
Remove this from Sema.h and make the function static in SemaChecking.cpp.  It is only called from SemaChecking.cpp anyways and doesn't need access to anything in Sema.

================
Comment at: lib/Sema/SemaChecking.cpp:3236
@@ -3235,1 +3235,3 @@
 namespace {
+struct UncoveredArgHandler {
+  enum { Unknown = -1, AllCovered = -2 };
----------------
Make this a class.  Its members needs to be segmented into private and public parts.

================
Comment at: lib/Sema/SemaChecking.cpp:3237-3239
@@ +3236,5 @@
+struct UncoveredArgHandler {
+  enum { Unknown = -1, AllCovered = -2 };
+  signed FirstUncoveredArg;
+  SmallVector<const Expr *, 4> DiagnosticExprs;
+
----------------
This section should be private section.

================
Comment at: lib/Sema/SemaChecking.cpp:3241-3243
@@ +3240,5 @@
+
+  UncoveredArgHandler() : FirstUncoveredArg(Unknown) { }
+  void Diagnose(Sema &S, bool IsFunctionCall, const Expr *ArgExpr);
+  void Update(signed NewFirstUncoveredArg, const Expr *StrExpr);
+};
----------------
These should be made public.

================
Comment at: lib/Sema/SemaChecking.cpp:3453
@@ +3452,3 @@
+                          Type, InFunctionCall, CallType, CheckedVarArgs,
+                          FirstUncoveredArg);
+      UncoveredArg.Update(FirstUncoveredArg, StrE);
----------------
Pass in UncoveredArg and update within CheckFormatString function

================
Comment at: lib/Sema/SemaChecking.cpp:3529-3535
@@ -3488,1 +3528,9 @@
+
+  // Generate a diagnostic where an uncovered argument is detected.
+  if (UncoveredArg.FirstUncoveredArg >= 0) {
+    unsigned ArgIdx = UncoveredArg.FirstUncoveredArg + firstDataArg;
+    assert(ArgIdx < Args.size() && "ArgIdx outside bounds");
+    UncoveredArg.Diagnose(*this, /*IsFunctionCall*/true, Args[ArgIdx]);
+  }
+
   if (CT != SLCT_NotALiteral)
----------------
FirstUncoveredArg is now a private member so there's two paths, either will work:

1) Create two public function for UncoveredArgHandler,
```
bool UncoveredArgHandler::hasUncoveredArg();
unsigned UncoveredArgHandler::getUncoveredArg();
```
then use:
```
if (UncoveredArg.hasUncoveredArg()) {
  unsigned ArgIdx = UncoveredArg.getUncoveredArg() + firstDataArg;
```

getUncoveredArg() should never return negative numbers as that should remain an implementation detail.

2)  Pass both Args and firstDataArg to Diagnose and do the logic there.

================
Comment at: lib/Sema/SemaChecking.cpp:3603
@@ -3554,3 +3602,3 @@
 
-  void DoneProcessing();
+  void DoneProcessing(signed &FirstUncoveredArg);
 
----------------
Don't change the call signature here, leave it as a zero argument call.  Have another function return the first uncovered argument.

================
Comment at: lib/Sema/SemaChecking.cpp:3840-3892
@@ -3796,5 +3839,55 @@
+
+void UncoveredArgHandler::Update(signed NewFirstUncoveredArg,
+                                 const Expr *StrExpr) {
+  // Don't update if a previous string covers all arguments.
+  if (FirstUncoveredArg == AllCovered)
+    return;
+
+  switch (NewFirstUncoveredArg) {
+    case Unknown:
+      // The string was not fully analysed.  Do nothing.
+      break;
+
+    case UncoveredArgHandler::AllCovered:
+      // A string has been found with all arguments covered, so clear out
+      // the diagnostics.
+      FirstUncoveredArg = AllCovered;
+      DiagnosticExprs.clear();
+      break;
+
+    default:
+      // UncoveredArgHandler tracks the highest uncovered argument index
+      // and with it all the strings that match this index.
+      if (NewFirstUncoveredArg == FirstUncoveredArg)
+        DiagnosticExprs.push_back(StrExpr);
+      else if (NewFirstUncoveredArg > FirstUncoveredArg) {
+        DiagnosticExprs.clear();
+        DiagnosticExprs.push_back(StrExpr);
+        FirstUncoveredArg = NewFirstUncoveredArg;
       }
-    }
+      break;
   }
 }
 
+void UncoveredArgHandler::Diagnose(Sema &S, bool IsFunctionCall,
+                                   const Expr *ArgExpr) {
+  assert(FirstUncoveredArg >= 0 && DiagnosticExprs.size() > 0 &&
+         "Invalid state");
+
+  if (!ArgExpr)
+    return;
+
+  SourceLocation Loc = ArgExpr->getLocStart();
+
+  if (S.getSourceManager().isInSystemMacro(Loc))
+    return;
+
+  PartialDiagnostic PDiag = S.PDiag(diag::warn_printf_data_arg_not_used);
+  for (unsigned i = 1; i < DiagnosticExprs.size(); ++i)
+    PDiag << DiagnosticExprs[i]->getSourceRange();
+
+  CheckFormatHandler::EmitFormatDiagnostic(
+                                  S, IsFunctionCall, DiagnosticExprs[0],
+                                  PDiag, Loc, /*IsStringLocation*/false,
+                                  DiagnosticExprs[0]->getSourceRange());
+}
----------------
Move these two functions up to where the class is defined.

================
Comment at: lib/Sema/SemaChecking.cpp:3886-3887
@@ +3885,4 @@
+  PartialDiagnostic PDiag = S.PDiag(diag::warn_printf_data_arg_not_used);
+  for (unsigned i = 1; i < DiagnosticExprs.size(); ++i)
+    PDiag << DiagnosticExprs[i]->getSourceRange();
+
----------------
Use a range-based for-loop here.

================
Comment at: lib/Sema/SemaChecking.cpp:4982
@@ -4887,3 +4981,3 @@
 
 void Sema::CheckFormatString(const StringLiteral *FExpr,
                              const Expr *OrigFormatExpr,
----------------
Make this a static void function.
Add a new first argument "Sema &S".
```
static void CheckFormatString(Sema &S, const StringLiteral *FExpr,
```

The other changes for changing this function are listed below.  Also, a forward declaration of this function is necessary higher in this file since a previous function will call this function.

================
Comment at: lib/Sema/SemaChecking.cpp:4986-4987
@@ -4891,3 +4985,4 @@
                              bool HasVAListArg, unsigned format_idx,
                              unsigned firstDataArg, FormatStringType Type,
                              bool inFunctionCall, VariadicCallType CallType,
+                             llvm::SmallBitVector &CheckedVarArgs,
----------------
FormatStringType and VariadicCallType need to qualified with "Sema::"

================
Comment at: lib/Sema/SemaChecking.cpp:4989
@@ -4895,1 +4988,3 @@
+                             llvm::SmallBitVector &CheckedVarArgs,
+                             signed &FirstUncoveredArg) {
   
----------------
Pass in an UncoveredArgHandler here instead of an unsigned.

================
Comment at: lib/Sema/SemaChecking.cpp:4994
@@ -4898,3 +4993,3 @@
     CheckFormatHandler::EmitFormatDiagnostic(
       *this, inFunctionCall, Args[format_idx],
       PDiag(diag::warn_format_string_is_wide_literal), FExpr->getLocStart(),
----------------
*this => S
Here and throughout this function

================
Comment at: lib/Sema/SemaChecking.cpp:4995
@@ -4899,3 +4994,3 @@
       *this, inFunctionCall, Args[format_idx],
       PDiag(diag::warn_format_string_is_wide_literal), FExpr->getLocStart(),
       /*IsStringLocation*/true, OrigFormatExpr->getSourceRange());
----------------
PDiag => S.PDiag
Here and throughout this function

================
Comment at: lib/Sema/SemaChecking.cpp:5004
@@ -4908,3 +5003,3 @@
   // Account for cases where the string literal is truncated in a declaration.
   const ConstantArrayType *T = Context.getAsConstantArrayType(FExpr->getType());
   assert(T && "String literal not of constant array type!");
----------------
Context => S.Context
Here and throughout this function

================
Comment at: lib/Sema/SemaChecking.cpp:5031-5032
@@ -4935,4 +5030,4 @@
   
   if (Type == FST_Printf || Type == FST_NSString ||
       Type == FST_FreeBSDKPrintf || Type == FST_OSTrace) {
     CheckPrintfHandler H(*this, FExpr, OrigFormatExpr, firstDataArg,
----------------
FST_* => Sema::FST_*
Here and throughout this function

================
Comment at: lib/Sema/SemaChecking.cpp:5042
@@ -4946,3 +5041,3 @@
                                                   Type == FST_FreeBSDKPrintf))
-      H.DoneProcessing();
+      H.DoneProcessing(FirstUncoveredArg);
   } else if (Type == FST_Scanf) {
----------------
This line should remain the same.  Additional function should be added to check the status of uncovered arguments and pass that to UncoveredArgHandler.

================
Comment at: lib/Sema/SemaChecking.cpp:5051
@@ -4955,3 +5050,3 @@
                                                  Context.getTargetInfo()))
-      H.DoneProcessing();
+      H.DoneProcessing(FirstUncoveredArg);
   } // TODO: handle other formats
----------------
You have modified the Scanf handling function, but there are no additional scanf tests.

================
Comment at: test/Sema/format-strings.c:103
@@ -89,3 +102,3 @@
 }
 
 void check_writeback_specifier()
----------------
Add a test for having the format string as a constant literal defined elsewhere.  This should generate a note to point to the original format string.  You changes removes this note.  I am fine having this listed as a FIXME for a follow up patch.


http://reviews.llvm.org/D15636





More information about the cfe-commits mailing list