[PATCH] D149460: [analyzer] ArrayBoundCheckerV2: suppress false positives from ctype macros

DonĂ¡t Nagy via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 28 09:39:21 PDT 2023


donat.nagy created this revision.
donat.nagy added reviewers: gamesh411, Szelethus, steakhal, dkrupp.
Herald added subscribers: manas, ASDenysPetrov, martong, mikhail.ramalho, a.sidorin, rnkovacs, szepet, baloghadamsoftware, xazax.hun.
Herald added a reviewer: NoQ.
Herald added a project: All.
donat.nagy requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

The checker `alpha.security.ArrayBoundV2` created bug reports in situations when the (tainted) result of `fgetc()` or `getchar()` was passed to one of the `isXXXXX()` macros from ctype.h. This is a common input handling pattern (within the limited toolbox of the C language) and several open source projects contained code where it led to false positive reports; so this commit suppresses ArrayBoundV2 reports generated within the `isXXXXX()` macros. (Note that here even true positive reports would be difficult to understand, as they'd refer to the implementation details of these macros.)


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D149460

Files:
  clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
  clang/test/Analysis/taint-generic.c


Index: clang/test/Analysis/taint-generic.c
===================================================================
--- clang/test/Analysis/taint-generic.c
+++ clang/test/Analysis/taint-generic.c
@@ -156,6 +156,28 @@
   Buffer[m] = 1;  //expected-warning {{Out of bound memory access (index is tainted)}}
 }
 
+extern const unsigned short int **__ctype_b_loc (void);
+enum { _ISdigit = 2048 };
+# define isdigit(c) ((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) _ISdigit)
+
+int isdigitImplFalsePositive(void) {
+  // If this code no longer produces a bug report, then consider removing the
+  // special case that disables buffer overflow reports coming from the isXXXXX
+  // macros in ctypes.h.
+  int c = getchar();
+  return ((*__ctype_b_loc ())[(int) (c)] & (unsigned short int) _ISdigit);
+  //expected-warning at -1 {{Out of bound memory access (index is tainted)}}
+}
+
+int isdigitSuppressed(void) {
+  // Same code as above, but reports are suppressed based on macro name:
+  int c = getchar();
+  return isdigit(c); //no-warning
+}
+
+// Some later tests use isdigit as a function, so we need to undef it:
+#undef isdigit
+
 void testUncontrolledFormatString(char **p) {
   char s[80];
   fscanf(stdin, "%s", s);
Index: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
===================================================================
--- clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -42,8 +42,10 @@
   void reportTaintOOB(CheckerContext &C, ProgramStateRef errorState,
                       SVal TaintedSVal) const;
 
+  static bool isFromCtypeMacro(const Stmt *S, ASTContext &AC);
+
 public:
-  void checkLocation(SVal l, bool isLoad, const Stmt*S,
+  void checkLocation(SVal l, bool isLoad, const Stmt *S,
                      CheckerContext &C) const;
 };
 
@@ -155,6 +157,15 @@
   // memory access is within the extent of the base region.  Since we
   // have some flexibility in defining the base region, we can achieve
   // various levels of conservatism in our buffer overflow checking.
+
+  // The header ctype.h (from e.g. glibc) implements the isXXXXX() macros as
+  //   #define isXXXXX(arg) (LOOKUP_TABLE[arg] & BITMASK_FOR_XXXXX)
+  // and incomplete analysis of these leads to false positives. As even
+  // accurate reports would be confusing for the users, just disable reports
+  // from these macros:
+  if (isFromCtypeMacro(LoadS, checkerContext.getASTContext()))
+    return;
+
   ProgramStateRef state = checkerContext.getState();
 
   SValBuilder &svalBuilder = checkerContext.getSValBuilder();
@@ -267,6 +278,25 @@
   checkerContext.emitReport(std::move(BR));
 }
 
+bool ArrayBoundCheckerV2::isFromCtypeMacro(const Stmt *S, ASTContext &ACtx) {
+  SourceLocation Loc = S->getBeginLoc();
+  if (!Loc.isMacroID())
+    return false;
+
+  StringRef MacroName = Lexer::getImmediateMacroName(
+      Loc, ACtx.getSourceManager(), ACtx.getLangOpts());
+
+  if (MacroName.size() < 7 || MacroName[0] != 'i' || MacroName[1] != 's')
+    return false;
+
+  return ((MacroName == "isalnum") || (MacroName == "isalpha") ||
+          (MacroName == "isblank") || (MacroName == "isdigit") ||
+          (MacroName == "isgraph") || (MacroName == "islower") ||
+          (MacroName == "isnctrl") || (MacroName == "isprint") ||
+          (MacroName == "ispunct") || (MacroName == "isspace") ||
+          (MacroName == "isupper") || (MacroName == "isxdigit"));
+}
+
 #ifndef NDEBUG
 LLVM_DUMP_METHOD void RegionRawOffsetV2::dump() const {
   dumpToStream(llvm::errs());


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D149460.517957.patch
Type: text/x-patch
Size: 3582 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20230428/a77606ca/attachment-0001.bin>


More information about the cfe-commits mailing list