[clang] abc8736 - [analyzer] Restrict CallDescription fuzzy builtin matching

Balazs Benics via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 11 01:45:46 PST 2022


Author: Balazs Benics
Date: 2022-02-11T10:45:18+01:00
New Revision: abc873694ff7cde3def6860564521e059cb542f8

URL: https://github.com/llvm/llvm-project/commit/abc873694ff7cde3def6860564521e059cb542f8
DIFF: https://github.com/llvm/llvm-project/commit/abc873694ff7cde3def6860564521e059cb542f8.diff

LOG: [analyzer] Restrict CallDescription fuzzy builtin matching

`CallDescriptions` for builtin functions relaxes the match rules
somewhat, so that the `CallDescription` will match for calls that have
some prefix or suffix. This was achieved by doing a `StringRef::contains()`.
However, this is somewhat problematic for builtins that are substrings
of each other.

Consider the following:

`CallDescription{ builtin, "memcpy"}` will match for
`__builtin_wmemcpy()` calls, which is unfortunate.

This patch addresses/works around the issue by checking if the
characters around the function's name are not part of the 'name'
semantically. In other words, to accept a match for `"memcpy"` the call
should not have alphanumeric (`[a-zA-Z]`) characters around the 'match'.

So, `CallDescription{ builtin, "memcpy"}` will not match on:

 - `__builtin_wmemcpy: there is a `w` alphanumeric character before the match.
 - `__builtin_memcpyFOoBar_inline`: there is a `F` character after the match.
 - `__builtin_memcpyX_inline`: there is an `X` character after the match.

But it will still match for:
 - `memcpy`: exact match
 - `__builtin_memcpy`: there is an _ before the match
 - `__builtin_memcpy_inline`: there is an _ after the match
 - `memcpy_inline_builtinFooBar`: there is an _ after the match

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D118388

Added: 
    

Modified: 
    clang/lib/StaticAnalyzer/Core/CheckerContext.cpp
    clang/unittests/StaticAnalyzer/CallDescriptionTest.cpp

Removed: 
    


################################################################################
diff  --git a/clang/lib/StaticAnalyzer/Core/CheckerContext.cpp b/clang/lib/StaticAnalyzer/Core/CheckerContext.cpp
index 4c684c3ffd9bf..73d5d9489cb78 100644
--- a/clang/lib/StaticAnalyzer/Core/CheckerContext.cpp
+++ b/clang/lib/StaticAnalyzer/Core/CheckerContext.cpp
@@ -55,8 +55,29 @@ bool CheckerContext::isCLibraryFunction(const FunctionDecl *FD,
     if (Name.empty())
       return true;
     StringRef BName = FD->getASTContext().BuiltinInfo.getName(BId);
-    if (BName.contains(Name))
-      return true;
+    size_t start = BName.find(Name);
+    if (start != StringRef::npos) {
+      // Accept exact match.
+      if (BName.size() == Name.size())
+        return true;
+
+      //    v-- match starts here
+      // ...xxxxx...
+      //   _xxxxx_
+      //   ^     ^ lookbehind and lookahead characters
+
+      const auto MatchPredecessor = [=]() -> bool {
+        return start <= 0 || !llvm::isAlpha(BName[start - 1]);
+      };
+      const auto MatchSuccessor = [=]() -> bool {
+        std::size_t LookbehindPlace = start + Name.size();
+        return LookbehindPlace >= BName.size() ||
+               !llvm::isAlpha(BName[LookbehindPlace]);
+      };
+
+      if (MatchPredecessor() && MatchSuccessor())
+        return true;
+    }
   }
 
   const IdentifierInfo *II = FD->getIdentifier();

diff  --git a/clang/unittests/StaticAnalyzer/CallDescriptionTest.cpp b/clang/unittests/StaticAnalyzer/CallDescriptionTest.cpp
index 98c8eb27516f6..f2d85b3bc17c7 100644
--- a/clang/unittests/StaticAnalyzer/CallDescriptionTest.cpp
+++ b/clang/unittests/StaticAnalyzer/CallDescriptionTest.cpp
@@ -487,6 +487,60 @@ TEST(CallDescription, MatchBuiltins) {
       "  __builtin___memset_chk(&x, 0, sizeof(x),"
       "                         __builtin_object_size(&x, 0));"
       "}"));
+
+  {
+    SCOPED_TRACE("multiple similar builtins");
+    EXPECT_TRUE(tooling::runToolOnCode(
+        std::unique_ptr<FrontendAction>(new CallDescriptionAction<>(
+            {{{CDF_MaybeBuiltin, "memcpy", 3}, false},
+             {{CDF_MaybeBuiltin, "wmemcpy", 3}, true}})),
+        R"(void foo(wchar_t *x, wchar_t *y) {
+            __builtin_wmemcpy(x, y, sizeof(wchar_t));
+          })"));
+  }
+  {
+    SCOPED_TRACE("multiple similar builtins reversed order");
+    EXPECT_TRUE(tooling::runToolOnCode(
+        std::unique_ptr<FrontendAction>(new CallDescriptionAction<>(
+            {{{CDF_MaybeBuiltin, "wmemcpy", 3}, true},
+             {{CDF_MaybeBuiltin, "memcpy", 3}, false}})),
+        R"(void foo(wchar_t *x, wchar_t *y) {
+            __builtin_wmemcpy(x, y, sizeof(wchar_t));
+          })"));
+  }
+  {
+    SCOPED_TRACE("lookbehind and lookahead mismatches");
+    EXPECT_TRUE(tooling::runToolOnCode(
+        std::unique_ptr<FrontendAction>(
+            new CallDescriptionAction<>({{{CDF_MaybeBuiltin, "func"}, false}})),
+        R"(
+          void funcXXX();
+          void XXXfunc();
+          void XXXfuncXXX();
+          void test() {
+            funcXXX();
+            XXXfunc();
+            XXXfuncXXX();
+          })"));
+  }
+  {
+    SCOPED_TRACE("lookbehind and lookahead matches");
+    EXPECT_TRUE(tooling::runToolOnCode(
+        std::unique_ptr<FrontendAction>(
+            new CallDescriptionAction<>({{{CDF_MaybeBuiltin, "func"}, true}})),
+        R"(
+          void func();
+          void func_XXX();
+          void XXX_func();
+          void XXX_func_XXX();
+
+          void test() {
+            func(); // exact match
+            func_XXX();
+            XXX_func();
+            XXX_func_XXX();
+          })"));
+  }
 }
 
 } // namespace


        


More information about the cfe-commits mailing list