[clang] [clang-tools-extra] [clang-tidy][dataflow] Add `bugprone-null-check-after-dereference` check (PR #84166)

via cfe-commits cfe-commits at lists.llvm.org
Mon May 27 04:20:31 PDT 2024


================
@@ -0,0 +1,625 @@
+//===-- NullPointerAnalysisModel.cpp ----------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines a generic null-pointer analysis model, used for finding
+// pointer null-checks after the pointer has already been dereferenced.
+//
+// Only a limited set of operations are currently recognized. Notably, pointer
+// arithmetic, null-pointer assignments and _nullable/_nonnull attributes are
+// missing as of yet.
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/AST/Decl.h"
+#include "clang/AST/Expr.h"
+#include "clang/AST/Stmt.h"
+#include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+#include "clang/Analysis/CFG.h"
+#include "clang/Analysis/FlowSensitive/CFGMatchSwitch.h"
+#include "clang/Analysis/FlowSensitive/DataflowAnalysis.h"
+#include "clang/Analysis/FlowSensitive/DataflowEnvironment.h"
+#include "clang/Analysis/FlowSensitive/DataflowLattice.h"
+#include "clang/Analysis/FlowSensitive/MapLattice.h"
+#include "clang/Analysis/FlowSensitive/NoopLattice.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/Twine.h"
+
+namespace clang::dataflow {
+
+namespace {
+using namespace ast_matchers;
+
+constexpr char kCond[] = "condition";
+constexpr char kVar[] = "var";
+constexpr char kValue[] = "value";
+constexpr char kIsNonnull[] = "is-nonnull";
+constexpr char kIsNull[] = "is-null";
+
+enum class SatisfiabilityResult {
+  // Returned when the value was not initialized yet.
+  Nullptr,
+  // Special value that signals that the boolean value can be anything.
+  // It signals that the underlying formulas are too complex to be calculated
+  // efficiently.
+  Top,
+  // Equivalent to the literal True in the current environment.
+  True,
+  // Equivalent to the literal False in the current environment.
+  False,
+  // Both True and False values could be produced with an appropriate set of
+  // conditions.
+  Unknown
+};
+
+using SR = SatisfiabilityResult;
+
+// FIXME: These AST matchers should also be exported via the
+// NullPointerAnalysisModel class, for tests
+auto ptrToVar(llvm::StringRef VarName = kVar) {
+  return traverse(TK_IgnoreUnlessSpelledInSource,
----------------
martinboehme wrote:

> I'm not sure if there's a better way to bind to the "root" Expr that I haven't found yet, but the current solution has too many dragons. (Maybe this is also a case of X/Y problem, where if I assign atoms as IsNull/IsNonnull instead of formulas, I wouldn't need to re-assign Values...)

I think this is the underlying issue. When you glean new information about an existing `Value`, that information should be reflected in the flow condition, rather than by replacing the existing value.

As you've discovered, replacing the value of a prvalue expression with a different value does nothing to change the value of a glvalue expression that the prvalue may have originated from (using an lvalue-to-rvalue cast). And I don't think you should be trying to change the value of this original glvalue (I think this is what you mean by the "root" expression?):

*  I don't think there's a reliable way of finding this original glvalue. `TK_IgnoreUnlessSpelledInSource` looks like an attempt to do this that works in some cases, but I'm sure there are other cases where this doesn't work.

*  Indeed, there may not even be any glvalue that the prvalue originated from.

More generally, attempting to do this goes against the grain of value categories in C++. The whole idea of a prvalue is that it is a "pure value" with no storage location associated with it. Operations on prvalues are pure expression evaluation in the sense that this works in functional languages -- they have no way of changing anything that is stored in memory, i.e. they are free of side-effects.

So a check shouldn't be trying to go out and find the glvalue that a prvalue might have been cast from, and then changing the value associated with that glvalue's storage location -- because that would be doing something that isn't possible in C++. If we want to model the fact that we have learned new information about a value, that should be done by changing the flow condition, rather than changing the value itself.

Feel free to ask back if you're unsure how you would do this in the context of this check.

As an aside, I think making this change would also eliminate the restriction that the check currently doesn't deal correctly with references to pointers (as you note in the PR description).

https://github.com/llvm/llvm-project/pull/84166


More information about the cfe-commits mailing list