[clang-tools-extra] [clang-tidy] Switch misc-confusable-identifiers check to a faster algorithm. (PR #130369)
Richard Smith via cfe-commits
cfe-commits at lists.llvm.org
Fri Mar 7 15:38:57 PST 2025
https://github.com/zygoloid created https://github.com/llvm/llvm-project/pull/130369
Optimizations:
- Only build the skeleton for each identifier once, rather than once for each declaration of that identifier.
- Only compute the contexts in which identifiers are declared for identifiers that have the same skeleton as another identifier in the translation unit.
- Only compare pairs of declarations that are declared in related contexts, rather than comparing all pairs of declarations with the same skeleton.
Also simplify by removing the caching of enclosing `DeclContext` sets, because with the above changes we don't even compute the enclosing `DeclContext` sets in common cases. Instead, we terminate the traversal to enclosing `DeclContext`s immediately if we've already found another declaration in that context with the same identifier. (This optimization is not currently applied to the `forallBases` traversal, but could be applied there too if needed.)
This also fixes two bugs that together caused the check to fail to find the issues it was looking for in most cases:
- The old check skipped comparisons of declarations from different contexts unless both declarations were type template parameters. It's unclear what purpose the checks here were intended to serve, but they caused the checker to not warn on instances of the CVE it is intended to detect.
- The old check skipped comparisons of declarations in all base classes other than the first one found by the traversal. This appears to be an oversight, incorrectly returning `false` rather than `true` from the `forallBases` callback, which terminates traversal.
This decreases the runtime of this check, especially in the common case where there are few or no skeletons with two or more different identifiers. From a sample invocation:
```
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.1556 ( 1.4%) 0.1513 ( 1.8%) 0.3069 ( 1.5%) 0.3033 ( 1.5%) before
0.0746 ( 0.7%) 0.0751 ( 0.8%) 0.1498 ( 0.8%) 0.1491 ( 0.8%) after
```
>From 78c9d5da85723040c30f00cb53e77c9dd4cc14b8 Mon Sep 17 00:00:00 2001
From: Richard Smith <richard at metafoo.co.uk>
Date: Fri, 7 Mar 2025 22:49:37 +0000
Subject: [PATCH 1/2] Switch misc-confusable-identifiers check to a faster
algorithm.
- Only build the skeleton for each identifier once.
- Only compute the contexts in which identifiers are declared for
identifiers that actually have collisions.
- Only compare pairs of declarations that are declared in related
contexts, rather than comparing all pairs of declarations with the
same skeleton.
This also fixes several bugs:
- The old check skipped comparisons of declarations from different
contexts unless both declarations were type template parameters.
- The old check skipped comparisons of declarations in all base classes
other than the first one found by the traversal.
This decreases the runtime of this check, especially in the common case
where there are few or no skeletons with two or more different
identifiers. From a sample invocation:
```
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.1556 ( 1.4%) 0.1513 ( 1.8%) 0.3069 ( 1.5%) 0.3033 ( 1.5%) before
0.0746 ( 0.7%) 0.0751 ( 0.8%) 0.1498 ( 0.8%) 0.1491 ( 0.8%) after
```
---
.../misc/ConfusableIdentifierCheck.cpp | 191 +++++++++---------
.../misc/ConfusableIdentifierCheck.h | 21 +-
.../checkers/misc/confusable-identifiers.cpp | 31 +++
3 files changed, 133 insertions(+), 110 deletions(-)
diff --git a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
index 6df565c9a9d69..7b1e23a2c5c2b 100644
--- a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
+++ b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
@@ -89,90 +89,56 @@ static llvm::SmallString<64U> skeleton(StringRef Name) {
return Skeleton;
}
-static bool mayShadowImpl(const DeclContext *DC0, const DeclContext *DC1) {
- return DC0 && DC0 == DC1;
-}
-
-static bool mayShadowImpl(const NamedDecl *ND0, const NamedDecl *ND1) {
- return isa<TemplateTypeParmDecl>(ND0) || isa<TemplateTypeParmDecl>(ND1);
-}
-
-static bool isMemberOf(const ConfusableIdentifierCheck::ContextInfo *DC0,
- const ConfusableIdentifierCheck::ContextInfo *DC1) {
- return llvm::is_contained(DC1->Bases, DC0->PrimaryContext);
-}
-
-static bool enclosesContext(const ConfusableIdentifierCheck::ContextInfo *DC0,
- const ConfusableIdentifierCheck::ContextInfo *DC1) {
- if (DC0->PrimaryContext == DC1->PrimaryContext)
- return true;
-
- return llvm::is_contained(DC0->PrimaryContexts, DC1->PrimaryContext) ||
- llvm::is_contained(DC1->PrimaryContexts, DC0->PrimaryContext);
+namespace {
+struct Entry {
+ const NamedDecl *ND;
+ bool FromDerivedClass;
+};
}
-static bool mayShadow(const NamedDecl *ND0,
- const ConfusableIdentifierCheck::ContextInfo *DC0,
- const NamedDecl *ND1,
- const ConfusableIdentifierCheck::ContextInfo *DC1) {
-
- if (!DC0->Bases.empty() && !DC1->Bases.empty()) {
- // if any of the declaration is a non-private member of the other
- // declaration, it's shadowed by the former
-
- if (ND1->getAccess() != AS_private && isMemberOf(DC1, DC0))
- return true;
-
- if (ND0->getAccess() != AS_private && isMemberOf(DC0, DC1))
+using DeclsWithinContextMap =
+ llvm::DenseMap<const DeclContext *, llvm::SmallVector<Entry, 1>>;
+
+static bool addToContext(DeclsWithinContextMap &DeclsWithinContext,
+ const DeclContext *DC, Entry E) {
+ auto &Decls = DeclsWithinContext[DC];
+ if (!Decls.empty() &&
+ Decls.back().ND->getIdentifier() == E.ND->getIdentifier()) {
+ // Already have a declaration with this identifier in this context. Don't
+ // track another one. This means that if an outer name is confusable with an
+ // inner name, we'll only diagnose the outer name once, pointing at the
+ // first inner declaration with that name.
+ if (Decls.back().FromDerivedClass && !E.FromDerivedClass) {
+ // Prefer the declaration that's not from the derived class, because that
+ // conflicts with more declarations.
+ Decls.back() = E;
return true;
- }
-
- if (!mayShadowImpl(DC0->NonTransparentContext, DC1->NonTransparentContext) &&
- !mayShadowImpl(ND0, ND1))
+ }
return false;
-
- return enclosesContext(DC0, DC1);
-}
-
-const ConfusableIdentifierCheck::ContextInfo *
-ConfusableIdentifierCheck::getContextInfo(const DeclContext *DC) {
- const DeclContext *PrimaryContext = DC->getPrimaryContext();
- auto [It, Inserted] = ContextInfos.try_emplace(PrimaryContext);
- if (!Inserted)
- return &It->second;
-
- ContextInfo &Info = It->second;
- Info.PrimaryContext = PrimaryContext;
- Info.NonTransparentContext = PrimaryContext;
-
- while (Info.NonTransparentContext->isTransparentContext()) {
- Info.NonTransparentContext = Info.NonTransparentContext->getParent();
- if (!Info.NonTransparentContext)
- break;
}
+ Decls.push_back(E);
+ return true;
+}
- if (Info.NonTransparentContext)
- Info.NonTransparentContext =
- Info.NonTransparentContext->getPrimaryContext();
-
+static void addToEnclosingContexts(DeclsWithinContextMap &DeclsWithinContext,
+ const DeclContext *DC, const NamedDecl *ND) {
while (DC) {
- if (!isa<LinkageSpecDecl>(DC) && !isa<ExportDecl>(DC))
- Info.PrimaryContexts.push_back(DC->getPrimaryContext());
- DC = DC->getParent();
- }
-
- if (const auto *RD = dyn_cast<CXXRecordDecl>(PrimaryContext)) {
- RD = RD->getDefinition();
- if (RD) {
- Info.Bases.push_back(RD);
- RD->forallBases([&](const CXXRecordDecl *Base) {
- Info.Bases.push_back(Base);
- return false;
- });
+ DC = DC->getNonTransparentContext()->getPrimaryContext();
+ if (!addToContext(DeclsWithinContext, DC, {ND, false}))
+ return;
+
+ if (const auto *RD = dyn_cast<CXXRecordDecl>(DC)) {
+ RD = RD->getDefinition();
+ if (RD) {
+ RD->forallBases([&](const CXXRecordDecl *Base) {
+ addToContext(DeclsWithinContext, Base, {ND, true});
+ return true;
+ });
+ }
}
- }
- return &Info;
+ DC = DC->getParent();
+ }
}
void ConfusableIdentifierCheck::check(
@@ -181,7 +147,7 @@ void ConfusableIdentifierCheck::check(
if (!ND)
return;
- IdentifierInfo *NDII = ND->getIdentifier();
+ const IdentifierInfo *NDII = ND->getIdentifier();
if (!NDII)
return;
@@ -189,29 +155,68 @@ void ConfusableIdentifierCheck::check(
if (NDName.empty())
return;
- const ContextInfo *Info = getContextInfo(ND->getDeclContext());
+ NameToDecls[NDII].push_back(ND);
+}
- llvm::SmallVector<Entry> &Mapped = Mapper[skeleton(NDName)];
- for (const Entry &E : Mapped) {
- if (!mayShadow(ND, Info, E.Declaration, E.Info))
- continue;
+void ConfusableIdentifierCheck::onEndOfTranslationUnit() {
+ llvm::StringMap<llvm::SmallVector<const IdentifierInfo*, 1>> SkeletonToNames;
+ // Compute the skeleton for each identifier.
+ for (auto &[Ident, Decls] : NameToDecls) {
+ SkeletonToNames[skeleton(Ident->getName())].push_back(Ident);
+ }
- const IdentifierInfo *ONDII = E.Declaration->getIdentifier();
- StringRef ONDName = ONDII->getName();
- if (ONDName == NDName)
+ // Visit each skeleton with more than one identifier.
+ for (auto &[Skel, Idents] : SkeletonToNames) {
+ if (Idents.size() < 2) {
continue;
+ }
- diag(ND->getLocation(), "%0 is confusable with %1") << ND << E.Declaration;
- diag(E.Declaration->getLocation(), "other declaration found here",
- DiagnosticIDs::Note);
- }
+ // Find the declaration contexts that transitively contain each identifier.
+ DeclsWithinContextMap DeclsWithinContext;
+ for (const IdentifierInfo *II : Idents) {
+ for (const NamedDecl *ND : NameToDecls[II]) {
+ addToEnclosingContexts(DeclsWithinContext, ND->getDeclContext(), ND);
+ }
+ }
- Mapped.push_back({ND, Info});
-}
+ // Check to see if any declaration is declared in a context that
+ // transitively contains another declaration with a different identifier but
+ // the same skeleton.
+ for (const IdentifierInfo *II : Idents) {
+ for (const NamedDecl *OuterND : NameToDecls[II]) {
+ const DeclContext *OuterDC = OuterND->getDeclContext()
+ ->getNonTransparentContext()
+ ->getPrimaryContext();
+ for (Entry Inner : DeclsWithinContext[OuterDC]) {
+ // Don't complain if the identifiers are the same.
+ if (OuterND->getIdentifier() == Inner.ND->getIdentifier())
+ continue;
+
+ // Don't complain about a derived-class name shadowing a base class
+ // private member.
+ if (OuterND->getAccess() == AS_private && Inner.FromDerivedClass)
+ continue;
+
+ // If the declarations are in the same context, only diagnose the
+ // later one.
+ if (OuterDC->Equals(
+ Inner.ND->getDeclContext()->getNonTransparentContext()) &&
+ Inner.ND->getASTContext()
+ .getSourceManager()
+ .isBeforeInTranslationUnit(Inner.ND->getLocation(),
+ OuterND->getLocation()))
+ continue;
+
+ diag(Inner.ND->getLocation(), "%0 is confusable with %1")
+ << Inner.ND << OuterND;
+ diag(OuterND->getLocation(), "other declaration found here",
+ DiagnosticIDs::Note);
+ }
+ }
+ }
+ }
-void ConfusableIdentifierCheck::onEndOfTranslationUnit() {
- Mapper.clear();
- ContextInfos.clear();
+ NameToDecls.clear();
}
void ConfusableIdentifierCheck::registerMatchers(
diff --git a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
index f3b0c8ed00306..bb90fca9af8f3 100644
--- a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
+++ b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
@@ -11,7 +11,7 @@
#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_MISC_CONFUSABLE_IDENTIFIER_CHECK_H
#include "../ClangTidyCheck.h"
-#include <unordered_map>
+#include "llvm/ADT/DenseMap.h"
namespace clang::tidy::misc {
@@ -31,23 +31,10 @@ class ConfusableIdentifierCheck : public ClangTidyCheck {
return TK_IgnoreUnlessSpelledInSource;
}
- struct ContextInfo {
- const DeclContext *PrimaryContext;
- const DeclContext *NonTransparentContext;
- llvm::SmallVector<const DeclContext *> PrimaryContexts;
- llvm::SmallVector<const CXXRecordDecl *> Bases;
- };
-
private:
- struct Entry {
- const NamedDecl *Declaration;
- const ContextInfo *Info;
- };
-
- const ContextInfo *getContextInfo(const DeclContext *DC);
-
- llvm::StringMap<llvm::SmallVector<Entry>> Mapper;
- std::unordered_map<const DeclContext *, ContextInfo> ContextInfos;
+ llvm::DenseMap<const IdentifierInfo *,
+ llvm::SmallVector<const NamedDecl *, 1>>
+ NameToDecls;
};
} // namespace clang::tidy::misc
diff --git a/clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp b/clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp
index cdfed7edb431d..acaf39973961d 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp
+++ b/clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp
@@ -74,6 +74,19 @@ template <typename t1, typename tl>
// CHECK-MESSAGES: :[[#@LINE-2]]:20: note: other declaration found here
void f9();
+namespace f10 {
+int il;
+namespace inner {
+ int i1;
+ // CHECK-MESSAGES: :[[#@LINE-1]]:7: warning: 'i1' is confusable with 'il' [misc-confusable-identifiers]
+ // CHECK-MESSAGES: :[[#@LINE-4]]:5: note: other declaration found here
+ int j1;
+ // CHECK-MESSAGES: :[[#@LINE-1]]:7: warning: 'j1' is confusable with 'jl' [misc-confusable-identifiers]
+ // CHECK-MESSAGES: :[[#@LINE+2]]:5: note: other declaration found here
+}
+int jl;
+}
+
struct Base0 {
virtual void mO0();
@@ -103,3 +116,21 @@ struct Derived1 : Base1 {
long mI1(); // no warning: mII is private
};
+
+struct Base2 {
+ long nO0;
+
+private:
+ long nII;
+};
+
+struct Mid2 : Base0, Base1, Base2 {
+};
+
+struct Derived2 : Mid2 {
+ long nOO;
+ // CHECK-MESSAGES: :[[#@LINE-1]]:8: warning: 'nOO' is confusable with 'nO0' [misc-confusable-identifiers]
+ // CHECK-MESSAGES: :[[#@LINE-12]]:8: note: other declaration found here
+
+ long nI1(); // no warning: mII is private
+};
>From 870c0ea0c6dde2232ad63ec51f968007702cef90 Mon Sep 17 00:00:00 2001
From: Richard Smith <richard at metafoo.co.uk>
Date: Fri, 7 Mar 2025 23:37:33 +0000
Subject: [PATCH 2/2] Fix file headers
---
.../clang-tidy/misc/ConfusableIdentifierCheck.cpp | 3 +--
clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
index 7b1e23a2c5c2b..70b948c0784db 100644
--- a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
+++ b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
@@ -1,5 +1,4 @@
-//===--- ConfusableIdentifierCheck.cpp -
-// clang-tidy--------------------------===//
+//===--- ConfusableIdentifierCheck.cpp - clang-tidy -----------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
diff --git a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
index bb90fca9af8f3..65669fb61961a 100644
--- a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
+++ b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
@@ -1,5 +1,4 @@
-//===--- ConfusableIdentifierCheck.h - clang-tidy
-//-------------------------------*- C++ -*-===//
+//===--- ConfusableIdentifierCheck.h - clang-tidy ---------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
More information about the cfe-commits
mailing list