[clang-tools-extra] [clang-tidy] Improved --verify-config when using literal style in config file (PR #85591)

FĂ©lix-Antoine Constantin via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 26 09:35:27 PDT 2024


================
@@ -454,52 +454,31 @@ static constexpr StringLiteral VerifyConfigWarningEnd = " [-verify-config]\n";
 
 static bool verifyChecks(const StringSet<> &AllChecks, StringRef CheckGlob,
                          StringRef Source) {
-  llvm::StringRef Cur, Rest;
+  GlobList Globs(CheckGlob);
   bool AnyInvalid = false;
-  for (std::tie(Cur, Rest) = CheckGlob.split(',');
-       !(Cur.empty() && Rest.empty()); std::tie(Cur, Rest) = Rest.split(',')) {
-    Cur = Cur.trim();
-    if (Cur.empty())
+  for (const auto &Item : Globs.getItems()) {
+    const llvm::Regex &Reg = Item.Regex;
+    const llvm::StringRef Text = Item.Text;
+    if (Text.starts_with("clang-diagnostic"))
       continue;
-    Cur.consume_front("-");
-    if (Cur.starts_with("clang-diagnostic"))
-      continue;
-    if (Cur.contains('*')) {
-      SmallString<128> RegexText("^");
-      StringRef MetaChars("()^$|*+?.[]\\{}");
-      for (char C : Cur) {
-        if (C == '*')
-          RegexText.push_back('.');
-        else if (MetaChars.contains(C))
-          RegexText.push_back('\\');
-        RegexText.push_back(C);
-      }
-      RegexText.push_back('$');
-      llvm::Regex Glob(RegexText);
-      std::string Error;
-      if (!Glob.isValid(Error)) {
-        AnyInvalid = true;
-        llvm::WithColor::error(llvm::errs(), Source)
-            << "building check glob '" << Cur << "' " << Error << "'\n";
-        continue;
-      }
-      if (llvm::none_of(AllChecks.keys(),
-                        [&Glob](StringRef S) { return Glob.match(S); })) {
-        AnyInvalid = true;
+    if (llvm::none_of(AllChecks.keys(), [&Reg](StringRef S) {
+          llvm::errs() << S << '\n';
+          return Reg.match(S);
+        })) {
+      AnyInvalid = true;
+      if (Item.Text.contains('*'))
----------------
felix642 wrote:

You are right, I've changed slightly this function's semantics, but it shouldn't have any observable behaviour change for the end user.
Before my change, we would only create a regex and try to match it if the string contains any '*' character. But, since the GlobList implicitly generates a list of Regexes we always try to match the strings through regexp now. This won't make a difference since the generated regex will only match the exact string if it doesn't contain any '*', but it will be a bit less efficient since we are going through pattern matching rather than string comparison.

I've moved the condition with the '*' character at the of the method to preserve the error message that is displayed to the user in case we can't find any matching check. (Either with "check glob ... doesn't match" if it contains a '*' or "unknown check ..." if it doesn't). 

https://github.com/llvm/llvm-project/pull/85591


More information about the cfe-commits mailing list