[clang] 8eb3470 - [SpecialCaseList] Add option to use Globs instead of Regex to match patterns
Ellis Hoag via cfe-commits
cfe-commits at lists.llvm.org
Fri Sep 1 09:06:17 PDT 2023
Author: Ellis Hoag
Date: 2023-09-01T09:06:11-07:00
New Revision: 8eb34700c2b1847ec6dfb8f92b305b65278d2ec0
URL: https://github.com/llvm/llvm-project/commit/8eb34700c2b1847ec6dfb8f92b305b65278d2ec0
DIFF: https://github.com/llvm/llvm-project/commit/8eb34700c2b1847ec6dfb8f92b305b65278d2ec0.diff
LOG: [SpecialCaseList] Add option to use Globs instead of Regex to match patterns
Add an option in `SpecialCaseList` to use Globs instead of Regex to match patterns. `GlobPattern` was extended in https://reviews.llvm.org/D153587 to support brace expansions, which allows us to use patterns like `*/src/foo.{c,cpp}`. It turns out that most patterns only take advantage of `*`, so using Regex was overkill and required lots of escaping in practice. This often led to bugs due to forgetting to escape special characters.
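To illustrate the escaping problem, here is a minimal, self-contained sketch of the legacy (regex) translation, in which every `*` becomes `.*` and the result is anchored as `^(...)$`. The helper names `translateV1`, `matchesV1`, and `demo` are hypothetical, and `std::regex` in POSIX extended mode stands in for `llvm::Regex`, so edge cases may differ from the real implementation.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: mimics the legacy translation of a pattern,
// replacing each '*' with ".*" and anchoring the result as ^(...)$.
std::string translateV1(std::string Pattern) {
  for (std::string::size_type Pos = 0;
       (Pos = Pattern.find('*', Pos)) != std::string::npos; Pos += 2)
    Pattern.replace(Pos, 1, ".*");
  return "^(" + Pattern + ")$";
}

// Assumption: std::regex (POSIX extended syntax) approximates llvm::Regex.
bool matchesV1(const std::string &Pattern, const std::string &Query) {
  std::regex RE(translateV1(Pattern), std::regex::extended);
  return std::regex_match(Query, RE);
}

// Small demonstration of two easy-to-miss behaviors.
void demo() {
  // An unescaped '.' matches any character, so "a.c" also matches "abc".
  assert(matchesV1("a.c", "abc"));
  // "foo.*" is rewritten to "foo..*", which requires a character after "foo".
  assert(!matchesV1("foo.*", "foo"));
  assert(matchesV1("foo.*", "fooz"));
}
```

Under these assumptions, a file name such as `a.c` must be written as `a\.c` to match literally in the legacy format, whereas in a glob the `.` has no special meaning.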
Since this would be a breaking change, we temporarily support Regex by default and use Globs when `#!special-case-list-v2` is the first line in the file. Users should switch to the glob format described in https://llvm.org/doxygen/classllvm_1_1GlobPattern.html. For example, `(abc|def)` should become `{abc,def}`.
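The opt-in check can be sketched as follows. This is a simplified stand-in for the `startswith("#!special-case-list-v2\n")` test in `SpecialCaseList::parse` from this commit; the function names `usesGlobs` and `demo` are hypothetical, and `std::string` is used in place of `StringRef`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the header check: the glob format is selected
// only when the buffer begins with the exact header followed by '\n'.
bool usesGlobs(const std::string &Buffer) {
  const std::string Header = "#!special-case-list-v2\n";
  return Buffer.compare(0, Header.size(), Header) == 0;
}

void demo() {
  // Header on the first line: patterns are treated as globs.
  assert(usesGlobs("#!special-case-list-v2\nfun:{abc,def}\n"));
  // No header: the legacy regex behavior stays in effect.
  assert(!usesGlobs("fun:(abc|def)\n"));
  // The header is only recognized on the very first line.
  assert(!usesGlobs("# comment\n#!special-case-list-v2\n"));
}
```

Because the comparison includes the trailing newline, this sketch (like the check it mirrors) would not recognize a header that is followed by `\r\n` or that ends the file without a newline.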
See discussion in https://reviews.llvm.org/D152762 and https://discourse.llvm.org/t/use-glob-instead-of-regex-for-specialcaselists/71666.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D154014
Added:
Modified:
clang/docs/SanitizerSpecialCaseList.rst
clang/lib/Basic/ProfileList.cpp
clang/lib/Basic/SanitizerSpecialCaseList.cpp
llvm/include/llvm/Support/SpecialCaseList.h
llvm/lib/Support/SpecialCaseList.cpp
llvm/unittests/Support/SpecialCaseListTest.cpp
Removed:
################################################################################
diff --git a/clang/docs/SanitizerSpecialCaseList.rst b/clang/docs/SanitizerSpecialCaseList.rst
index 15e19b9c129ca46..ab39276b0439577 100644
--- a/clang/docs/SanitizerSpecialCaseList.rst
+++ b/clang/docs/SanitizerSpecialCaseList.rst
@@ -15,7 +15,7 @@ file at compile-time.
Goal and usage
==============
-User of sanitizer tools, such as :doc:`AddressSanitizer`, :doc:`ThreadSanitizer`
+Users of sanitizer tools, such as :doc:`AddressSanitizer`, :doc:`ThreadSanitizer`
or :doc:`MemorySanitizer` may want to disable or alter some checks for
certain source-level entities to:
@@ -54,37 +54,48 @@ Format
Ignorelists consist of entries, optionally grouped into sections. Empty lines
and lines starting with "#" are ignored.
-Section names are regular expressions written in square brackets that denote
+.. note::
+
+ In `D154014 <https://reviews.llvm.org/D154014>`_ we transitioned to using globs instead
+ of regexes to match patterns in special case lists. Since this was a
+ breaking change, we will temporarily support the original behavior using
+ regexes. If ``#!special-case-list-v2`` is the first line of the file, then
+ we will use the new behavior using globs. For more details, see
+ `this discourse post <https://discourse.llvm.org/t/use-glob-instead-of-regex-for-specialcaselists/71666>`_.
+
+
+Section names are globs written in square brackets that denote
which sanitizer the following entries apply to. For example, ``[address]``
-specifies AddressSanitizer while ``[cfi-vcall|cfi-icall]`` specifies Control
+specifies AddressSanitizer while ``[{cfi-vcall,cfi-icall}]`` specifies Control
Flow Integrity virtual and indirect call checking. Entries without a section
will be placed under the ``[*]`` section applying to all enabled sanitizers.
-Entries contain an entity type, followed by a colon and a regular expression,
+Entries contain an entity type, followed by a colon and a glob,
specifying the names of the entities, optionally followed by an equals sign and
-a tool-specific category, e.g. ``fun:*ExampleFunc=example_category``. The
-meaning of ``*`` in regular expression for entity names is different - it is
-treated as in shell wildcarding. Two generic entity types are ``src`` and
+a tool-specific category, e.g. ``fun:*ExampleFunc=example_category``.
+Two generic entity types are ``src`` and
``fun``, which allow users to specify source files and functions, respectively.
Some sanitizer tools may introduce custom entity types and categories - refer to
tool-specific docs.
.. code-block:: bash
+ #!special-case-list-v2
+ # The line above is explained in the note above
# Lines starting with # are ignored.
- # Turn off checks for the source file (use absolute path or path relative
- # to the current working directory):
- src:/path/to/source/file.c
+ # Turn off checks for the source file
+ # Entries without sections are placed into [*] and apply to all sanitizers
+ src:path/to/source/file.c
+ src:*/source/file.c
# Turn off checks for this main file, including files included by it.
# Useful when the main file instead of an included file should be ignored.
mainfile:file.c
# Turn off checks for a particular functions (use mangled names):
- fun:MyFooBar
fun:_Z8MyFooBarv
- # Extended regular expressions are supported:
- fun:bad_(foo|bar)
+ # Glob brace expansions and character ranges are supported
+ fun:bad_{foo,bar}
src:bad_source[1-9].c
- # Shell like usage of * is supported (* is treated as .*):
+ # "*" matches zero or more characters
src:bad/sources/*
fun:*BadFunction*
# Specific sanitizer tools may introduce categories.
@@ -92,10 +103,9 @@ tool-specific docs.
# Sections can be used to limit ignorelist entries to specific sanitizers
[address]
fun:*BadASanFunc*
- # Section names are regular expressions
- [cfi-vcall|cfi-icall]
+ # Section names are globs
+ [{cfi-vcall,cfi-icall}]
fun:*BadCfiCall
- # Entries without sections are placed into [*] and apply to all sanitizers
``mainfile`` is similar to applying ``-fno-sanitize=`` to a set of files but
does not need plumbing into the build system. This works well for internal
diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp
index eea1b1e60ec703f..8fa16e2eb069a52 100644
--- a/clang/lib/Basic/ProfileList.cpp
+++ b/clang/lib/Basic/ProfileList.cpp
@@ -36,8 +36,8 @@ class ProfileSpecialCaseList : public llvm::SpecialCaseList {
bool isEmpty() const { return Sections.empty(); }
bool hasPrefix(StringRef Prefix) const {
- for (auto &SectionIter : Sections)
- if (SectionIter.Entries.count(Prefix) > 0)
+ for (const auto &It : Sections)
+ if (It.second.Entries.count(Prefix) > 0)
return true;
return false;
}
diff --git a/clang/lib/Basic/SanitizerSpecialCaseList.cpp b/clang/lib/Basic/SanitizerSpecialCaseList.cpp
index 2dbf04c6ede9721..b02e868cdaa44e0 100644
--- a/clang/lib/Basic/SanitizerSpecialCaseList.cpp
+++ b/clang/lib/Basic/SanitizerSpecialCaseList.cpp
@@ -37,7 +37,8 @@ SanitizerSpecialCaseList::createOrDie(const std::vector<std::string> &Paths,
}
void SanitizerSpecialCaseList::createSanitizerSections() {
- for (auto &S : Sections) {
+ for (auto &It : Sections) {
+ auto &S = It.second;
SanitizerMask Mask;
#define SANITIZER(NAME, ID) \
diff --git a/llvm/include/llvm/Support/SpecialCaseList.h b/llvm/include/llvm/Support/SpecialCaseList.h
index b6d1b56a09623cc..6dc1a29c5a281d0 100644
--- a/llvm/include/llvm/Support/SpecialCaseList.h
+++ b/llvm/include/llvm/Support/SpecialCaseList.h
@@ -5,47 +5,7 @@
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//===----------------------------------------------------------------------===//
//
-// This is a utility class used to parse user-provided text files with
-// "special case lists" for code sanitizers. Such files are used to
-// define an "ABI list" for DataFlowSanitizer and allow/exclusion lists for
-// sanitizers like AddressSanitizer or UndefinedBehaviorSanitizer.
-//
-// Empty lines and lines starting with "#" are ignored. Sections are defined
-// using a '[section_name]' header and can be used to specify sanitizers the
-// entries below it apply to. Section names are regular expressions, and
-// entries without a section header match all sections (e.g. an '[*]' header
-// is assumed.)
-// The remaining lines should have the form:
-// prefix:wildcard_expression[=category]
-// If category is not specified, it is assumed to be empty string.
-// Definitions of "prefix" and "category" are sanitizer-specific. For example,
-// sanitizer exclusion support prefixes "src", "mainfile", "fun" and "global".
-// Wildcard expressions define, respectively, source files, main files,
-// functions or globals which shouldn't be instrumented.
-// Examples of categories:
-// "functional": used in DFSan to list functions with pure functional
-// semantics.
-// "init": used in ASan exclusion list to disable initialization-order bugs
-// detection for certain globals or source files.
-// Full special case list file example:
-// ---
-// [address]
-// # Excluded items:
-// fun:*_ZN4base6subtle*
-// global:*global_with_bad_access_or_initialization*
-// global:*global_with_initialization_issues*=init
-// type:*Namespace::ClassName*=init
-// src:file_with_tricky_code.cc
-// src:ignore-global-initializers-issues.cc=init
-// mainfile:main_file.cc
-//
-// [dataflow]
-// # Functions with pure functional semantics:
-// fun:cos=functional
-// fun:sin=functional
-// ---
-// Note that the wild card is in fact an llvm::Regex, but * is automatically
-// replaced with .*
+// This file implements a Special Case List for code sanitizers.
//
//===----------------------------------------------------------------------===//
@@ -53,6 +13,7 @@
#define LLVM_SUPPORT_SPECIALCASELIST_H
#include "llvm/ADT/StringMap.h"
+#include "llvm/Support/GlobPattern.h"
#include "llvm/Support/Regex.h"
#include <memory>
#include <string>
@@ -66,6 +27,45 @@ namespace vfs {
class FileSystem;
}
+/// This is a utility class used to parse user-provided text files with
+/// "special case lists" for code sanitizers. Such files are used to
+/// define an "ABI list" for DataFlowSanitizer and allow/exclusion lists for
+/// sanitizers like AddressSanitizer or UndefinedBehaviorSanitizer.
+///
+/// Empty lines and lines starting with "#" are ignored. Sections are defined
+/// using a '[section_name]' header and can be used to specify sanitizers the
+/// entries below it apply to. Section names are globs, and
+/// entries without a section header match all sections (e.g. an '[*]' header
+/// is assumed.)
+/// The remaining lines should have the form:
+/// prefix:glob_pattern[=category]
+/// If category is not specified, it is assumed to be empty string.
+/// Definitions of "prefix" and "category" are sanitizer-specific. For example,
+/// sanitizer exclusion support prefixes "src", "mainfile", "fun" and "global".
+/// "glob_pattern" defines source files, main files, functions or globals which
+/// shouldn't be instrumented.
+/// Examples of categories:
+/// "functional": used in DFSan to list functions with pure functional
+/// semantics.
+/// "init": used in ASan exclusion list to disable initialization-order bugs
+/// detection for certain globals or source files.
+/// Full special case list file example:
+/// ---
+/// [address]
+/// # Excluded items:
+/// fun:*_ZN4base6subtle*
+/// global:*global_with_bad_access_or_initialization*
+/// global:*global_with_initialization_issues*=init
+/// type:*Namespace::ClassName*=init
+/// src:file_with_tricky_code.cc
+/// src:ignore-global-initializers-issues.cc=init
+/// mainfile:main_file.cc
+///
+/// [dataflow]
+/// # Functions with pure functional semantics:
+/// fun:cos=functional
+/// fun:sin=functional
+/// ---
class SpecialCaseList {
public:
/// Parses the special case list entries from files. On failure, returns
@@ -88,7 +88,7 @@ class SpecialCaseList {
/// \code
/// @Prefix:<E>=@Category
/// \endcode
- /// where @Query satisfies wildcard expression <E> in a given @Section.
+ /// where @Query satisfies the glob <E> in a given @Section.
bool inSection(StringRef Section, StringRef Prefix, StringRef Query,
StringRef Category = StringRef()) const;
@@ -97,7 +97,7 @@ class SpecialCaseList {
/// \code
/// @Prefix:<E>=@Category
/// \endcode
- /// where @Query satisfies wildcard expression <E> in a given @Section.
+ /// where @Query satisfies the glob <E> in a given @Section.
/// Returns zero if there is no exclusion entry corresponding to this
/// expression.
unsigned inSectionBlame(StringRef Section, StringRef Prefix, StringRef Query,
@@ -114,19 +114,16 @@ class SpecialCaseList {
SpecialCaseList(SpecialCaseList const &) = delete;
SpecialCaseList &operator=(SpecialCaseList const &) = delete;
- /// Represents a set of regular expressions. Regular expressions which are
- /// "literal" (i.e. no regex metacharacters) are stored in Strings. The
- /// reason for doing so is efficiency; StringMap is much faster at matching
- /// literal strings than Regex.
+ /// Represents a set of globs and their line numbers
class Matcher {
public:
- bool insert(std::string Regexp, unsigned LineNumber, std::string &REError);
+ Error insert(StringRef Pattern, unsigned LineNumber, bool UseRegex);
// Returns the line number in the source file that this query matches to.
// Returns zero if no match is found.
unsigned match(StringRef Query) const;
private:
- StringMap<unsigned> Strings;
+ StringMap<std::pair<GlobPattern, unsigned>> Globs;
std::vector<std::pair<std::unique_ptr<Regex>, unsigned>> RegExes;
};
@@ -134,16 +131,19 @@ class SpecialCaseList {
struct Section {
Section(std::unique_ptr<Matcher> M) : SectionMatcher(std::move(M)){};
+ Section() : Section(std::make_unique<Matcher>()) {}
std::unique_ptr<Matcher> SectionMatcher;
SectionEntries Entries;
};
- std::vector<Section> Sections;
+ StringMap<Section> Sections;
+
+ Expected<Section *> addSection(StringRef SectionStr, unsigned LineNo,
+ bool UseGlobs = true);
/// Parses just-constructed SpecialCaseList entries from a memory buffer.
- bool parse(const MemoryBuffer *MB, StringMap<size_t> &SectionsMap,
- std::string &Error);
+ bool parse(const MemoryBuffer *MB, std::string &Error);
// Helper method for derived classes to search by Prefix, Query, and Category
// once they have already resolved a section entry.
diff --git a/llvm/lib/Support/SpecialCaseList.cpp b/llvm/lib/Support/SpecialCaseList.cpp
index 64f66e0f817924a..ac693eca44be8b4 100644
--- a/llvm/lib/Support/SpecialCaseList.cpp
+++ b/llvm/lib/Support/SpecialCaseList.cpp
@@ -14,58 +14,70 @@
//===----------------------------------------------------------------------===//
#include "llvm/Support/SpecialCaseList.h"
-#include "llvm/ADT/SmallVector.h"
+#include "llvm/Support/LineIterator.h"
#include "llvm/Support/MemoryBuffer.h"
-#include "llvm/Support/Regex.h"
#include "llvm/Support/VirtualFileSystem.h"
+#include <stdio.h>
#include <string>
#include <system_error>
#include <utility>
-#include <stdio.h>
namespace llvm {
-bool SpecialCaseList::Matcher::insert(std::string Regexp,
- unsigned LineNumber,
- std::string &REError) {
- if (Regexp.empty()) {
- REError = "Supplied regexp was blank";
- return false;
- }
+Error SpecialCaseList::Matcher::insert(StringRef Pattern, unsigned LineNumber,
+ bool UseGlobs) {
+ if (Pattern.empty())
+ return createStringError(errc::invalid_argument,
+ Twine("Supplied ") +
+ (UseGlobs ? "glob" : "regex") + " was blank");
+
+ if (!UseGlobs) {
+ // Replace * with .*
+ auto Regexp = Pattern.str();
+ for (size_t pos = 0; (pos = Regexp.find('*', pos)) != std::string::npos;
+ pos += strlen(".*")) {
+ Regexp.replace(pos, strlen("*"), ".*");
+ }
- if (Regex::isLiteralERE(Regexp)) {
- Strings[Regexp] = LineNumber;
- return true;
- }
+ Regexp = (Twine("^(") + StringRef(Regexp) + ")$").str();
- // Replace * with .*
- for (size_t pos = 0; (pos = Regexp.find('*', pos)) != std::string::npos;
- pos += strlen(".*")) {
- Regexp.replace(pos, strlen("*"), ".*");
- }
+ // Check that the regexp is valid.
+ Regex CheckRE(Regexp);
+ std::string REError;
+ if (!CheckRE.isValid(REError))
+ return createStringError(errc::invalid_argument, REError);
- Regexp = (Twine("^(") + StringRef(Regexp) + ")$").str();
+ RegExes.emplace_back(std::make_pair(
+ std::make_unique<Regex>(std::move(CheckRE)), LineNumber));
- // Check that the regexp is valid.
- Regex CheckRE(Regexp);
- if (!CheckRE.isValid(REError))
- return false;
+ return Error::success();
+ }
- RegExes.emplace_back(
- std::make_pair(std::make_unique<Regex>(std::move(CheckRE)), LineNumber));
- return true;
+ auto [It, DidEmplace] = Globs.try_emplace(Pattern);
+ if (DidEmplace) {
+ // We must be sure to use the string in the map rather than the provided
+ // reference which could be destroyed before match() is called
+ Pattern = It->getKey();
+ auto &Pair = It->getValue();
+ if (auto Err = GlobPattern::create(Pattern, /*MaxSubPatterns=*/1024)
+ .moveInto(Pair.first))
+ return Err;
+ Pair.second = LineNumber;
+ }
+ return Error::success();
}
unsigned SpecialCaseList::Matcher::match(StringRef Query) const {
- auto It = Strings.find(Query);
- if (It != Strings.end())
- return It->second;
- for (const auto &RegExKV : RegExes)
- if (RegExKV.first->match(Query))
- return RegExKV.second;
+ for (const auto &[Pattern, Pair] : Globs)
+ if (Pair.first.match(Query))
+ return Pair.second;
+ for (const auto &[Regex, LineNumber] : RegExes)
+ if (Regex->match(Query))
+ return LineNumber;
return 0;
}
+// TODO: Refactor this to return Expected<...>
std::unique_ptr<SpecialCaseList>
SpecialCaseList::create(const std::vector<std::string> &Paths,
llvm::vfs::FileSystem &FS, std::string &Error) {
@@ -94,7 +106,6 @@ SpecialCaseList::createOrDie(const std::vector<std::string> &Paths,
bool SpecialCaseList::createInternal(const std::vector<std::string> &Paths,
vfs::FileSystem &VFS, std::string &Error) {
- StringMap<size_t> Sections;
for (const auto &Path : Paths) {
ErrorOr<std::unique_ptr<MemoryBuffer>> FileOrErr =
VFS.getBufferForFile(Path);
@@ -103,7 +114,7 @@ bool SpecialCaseList::createInternal(const std::vector<std::string> &Paths,
return false;
}
std::string ParseError;
- if (!parse(FileOrErr.get().get(), Sections, ParseError)) {
+ if (!parse(FileOrErr.get().get(), ParseError)) {
Error = (Twine("error parsing file '") + Path + "': " + ParseError).str();
return false;
}
@@ -113,82 +124,79 @@ bool SpecialCaseList::createInternal(const std::vector<std::string> &Paths,
bool SpecialCaseList::createInternal(const MemoryBuffer *MB,
std::string &Error) {
- StringMap<size_t> Sections;
- if (!parse(MB, Sections, Error))
+ if (!parse(MB, Error))
return false;
return true;
}
-bool SpecialCaseList::parse(const MemoryBuffer *MB,
- StringMap<size_t> &SectionsMap,
- std::string &Error) {
- // Iterate through each line in the exclusion list file.
- SmallVector<StringRef, 16> Lines;
- MB->getBuffer().split(Lines, '\n');
+Expected<SpecialCaseList::Section *>
+SpecialCaseList::addSection(StringRef SectionStr, unsigned LineNo,
+ bool UseGlobs) {
+ auto [It, DidEmplace] = Sections.try_emplace(SectionStr);
+ auto &Section = It->getValue();
+ if (DidEmplace)
+ if (auto Err = Section.SectionMatcher->insert(SectionStr, LineNo, UseGlobs))
+ return createStringError(errc::invalid_argument,
+ "malformed section at line " + Twine(LineNo) +
+ ": '" + SectionStr +
+ "': " + toString(std::move(Err)));
+ return &Section;
+}
- unsigned LineNo = 1;
- StringRef Section = "*";
+bool SpecialCaseList::parse(const MemoryBuffer *MB, std::string &Error) {
+ Section *CurrentSection;
+ if (auto Err = addSection("*", 1).moveInto(CurrentSection)) {
+ Error = toString(std::move(Err));
+ return false;
+ }
- for (auto I = Lines.begin(), E = Lines.end(); I != E; ++I, ++LineNo) {
- *I = I->trim();
- // Ignore empty lines and lines starting with "#"
- if (I->empty() || I->startswith("#"))
+ // In https://reviews.llvm.org/D154014 we transitioned to using globs instead
+ // of regexes to match patterns in special case lists. Since this was a
+ // breaking change, we will temporarily support the original behavior using
+ // regexes. If "#!special-case-list-v2" is the first line of the file, then
+ // we will use the new behavior using globs. For more details, see
+ // https://discourse.llvm.org/t/use-glob-instead-of-regex-for-specialcaselists/71666
+ bool UseGlobs = MB->getBuffer().startswith("#!special-case-list-v2\n");
+
+ for (line_iterator LineIt(*MB, /*SkipBlanks=*/true, /*CommentMarker=*/'#');
+ !LineIt.is_at_eof(); LineIt++) {
+ unsigned LineNo = LineIt.line_number();
+ StringRef Line = LineIt->trim();
+ if (Line.empty())
continue;
// Save section names
- if (I->startswith("[")) {
- if (!I->endswith("]")) {
- Error = (Twine("malformed section header on line ") + Twine(LineNo) +
- ": " + *I).str();
- return false;
- }
-
- Section = I->slice(1, I->size() - 1);
-
- std::string REError;
- Regex CheckRE(Section);
- if (!CheckRE.isValid(REError)) {
+ if (Line.startswith("[")) {
+ if (!Line.endswith("]")) {
Error =
- (Twine("malformed regex for section ") + Section + ": '" + REError)
+ ("malformed section header on line " + Twine(LineNo) + ": " + Line)
.str();
return false;
}
+ if (auto Err = addSection(Line.drop_front().drop_back(), LineNo, UseGlobs)
+ .moveInto(CurrentSection)) {
+ Error = toString(std::move(Err));
+ return false;
+ }
continue;
}
- // Get our prefix and unparsed regexp.
- std::pair<StringRef, StringRef> SplitLine = I->split(":");
- StringRef Prefix = SplitLine.first;
- if (SplitLine.second.empty()) {
+ // Get our prefix and unparsed glob.
+ auto [Prefix, Postfix] = Line.split(":");
+ if (Postfix.empty()) {
// Missing ':' in the line.
- Error = (Twine("malformed line ") + Twine(LineNo) + ": '" +
- SplitLine.first + "'").str();
+ Error = ("malformed line " + Twine(LineNo) + ": '" + Line + "'").str();
return false;
}
- std::pair<StringRef, StringRef> SplitRegexp = SplitLine.second.split("=");
- std::string Regexp = std::string(SplitRegexp.first);
- StringRef Category = SplitRegexp.second;
-
- // Create this section if it has not been seen before.
- if (!SectionsMap.contains(Section)) {
- std::unique_ptr<Matcher> M = std::make_unique<Matcher>();
- std::string REError;
- if (!M->insert(std::string(Section), LineNo, REError)) {
- Error = (Twine("malformed section ") + Section + ": '" + REError).str();
- return false;
- }
-
- SectionsMap[Section] = Sections.size();
- Sections.emplace_back(std::move(M));
- }
-
- auto &Entry = Sections[SectionsMap[Section]].Entries[Prefix][Category];
- std::string REError;
- if (!Entry.insert(std::move(Regexp), LineNo, REError)) {
- Error = (Twine("malformed regex in line ") + Twine(LineNo) + ": '" +
- SplitLine.second + "': " + REError).str();
+ auto [Pattern, Category] = Postfix.split("=");
+ auto &Entry = CurrentSection->Entries[Prefix][Category];
+ if (auto Err = Entry.insert(Pattern, LineNo, UseGlobs)) {
+ Error =
+ (Twine("malformed ") + (UseGlobs ? "glob" : "regex") + " in line " +
+ Twine(LineNo) + ": '" + Pattern + "': " + toString(std::move(Err)))
+ .str();
return false;
}
}
@@ -205,13 +213,14 @@ bool SpecialCaseList::inSection(StringRef Section, StringRef Prefix,
unsigned SpecialCaseList::inSectionBlame(StringRef Section, StringRef Prefix,
StringRef Query,
StringRef Category) const {
- for (const auto &SectionIter : Sections)
- if (SectionIter.SectionMatcher->match(Section)) {
- unsigned Blame =
- inSectionBlame(SectionIter.Entries, Prefix, Query, Category);
+ for (const auto &It : Sections) {
+ const auto &S = It.getValue();
+ if (S.SectionMatcher->match(Section)) {
+ unsigned Blame = inSectionBlame(S.Entries, Prefix, Query, Category);
if (Blame)
return Blame;
}
+ }
return 0;
}
@@ -226,4 +235,4 @@ unsigned SpecialCaseList::inSectionBlame(const SectionEntries &Entries,
return II->getValue().match(Query);
}
-} // namespace llvm
+} // namespace llvm
diff --git a/llvm/unittests/Support/SpecialCaseListTest.cpp b/llvm/unittests/Support/SpecialCaseListTest.cpp
index bee639df6a66677..81faeca5d63571e 100644
--- a/llvm/unittests/Support/SpecialCaseListTest.cpp
+++ b/llvm/unittests/Support/SpecialCaseListTest.cpp
@@ -10,8 +10,11 @@
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/VirtualFileSystem.h"
+#include "gmock/gmock.h"
#include "gtest/gtest.h"
+using testing::HasSubstr;
+using testing::StartsWith;
using namespace llvm;
namespace {
@@ -19,24 +22,32 @@ namespace {
class SpecialCaseListTest : public ::testing::Test {
protected:
std::unique_ptr<SpecialCaseList> makeSpecialCaseList(StringRef List,
- std::string &Error) {
- std::unique_ptr<MemoryBuffer> MB = MemoryBuffer::getMemBuffer(List);
+ std::string &Error,
+ bool UseGlobs = true) {
+ auto S = List.str();
+ if (UseGlobs)
+ S = (Twine("#!special-case-list-v2\n") + S).str();
+ std::unique_ptr<MemoryBuffer> MB = MemoryBuffer::getMemBuffer(S);
return SpecialCaseList::create(MB.get(), Error);
}
- std::unique_ptr<SpecialCaseList> makeSpecialCaseList(StringRef List) {
+ std::unique_ptr<SpecialCaseList> makeSpecialCaseList(StringRef List,
+ bool UseGlobs = true) {
std::string Error;
- auto SCL = makeSpecialCaseList(List, Error);
+ auto SCL = makeSpecialCaseList(List, Error, UseGlobs);
assert(SCL);
assert(Error == "");
return SCL;
}
- std::string makeSpecialCaseListFile(StringRef Contents) {
+ std::string makeSpecialCaseListFile(StringRef Contents,
+ bool UseGlobs = true) {
int FD;
SmallString<64> Path;
sys::fs::createTemporaryFile("SpecialCaseListTest", "temp", FD, Path);
raw_fd_ostream OF(FD, true, true);
+ if (UseGlobs)
+ OF << "#!special-case-list-v2\n";
OF << Contents;
OF.close();
return std::string(Path.str());
@@ -59,10 +70,10 @@ TEST_F(SpecialCaseListTest, Basic) {
EXPECT_FALSE(SCL->inSection("", "fun", "hello"));
EXPECT_FALSE(SCL->inSection("", "src", "hello", "category"));
- EXPECT_EQ(3u, SCL->inSectionBlame("", "src", "hello"));
- EXPECT_EQ(4u, SCL->inSectionBlame("", "src", "bye"));
- EXPECT_EQ(5u, SCL->inSectionBlame("", "src", "hi", "category"));
- EXPECT_EQ(6u, SCL->inSectionBlame("", "src", "zzzz", "category"));
+ EXPECT_EQ(4u, SCL->inSectionBlame("", "src", "hello"));
+ EXPECT_EQ(5u, SCL->inSectionBlame("", "src", "bye"));
+ EXPECT_EQ(6u, SCL->inSectionBlame("", "src", "hi", "category"));
+ EXPECT_EQ(7u, SCL->inSectionBlame("", "src", "zzzz", "category"));
EXPECT_EQ(0u, SCL->inSectionBlame("", "src", "hi"));
EXPECT_EQ(0u, SCL->inSectionBlame("", "fun", "hello"));
EXPECT_EQ(0u, SCL->inSectionBlame("", "src", "hello", "category"));
@@ -74,31 +85,31 @@ TEST_F(SpecialCaseListTest, CorrectErrorLineNumberWithBlankLine) {
"\n"
"[not valid\n",
Error));
- EXPECT_TRUE(
- ((StringRef)Error).startswith("malformed section header on line 3:"));
+ EXPECT_THAT(Error, StartsWith("malformed section header on line 4:"));
EXPECT_EQ(nullptr, makeSpecialCaseList("\n\n\n"
"[not valid\n",
Error));
- EXPECT_TRUE(
- ((StringRef)Error).startswith("malformed section header on line 4:"));
+ EXPECT_THAT(Error, StartsWith("malformed section header on line 5:"));
}
-TEST_F(SpecialCaseListTest, SectionRegexErrorHandling) {
+TEST_F(SpecialCaseListTest, SectionGlobErrorHandling) {
std::string Error;
EXPECT_EQ(makeSpecialCaseList("[address", Error), nullptr);
- EXPECT_TRUE(((StringRef)Error).startswith("malformed section header "));
+ EXPECT_THAT(Error, StartsWith("malformed section header "));
EXPECT_EQ(makeSpecialCaseList("[[]", Error), nullptr);
- EXPECT_TRUE(((StringRef)Error).startswith("malformed regex for section [: "));
+ EXPECT_EQ(
+ Error,
+ "malformed section at line 2: '[': invalid glob pattern, unmatched '['");
EXPECT_EQ(makeSpecialCaseList("src:=", Error), nullptr);
- EXPECT_TRUE(((StringRef)Error).endswith("Supplied regexp was blank"));
+ EXPECT_THAT(Error, HasSubstr("Supplied glob was blank"));
}
TEST_F(SpecialCaseListTest, Section) {
std::unique_ptr<SpecialCaseList> SCL = makeSpecialCaseList("src:global\n"
- "[sect1|sect2]\n"
+ "[{sect1,sect2}]\n"
"src:test1\n"
"[sect3*]\n"
"src:test2\n");
@@ -152,19 +163,15 @@ TEST_F(SpecialCaseListTest, Substring) {
TEST_F(SpecialCaseListTest, InvalidSpecialCaseList) {
std::string Error;
EXPECT_EQ(nullptr, makeSpecialCaseList("badline", Error));
- EXPECT_EQ("malformed line 1: 'badline'", Error);
+ EXPECT_EQ("malformed line 2: 'badline'", Error);
EXPECT_EQ(nullptr, makeSpecialCaseList("src:bad[a-", Error));
- EXPECT_EQ("malformed regex in line 1: 'bad[a-': invalid character range",
- Error);
- EXPECT_EQ(nullptr, makeSpecialCaseList("src:a.c\n"
- "fun:fun(a\n",
- Error));
- EXPECT_EQ("malformed regex in line 2: 'fun(a': parentheses not balanced",
- Error);
+ EXPECT_EQ(
+ "malformed glob in line 2: 'bad[a-': invalid glob pattern, unmatched '['",
+ Error);
std::vector<std::string> Files(1, "unexisting");
EXPECT_EQ(nullptr,
SpecialCaseList::create(Files, *vfs::getRealFileSystem(), Error));
- EXPECT_EQ(0U, Error.find("can't open file 'unexisting':"));
+ EXPECT_THAT(Error, StartsWith("can't open file 'unexisting':"));
}
TEST_F(SpecialCaseListTest, EmptySpecialCaseList) {
@@ -191,7 +198,7 @@ TEST_F(SpecialCaseListTest, MultipleExclusions) {
}
TEST_F(SpecialCaseListTest, NoTrigramsInRules) {
- std::unique_ptr<SpecialCaseList> SCL = makeSpecialCaseList("fun:b.r\n"
+ std::unique_ptr<SpecialCaseList> SCL = makeSpecialCaseList("fun:b?r\n"
"fun:za*az\n");
EXPECT_TRUE(SCL->inSection("", "fun", "bar"));
EXPECT_FALSE(SCL->inSection("", "fun", "baz"));
@@ -245,4 +252,58 @@ TEST_F(SpecialCaseListTest, EscapedSymbols) {
EXPECT_FALSE(SCL->inSection("", "src", "hello\\\\world"));
}
+TEST_F(SpecialCaseListTest, Version1) {
+ std::unique_ptr<SpecialCaseList> SCL =
+ makeSpecialCaseList("[sect1|sect2]\n"
+ // Does not match foo!
+ "fun:foo.*\n"
+ "fun:abc|def\n"
+ "fun:b.r\n",
+ /*UseGlobs=*/false);
+
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "fooz"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "fooz"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "fooz"));
+
+ // `foo.*` does not match `foo` because the pattern is translated to `foo..*`
+ EXPECT_FALSE(SCL->inSection("sect1", "fun", "foo"));
+
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "abc"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "abc"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "abc"));
+
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "def"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "def"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "def"));
+
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "bar"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "bar"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "bar"));
+}
+
+TEST_F(SpecialCaseListTest, Version2) {
+ std::unique_ptr<SpecialCaseList> SCL = makeSpecialCaseList("[{sect1,sect2}]\n"
+ "fun:foo*\n"
+ "fun:{abc,def}\n"
+ "fun:b?r\n");
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "fooz"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "fooz"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "fooz"));
+
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "foo"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "foo"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "foo"));
+
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "abc"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "abc"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "abc"));
+
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "def"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "def"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "def"));
+
+ EXPECT_TRUE(SCL->inSection("sect1", "fun", "bar"));
+ EXPECT_TRUE(SCL->inSection("sect2", "fun", "bar"));
+ EXPECT_FALSE(SCL->inSection("sect3", "fun", "bar"));
+}
}