[LLVMdev] Format of special case list for sanitizers
Sean Silva
chisophugis at gmail.com
Wed Apr 15 08:31:14 PDT 2015
On Mon, Apr 6, 2015 at 12:47 AM, Ryan Govostes <rzg at apple.com> wrote:
> The documentation for the sanitizer special case list format[0] says,
>
> > The meaning of * in regular expression for entity names is different -
> > it is treated as in shell wildcarding.
>
>
> In SpecialCaseList::parse, we see that this is just replacing * with .*:
>
>   // Replace * with .*
>   for (size_t pos = 0; (pos = Regexp.find("*", pos)) != std::string::npos;
>        pos += strlen(".*")) {
>     Regexp.replace(pos, strlen("*"), ".*");
>   }
>
> This seems to introduce more problems than it solves, since (i) this
> doesn’t really behave like a shell globbing wildcard as advertised, and
> (ii) if the user tries to use * as a regex quantifier, this will match
> incorrectly: A* matches the empty string and any number of As, while A.*
> matches all strings that start with at least one A.
>
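The difference described above can be checked with a small sketch. It assumes std::regex with POSIX extended syntax as a stand-in for llvm::Regex, and the helper names expandStar and fullMatch are hypothetical, not part of SpecialCaseList:

```cpp
#include <regex>
#include <string>

// Hypothetical helper mirroring the quoted loop: every '*' becomes ".*".
std::string expandStar(std::string pattern) {
  const std::string from = "*";
  const std::string to = ".*";
  for (std::size_t pos = 0;
       (pos = pattern.find(from, pos)) != std::string::npos;
       pos += to.size()) {
    pattern.replace(pos, from.size(), to);
  }
  return pattern;
}

// Hypothetical helper: true if the whole text matches the pattern.
// Assumption: std::regex in extended mode approximates llvm::Regex here.
bool fullMatch(const std::string &pattern, const std::string &text) {
  return std::regex_match(text, std::regex(pattern, std::regex::extended));
}
```

Under these assumptions, fullMatch("A*", "") holds but fullMatch(expandStar("A*"), "") does not, because the rewritten pattern "A.*" requires a leading A. Likewise, expandStar leaves "." untouched, so an entry such as "foo.c" also matches "fooxc", which a shell glob would not.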
> If it’s forgivable to break compatibility here, we should do regular
> expressions _or_ shell globbing, and not a hybrid format.
>
This really doesn't seem compelling enough to merit breaking compatibility,
especially since there are already users in the wild.
-- Sean Silva
>
> I’d prefer shell globbing for paths in src entities, but that isn’t as
> useful for function names. Most filenames will contain periods, which also
> need to be escaped properly as regular expressions. (This also limits the
> usefulness of treating literals separately.)
>
> (Just a note: the way that regular expressions are concatenated in ::parse
> appears to have a bug if a pattern contains a pipe.)
>
> Ryan
>
>
> 0: http://clang.llvm.org/docs/SanitizerSpecialCaseList.html
>
>
> diff --git a/lib/Support/SpecialCaseList.cpp b/lib/Support/SpecialCaseList.cpp
> index c312cc1..2972cb1 100644
> --- a/lib/Support/SpecialCaseList.cpp
> +++ b/lib/Support/SpecialCaseList.cpp
> @@ -133,7 +133,7 @@ bool SpecialCaseList::parse(const MemoryBuffer *MB, std::string &Error) {
>      // Add this regexp into the proper group by its prefix.
>      if (!Regexps[Prefix][Category].empty())
>        Regexps[Prefix][Category] += "|";
> -    Regexps[Prefix][Category] += "^" + Regexp + "$";
> +    Regexps[Prefix][Category] += "^(" + Regexp + ")$";
>    }
>    return true;
>  }
>
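A minimal sketch of why a pipe inside a pattern is a problem when entries are joined as "^" + Regexp + "$", and how grouping each entry avoids it. It assumes std::regex in POSIX extended mode as a stand-in for llvm::Regex; joinPatterns and anyMatch are hypothetical names used only for illustration:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: join patterns with '|', anchoring each one.
// With group == false this follows the original concatenation; with
// group == true each pattern is wrapped in parentheses first.
std::string joinPatterns(const std::vector<std::string> &patterns,
                         bool group) {
  std::string combined;
  for (const std::string &p : patterns) {
    if (!combined.empty())
      combined += "|";
    combined += group ? "^(" + p + ")$" : "^" + p + "$";
  }
  return combined;
}

// Hypothetical helper: true if any alternative matches somewhere in text.
// Assumption: extended syntax behaves like llvm::Regex for these patterns.
bool anyMatch(const std::string &combined, const std::string &text) {
  return std::regex_search(text, std::regex(combined, std::regex::extended));
}
```

Without grouping, the single entry "foo|bar" becomes "^foo|bar$", which reads as "starts with foo" or "ends with bar", so "foobaz" matches. With grouping it becomes "^(foo|bar)$" and only the exact strings "foo" and "bar" match.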
>