[LLVMdev] Format of special case list for sanitizers

Sean Silva chisophugis at gmail.com
Wed Apr 15 08:31:14 PDT 2015


On Mon, Apr 6, 2015 at 12:47 AM, Ryan Govostes <rzg at apple.com> wrote:

> The documentation for the sanitizer special case list format[0] says,
>
> > The meaning of * in regular expression for entity names is different -
> it is treated as in shell wildcarding.
>
>
> In SpecialCaseList::parse, we see that this is just replacing * with .*:
>
>     // Replace * with .*
>     for (size_t pos = 0; (pos = Regexp.find("*", pos)) != std::string::npos;
>          pos += strlen(".*")) {
>       Regexp.replace(pos, strlen("*"), ".*");
>     }
>
> This seems to introduce more problems than it solves, since (i) this
> doesn’t really behave like a shell globbing wildcard as advertised, and
> (ii) if the user tries to use * as a regex quantifier, this will match
> incorrectly: A* matches the empty string and any number of As, while A.*
> matches all strings that start with at least one A.
>
> If it’s forgivable to break compatibility here, we should do regular
> expressions _or_ shell globbing, and not a hybrid format.
>

This really doesn't seem compelling enough to merit breaking compatibility,
especially since there are already users in the wild.

-- Sean Silva


>
> I’d prefer shell globbing for paths in src entities, but that isn’t as
> useful for function names. Most filenames will contain periods, which also
> need to be escaped properly as regular expressions. (This also limits the
> usefulness of treating literals separately.)
>
> (Just a note: the way that regular expressions are concatenated in ::parse
> appears to have a bug if a pattern contains a pipe.)
>
> Ryan
>
>
> 0: http://clang.llvm.org/docs/SanitizerSpecialCaseList.html
>
>
> diff --git a/lib/Support/SpecialCaseList.cpp b/lib/Support/SpecialCaseList.cpp
> index c312cc1..2972cb1 100644
> --- a/lib/Support/SpecialCaseList.cpp
> +++ b/lib/Support/SpecialCaseList.cpp
> @@ -133,7 +133,7 @@ bool SpecialCaseList::parse(const MemoryBuffer *MB, std::string &Error) {
>      // Add this regexp into the proper group by its prefix.
>      if (!Regexps[Prefix][Category].empty())
>        Regexps[Prefix][Category] += "|";
> -    Regexps[Prefix][Category] += "^" + Regexp + "$";
> +    Regexps[Prefix][Category] += "^(" + Regexp + ")$";
>    }
>    return true;
>  }
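
The pipe issue mentioned in the quoted note can be illustrated with a short
sketch. It assumes the combined pattern is used with an unanchored search, and
it uses std::regex in POSIX extended mode as a stand-in for the real matcher;
anchorBare, anchorGrouped and searchMatches are hypothetical names introduced
only for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: anchoring without grouping. For "foo|bar" this
// produces "^foo|bar$", i.e. "starts with foo" OR "ends with bar".
std::string anchorBare(const std::string &Regexp) {
  return "^" + Regexp + "$";
}

// Hypothetical helper: anchoring with a group, so that both anchors apply
// to every alternative: "^(foo|bar)$".
std::string anchorGrouped(const std::string &Regexp) {
  return "^(" + Regexp + ")$";
}

// Hypothetical helper: unanchored search, assumed here to approximate how
// the combined pattern is evaluated.
bool searchMatches(const std::string &Pattern, const std::string &Name) {
  return std::regex_search(Name, std::regex(Pattern, std::regex::extended));
}
```

Under these assumptions, the bare form accepts "foobaz" and "xbar" for the
pattern "foo|bar", whereas the grouped form accepts only "foo" and "bar".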