<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">Hi,</div><div class="gmail_quote"><br></div><div class="gmail_quote">On Wed, Jul 9, 2014 at 1:42 PM, Peter Collingbourne <span dir="ltr"><<a href="mailto:peter@pcc.me.uk" target="_blank">peter@pcc.me.uk</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On Wed, Jul 09, 2014 at 07:40:09PM -0000, Alexey Samsonov wrote:<br>

> Author: samsonov<br>

> Date: Wed Jul  9 14:40:08 2014<br>

> New Revision: 212643<br>

><br>

> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=212643&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=212643&view=rev</a><br>

> Log:<br>

> Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics.<br>

><br>

> Turn llvm::SpecialCaseList into a simple class that parses text files in<br>

> a specified format and knows nothing about LLVM IR. Move this class into<br>

> LLVMSupport library. Implement two users of this class:<br>

>   * DFSanABIList in DFSan instrumentation pass.<br>

>   * SanitizerBlacklist in Clang CodeGen library.<br>

> The latter will be modified to use actual source-level information from frontend<br>

> (source file names) instead of unstable LLVM IR things (LLVM Module identifier).<br>

><br>

> Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.<br>

><br>

> No functionality change.<br>

<br>

</div>Okay.<br>

<br>

Do you have any further refactoring planned for SpecialCaseList? One change<br>

that I realised that I will need to make for DFSan is to change the semantics<br>

for symbol names appearing in the special case list so that they are always<br>

interpreted "literally", in order to allow symbol names that may contain<br>

regex metacharacters to be added to the StringSet. Of course this would be<br>

under a boolean argument somewhere so that the special case list semantics<br>

for the other sanitizers aren't affected.<br></blockquote><div><br></div><div>No, I don't currently plan any further refactoring of SpecialCaseList; instead I'm going to change</div><div>the behavior of SanitizerBlacklist. Literal representation of metacharacters (except for *)</div>

<div>sounds like a good idea - documentation refers to special case list entries as "wildcard</div><div>expressions", not "regular expressions", so it would be weird to force users to escape</div><div>

characters on their own. But, yes, let's make this change affect DFSan only for now,</div><div>just add an extra parameter to the SpecialCaseList factory to specify whether it needs to escape</div><div>metachars when parsing the file.</div>
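<div><br></div><div>A minimal sketch of that idea, written in Python for brevity and not taken from the actual C++ SpecialCaseList code. The names wildcard_to_regex, matches and escape_metachars are hypothetical and exist only for illustration; the assumption is that '*' keeps its wildcard meaning and is replaced with '.*', while every other metacharacter is escaped when the flag is set:</div>
<pre>
```python
import re


def wildcard_to_regex(pattern, escape_metachars=True):
    """Hypothetical helper: turn a special-case-list entry into a regex string.

    '*' always acts as a wildcard and becomes '.*'. When escape_metachars
    is True, every other regex metacharacter is matched literally.
    """
    chunks = pattern.split("*")
    if escape_metachars:
        # Escape each literal chunk so '.', '+', '$' and friends match themselves.
        chunks = [re.escape(chunk) for chunk in chunks]
    # Re-join the chunks with the regex spelling of the wildcard.
    return ".*".join(chunks)


def matches(pattern, name, escape_metachars=True):
    """Return True if the whole of `name` matches the wildcard `pattern`."""
    regex = wildcard_to_regex(pattern, escape_metachars)
    return re.fullmatch(regex, name) is not None


if __name__ == "__main__":
    # With escaping, '.' and '$' in a symbol name are treated literally.
    print(matches("foo.bar", "fooxbar"))
    print(matches("_Z*foo$bar", "_Z3foo$bar"))
    # Without escaping, the same entries are interpreted as raw regex syntax.
    print(matches("foo.bar", "fooxbar", escape_metachars=False))
    print(matches("_Z*foo$bar", "_Z3foo$bar", escape_metachars=False))
```
</pre>
<div>Splitting on '*' before escaping sidesteps the ordering problem of escaping first and then having to recognise the escaped wildcard again.</div>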

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

(The other solution to this problem that I thought of was to teach the tool<br>

that builds the special case list to backslash-escape metacharacters and then<br>

teach SpecialCaseList to try to unescape them, but then I realised that this<br>

wouldn't work for the '*' character because we currently replace it with '.*'<br>

which wouldn't be escaped correctly. Although I guess maybe we could say that<br>

'\*' is treated like a literal '*' in the regex, meaning that one could write<br>

'\\*' to match a literal '*'. But maybe so many levels of escaping would be<br>

too confusing.)<br>

<br>

Thanks,<br>

<span class="HOEnZb"><font color="#888888">--<br>

Peter<br>

</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr">Alexey Samsonov<br><a href="mailto:vonosmas@gmail.com" target="_blank">vonosmas@gmail.com</a></div>

</div></div>