[PATCH] D22353: FileCheck Enhancement - CHECK-WORD

Alexander Kornienko via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 27 19:46:49 PDT 2016


alexfh requested changes to this revision.
alexfh added a comment.
This revision now requires changes to proceed.

In https://reviews.llvm.org/D22353#541171, @eklepilkina wrote:

> > What are our options with the regex library? Is there a newer version of the one we are currently using that we could upgrade to (and that supports \b)?
>
>
> As I understand it, the regex library in use is a version of the OpenBSD library. I couldn't find any newer versions.
>  I don't know much about modern C++ regex libraries, but there are:
>
> 1. PCRE (pcre.h)
> 2. The regex library in Boost, but Boost is too big
> 3. The regex library in POCO


FWIW, there's also re2 (https://github.com/google/re2).

> Maybe there are people who have some experience with C++ regex libraries.
>
> > How much work would it be to implement support for \b ourselves?
>
>
> I can't give an exact estimate. I only had a look at its source code. But I think it's the wrong approach to add feature support ourselves. There are a lot of other useful features, for example complex assertions, conditional subexpressions, etc., that could be used. Then anyone who wants them would have to implement them themselves.




In https://reviews.llvm.org/D22353#536766, @eklepilkina wrote:

> In https://reviews.llvm.org/D22353#535691, @alexfh wrote:
>
> > Why not add more words to check lines to make them more strict?
>
>
> This patch adds
>
>   // CHECK-WORD:
>   // CHECK-WORD-NEXT:
>   // CHECK-WORD-SAME:
>   // CHECK-WORD-DAG:
>   // CHECK-WORD-NOT:
>   
>
> What words do you suggest to add?


I'm suggesting making the patterns used in CHECK lines stricter by adding more text to them, instead of allowing matches on word boundaries via the new CHECK directives. E.g. the line you want to check is

  op1;

So instead of using `CHECK-WORD: op1`, you could make the pattern stricter by adding more context: `CHECK: {{^}}op1;{{$}}`. Do you have specific examples of output and patterns where this solution is undesirable for some reason?
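
To illustrate the difference between a plain substring match, a word-boundary match, and a fully anchored match, here is a rough sketch using Python's `re` module. Python is used purely for illustration: FileCheck uses LLVM's own regex library, which (per this thread) lacks `\b`. The sample lines and the `matching` helper are made up for the example.

```python
import re

# Hypothetical output lines; only "op1;" is the line we actually want to match.
lines = ["op1;", "op10;", "xop1;", "call op1;"]

def matching(pattern):
    """Return the lines in which `pattern` is found anywhere."""
    return [line for line in lines if re.search(pattern, line)]

plain = matching(r"op1")        # substring match, as with `CHECK: op1`
word = matching(r"\bop1\b")     # word-boundary match, roughly what CHECK-WORD aims for
anchored = matching(r"^op1;$")  # full-line match, like `CHECK: {{^}}op1;{{$}}`
```

In this sketch the substring pattern matches all four lines, the word-boundary pattern still matches `call op1;`, and only the anchored pattern is limited to the exact line `op1;`.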

Overall, I think the FileCheck utility is already confusing enough, and many developers attribute some magic properties to it (that it can somehow implicitly connect the position of a CHECK line to the position of the relevant line in the input, etc.). Adding more features like CHECK-WORD only makes this worse.

> > So is it feasible to add support for \b to LLVM regexps?
>
>
> The regex library is very old. I asked several times about replacing it, but I haven't gotten any answer. This library lacks a lot of other regex features that could be useful.




In https://reviews.llvm.org/D22353#545957, @eklepilkina wrote:

> There is a problem with the STL: its implementation of regular expressions doesn't match the start and end of a line in basic mode, and there is no multiline mode. See issue 2343 in http://cplusplus.github.io/LWG/lwg-toc.html.
>  It is possible to split the text into lines and match each line separately, but I think this is a hack, and I think it'll slow FileCheck down.


Not sure whether adding \b support to the regex library used in LLVM is more effort, but there's another option here: take the libc++ regex header with the multiline option added (as per http://cplusplus.github.io/LWG/lwg-active.html#2503) and use a private copy of it (with proper namespace modifications and whatever else is needed to make it actually private). WDYT?
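
For reference, the semantic difference between the line-splitting workaround from the quoted comment and a real multiline mode can be sketched as follows. This uses Python's `re` module only because it has a multiline flag; it says nothing about how `std::regex` or a private libc++ copy would behave, and the sample text is invented.

```python
import re

# Hypothetical input buffer with three lines.
text = "op0;\nop1;\nop10;\n"
pattern = r"^op1;$"

# Workaround from the quoted comment: split the text and match each line separately.
per_line = [i for i, line in enumerate(text.splitlines()) if re.search(pattern, line)]

# Multiline mode: ^ and $ match at every line boundary within the single buffer.
multiline = [m.group(0) for m in re.finditer(pattern, text, re.MULTILINE)]

# Without multiline mode, ^ only matches at the very start of the buffer.
whole_buffer = re.search(pattern, text)
```

Here both the per-line loop and the multiline search find the second line, while the non-multiline search over the whole buffer finds nothing.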


https://reviews.llvm.org/D22353




