[cfe-dev] How to write a matcher?

Stephen Kelly steveire at gmail.com
Tue Mar 11 02:48:39 PDT 2014


I want to write a tool to remove c_str() calls in lines like

 std::cout << s.c_str();

The documentation doesn't really explain what the matchers mean or how to 
use them. 

Nor is there good information on how to write and debug them, or on how 
they need to be arranged (nested, in an allOf(), wrapped with to(), on(), 
callee() etc).

How does one *actually* go about doing it?

Here's my attempt:

My test code is:

 #include <iostream>
 #include <string>

 int main(int, char **)
 {
   std::string s("something");
   std::cout << s << std::endl;
   std::cout << s.c_str() << std::endl;
   return 0;
 }

I used

 clang++ -Xclang -ast-dump -fsyntax-only -fno-color-diagnostics ../main.cpp 

to generate   

 `-FunctionDecl 0x275cae0 <../main.cpp:9:1, line:16:1> main 'int (int, char 
   |-ParmVarDecl 0x275c9a0 <line:9:10> 'int'
   |-ParmVarDecl 0x275ca10 <col:15, col:21> 'char **'
   `-CompoundStmt 0x275f150 <line:10:1, line:16:1>
     |-DeclStmt 0x275d160 <line:11:3, col:23>
     | `-VarDecl 0x275cc30 <col:3, col:22> s 'std::string':'class 
     |   `-ExprWithCleanups 0x275d148 <col:15, col:22> 'std::string':'class 
     |     `-CXXConstructExpr 0x275d108 <col:15, col:22> 
'std::string':'class std::basic_string<char>' 'void (const char *, const 
class std::allocator<char> &)'
     |       |-ImplicitCastExpr 0x275cd20 <col:17> 'const char *' 
     |       | `-StringLiteral 0x275cc88 <col:17> 'const char [4]' lvalue 
     |       `-CXXDefaultArgExpr 0x275d0e0 <<invalid sloc>> 'const class 
std::allocator<char>':'const class std::allocator<char>' lvalue
     |-CXXOperatorCallExpr 0x275e160 <line:12:3, col:26> 
'__ostream_type':'class std::basic_ostream<char>' lvalue
     | |-ImplicitCastExpr 0x275e148 <col:18> '__ostream_type &(*)
(__ostream_type &(*)(__ostream_type &))' <FunctionToPointerDecay>
     | | `-DeclRefExpr 0x275e0c0 <col:18> '__ostream_type &(__ostream_type 
&(*)(__ostream_type &))' lvalue CXXMethod 0x26d0410 'operator<<' 
'__ostream_type &(__ostream_type &(*)(__ostream_type &))'
     | |-CXXOperatorCallExpr 0x275d5c0 <col:3, col:16> 'basic_ostream<char, 
struct std::char_traits<char> >':'class std::basic_ostream<char>' lvalue
     | | |-ImplicitCastExpr 0x275d5a8 <col:13> 'basic_ostream<char, struct 
std::char_traits<char> > &(*)(basic_ostream<char, struct 
std::char_traits<char> > &, const basic_string<char, struct 
std::char_traits<char>, class std::allocator<char> > &)' 
     | | | `-DeclRefExpr 0x275d528 <col:13> 'basic_ostream<char, struct 
std::char_traits<char> > &(basic_ostream<char, struct std::char_traits<char> 
> &, const basic_string<char, struct std::char_traits<char>, class 
std::allocator<char> > &)' lvalue Function 0x2533250 'operator<<' 
'basic_ostream<char, struct std::char_traits<char> > &(basic_ostream<char, 
struct std::char_traits<char> > &, const basic_string<char, struct 
std::char_traits<char>, class std::allocator<char> > &)'
     | | |-DeclRefExpr 0x275d198 <col:3, col:8> 'ostream':'class 
std::basic_ostream<char>' lvalue Var 0x275c3a0 'cout' 'ostream':'class 
     | | `-ImplicitCastExpr 0x275d510 <col:16> 'const basic_string<char, 
struct std::char_traits<char>, class std::allocator<char> >':'const class 
std::basic_string<char>' lvalue <NoOp>
     | |   `-DeclRefExpr 0x275d1d0 <col:16> 'std::string':'class 
std::basic_string<char>' lvalue Var 0x275cc30 's' 'std::string':'class 
     | `-ImplicitCastExpr 0x275e0a8 <col:21, col:26> 'basic_ostream<char, 
struct std::char_traits<char> > &(*)(basic_ostream<char, struct 
std::char_traits<char> > &)' <FunctionToPointerDecay>
     |   `-DeclRefExpr 0x275e068 <col:21, col:26> 'basic_ostream<char, 
struct std::char_traits<char> > &(basic_ostream<char, struct 
std::char_traits<char> > &)' lvalue Function 0x26d4480 'endl' 
'basic_ostream<char, struct std::char_traits<char> > &(basic_ostream<char, 
struct std::char_traits<char> > &)' (FunctionTemplate 0x26b9180 'endl')
     |-CXXOperatorCallExpr 0x275f0c8 <line:13:3, col:34> 
'__ostream_type':'class std::basic_ostream<char>' lvalue
     | |-ImplicitCastExpr 0x275f0b0 <col:26> '__ostream_type &(*)
(__ostream_type &(*)(__ostream_type &))' <FunctionToPointerDecay>
     | | `-DeclRefExpr 0x275f088 <col:26> '__ostream_type &(__ostream_type 
&(*)(__ostream_type &))' lvalue CXXMethod 0x26d0410 'operator<<' 
'__ostream_type &(__ostream_type &(*)(__ostream_type &))'
     | |-CXXOperatorCallExpr 0x275e610 <col:3, col:24> 'basic_ostream<char, 
struct std::char_traits<char> >':'class std::basic_ostream<char>' lvalue
     | | |-ImplicitCastExpr 0x275e5f8 <col:13> 'basic_ostream<char, struct 
std::char_traits<char> > &(*)(basic_ostream<char, struct 
std::char_traits<char> > &, const char *)' <FunctionToPointerDecay>
     | | | `-DeclRefExpr 0x275e578 <col:13> 'basic_ostream<char, struct 
std::char_traits<char> > &(basic_ostream<char, struct std::char_traits<char> 
> &, const char *)' lvalue Function 0x26d9830 'operator<<' 
'basic_ostream<char, struct std::char_traits<char> > &(basic_ostream<char, 
struct std::char_traits<char> > &, const char *)'
     | | |-DeclRefExpr 0x275e1c8 <col:3, col:8> 'ostream':'class 
std::basic_ostream<char>' lvalue Var 0x275c3a0 'cout' 'ostream':'class 
     | | `-CXXMemberCallExpr 0x275e258 <col:16, col:24> 'const char *'
     | |   `-MemberExpr 0x275e228 <col:16, col:18> '<bound member function 
type>' .c_str 0x252c400
     | |     `-ImplicitCastExpr 0x275e280 <col:16> 'const class 
std::basic_string<char>' lvalue <NoOp>
     | |       `-DeclRefExpr 0x275e200 <col:16> 'std::string':'class 
std::basic_string<char>' lvalue Var 0x275cc30 's' 'std::string':'class 
     | `-ImplicitCastExpr 0x275f070 <col:29, col:34> 'basic_ostream<char, 
struct std::char_traits<char> > &(*)(basic_ostream<char, struct 
std::char_traits<char> > &)' <FunctionToPointerDecay>
     |   `-DeclRefExpr 0x275f030 <col:29, col:34> 'basic_ostream<char, 
struct std::char_traits<char> > &(basic_ostream<char, struct 
std::char_traits<char> > &)' lvalue Function 0x26d4480 'endl' 
'basic_ostream<char, struct std::char_traits<char> > &(basic_ostream<char, 
struct std::char_traits<char> > &)' (FunctionTemplate 0x26b9180 'endl')
     `-ReturnStmt 0x275f130 <line:14:3, col:10>
       `-IntegerLiteral 0x275f110 <col:10> 'int' 0

It seems useful for writing a matcher.
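
One way to read the dump when looking for matcher nesting: the indentation
encodes the parent/child structure, so the chain of ancestors of any node
can be recovered mechanically. The following is a minimal sketch of that
idea and not part of any Clang tool. parseDumpLine and pathTo are
hypothetical helper names, and the code assumes that each dump line is
unwrapped (one node per line) and that every nesting level adds two prefix
characters, which is what the output above appears to do.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One parsed line of -ast-dump output: nesting depth and node class name.
struct DumpNode {
  std::size_t Depth;
  std::string Kind;
};

// Parses a single unwrapped dump line such as "| `-DeclRefExpr 0x1 ...".
// Assumption: each nesting level contributes exactly two prefix characters
// ("| ", "  ", "|-" or "`-"), so the column of the kind name gives the depth.
DumpNode parseDumpLine(const std::string &Line) {
  std::size_t Pos = Line.find_first_not_of(" |`-");
  assert(Pos != std::string::npos && "line has no node kind");
  std::size_t End = Line.find(' ', Pos);
  std::size_t Len = (End == std::string::npos) ? std::string::npos : End - Pos;
  return DumpNode{Pos / 2, Line.substr(Pos, Len)};
}

// Returns the node kinds on the path from the outermost node in Lines
// down to Lines[Target], by walking backwards and keeping every line that
// is less deeply nested than the last one kept.
std::vector<std::string> pathTo(const std::vector<std::string> &Lines,
                                std::size_t Target) {
  assert(Target < Lines.size());
  DumpNode Cur = parseDumpLine(Lines[Target]);
  std::vector<std::string> Path{Cur.Kind};
  std::size_t Depth = Cur.Depth;
  for (std::size_t I = Target; I-- > 0;) {
    DumpNode N = parseDumpLine(Lines[I]);
    if (N.Depth < Depth) {
      Path.insert(Path.begin(), N.Kind);
      Depth = N.Depth;
    }
  }
  return Path;
}
```

Read this way, the .c_str MemberExpr in the dump above sits under a
CXXMemberCallExpr, which sits under two nested CXXOperatorCallExpr nodes.
That chain is the structure a matcher would presumably have to describe.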

I modified the removeCStrCalls tool with a debugging matcher, so that I can 
start broad, and incrementally narrow what I want to match, while printing 
out what I have actually matched:

 class Debugging : public ast_matchers::MatchFinder::MatchCallback {
 public:
   Debugging(tooling::Replacements *Replace)
     : Replace(Replace) {}

   virtual void run(const ast_matchers::MatchFinder::MatchResult &Result) {
     // Fetch the node bound to the "match" id and print its source text.
     const Expr *Arg = Result.Nodes.getStmtAs<Expr>("match");
     std::string match = getText(*Result.SourceManager, *Arg);
     std::cout << "MATCH " << match << std::endl;
   }

 private:
   tooling::Replacements *Replace;
 };

Is this the right/sane/only possible approach to the act of writing a 
matcher?

From the -ast-dump, it looks like I need to first match a 
CXXOperatorCallExpr, so I write this matcher:

      id("match", operatorCallExpr()),

which gives me lots of output, mostly from the iostream header. I need to 
get narrower.

In the -ast-dump output there is a CXXMemberCallExpr nested in the 
CXXOperatorCallExpr. I make a logical jump and try to match that:

      id("match", operatorCallExpr(memberCallExpr())),

However, that produces no output, so my logical jump must be incorrect.

What is the correct logical jump here in order to narrow my match? 

How can I determine what logical jumps I can make, by looking at the ast-
dump output? 

How does one go about *actually* writing matcher code? 

What is the understanding one must have, the logic one must have and the 
information one must have to hand (is ast-dump output useful or not)?

To be clear (in case it is not already clear), I'm not looking for a 
finished solution for the match/replacement I'm trying to make. I'm asking 
what to spend time doing between opening my editor and running a tool that 
does what I need. How do I figure out what code I need to write?
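
On the empty result from operatorCallExpr(memberCallExpr()) above, one
plausible explanation is that an inner matcher passed to a node matcher is
applied to the same node rather than to its children, so the expression
asks for a node that is both a CXXOperatorCallExpr and a CXXMemberCallExpr.
Moving to another node seems to require a traversal matcher such as has(),
hasDescendant() or hasArgument(). The following is a toy model of that
behaviour, assuming this reading is right. Node, kind, has and countMatches
are stand-ins written for illustration; they are not Clang's classes or API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an AST node; not Clang's Stmt class.
struct Node {
  std::string Kind;
  std::vector<Node> Children;
};

using Matcher = std::function<bool(const Node &)>;

// Node matcher: checks the kind of the current node, then applies every
// inner matcher to that same node (never to its children).
Matcher kind(std::string K, std::vector<Matcher> Inner = {}) {
  return [K, Inner](const Node &N) {
    if (N.Kind != K)
      return false;
    for (const Matcher &M : Inner)
      if (!M(N))
        return false;
    return true;
  };
}

// Traversal matcher: succeeds if any direct child satisfies Inner.
Matcher has(Matcher Inner) {
  return [Inner](const Node &N) {
    for (const Node &C : N.Children)
      if (Inner(C))
        return true;
    return false;
  };
}

// Tries the matcher on every node of the tree and counts the successes.
int countMatches(const Matcher &M, const Node &Root) {
  int Count = M(Root) ? 1 : 0;
  for (const Node &C : Root.Children)
    Count += countMatches(M, C);
  return Count;
}
```

In this model, kind("CXXOperatorCallExpr", {kind("CXXMemberCallExpr")})
can never match, while wrapping the inner matcher in has() matches an
operator call that has a member call as a direct child. If the real
matchers behave the same way, something along the lines of
operatorCallExpr(hasArgument(1, memberCallExpr())) might be the narrowing
step for the test code; that expression is untested here.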

Note: I have read the docs. The section at


shows some incremental steps, but does not explain how they are chosen. 

How does one know to go from 




for example? That is the kind of information I am looking for.


