[cfe-dev] ASTMatchers: isVirtual and isOverride

Pedro Delgado Perez pedro.delgado at uca.es
Thu May 16 03:30:51 PDT 2013


 
Hi,
> You'll have one process(Decl*) method that does all the common stuff, and overloads for more specific types as needed, e.g. process(FunctionDecl*), which then call the Decl* overload by explicit casting. Have overload resolution work for you.
Ok, I think the best in my case is to have a process method per matcher, so don't take care of this anymore.

Now, I have other two or three questions spinning in my mind.

1. Imagine that I have two files to process in the same execution and both include the same header file. Will that header file's code be processed twice? In this case, how could I avoid this?

2. A matcher finds a CXXConstructorDecl node in a header file (we can call it X.h):
-first: I need to inform the cpp files that include X.h that the header has changed. How can I find that files? Up to now, I was only proccesing one file and searched for the file through Context->getSourceManager().getMainFileID() (That I suppose is the file which have the main method implemented, but I'm not absolutely sure). But now, I need to find another way to find these cpp files. 

3. Is there a way to stop the traversal of the AST in a certain moment when, for instance, I find what I was looking for?

Thanks in advance,

Pedro.


El dia 14 may 2013 21:13, Gábor Kozár <kozargabor at gmail.com> escribió:
> 
> I don't see the problem.
> You'll have one process(Decl*) method that does all the common stuff, and overloads for more specific types as needed, e.g. process(FunctionDecl*), which then call the Decl* overload by explicit casting. Have overload resolution work for you.
> Gabor
> 
> 
> 
> 2013/5/14 Pedro Delgado Perez <pedro.delgado at uca.es [ mailto:pedro.delgado at uca.es ]>
> 
> 
> Ok, what I'm trying to do is not to repeat the same code again and again in different methods (process_matcher1(), process_matcher2()...) For example, matcher1 and matcher2 are treated in the same way except in a little part.  At first, I thought of having a Decl* variable and casting the node result, but I'm not sure if this is a good option. Later, for instance, I need to use the getPreviousDecl() method that you told me and then I'll have to know the class type (or maybe I can cast to Decl* again).
> 
> Pedro.
> 
> El dia 14 may 2013 20:15, Gábor Kozár <kozargabor at gmail.com [ mailto:kozargabor at gmail.com ]> escribió:
> 
> 
> 
> Hi,
> > if (Result.Nodes.getNodeAs<clang::CXXConstructorDecl>("matcher1")) {
> 
> 
> I usually do this:
> 
> if(const CXXConstructorDecl* ctor = Result.Nodes.getNodeAs<CXXConstructorDecl>("matcher1")) { ... }
> > template<typename T>
> void process(const T* Node) {
> 
> 
> I would recommend against this. Template metaprogramming gives you no advantages here so far as I can see. Use overloading and polymorphism instead.
> Gabor
> 
> 
> 
> 2013/5/14 Pedro Delgado Perez <pedro.delgado at uca.es [ mailto:pedro.delgado at uca.es ]>
> 
> 
> Hi,
> 
> 
> Neither :) Just create an object in main(), and hand a pointer to that object to your callback - then call stuff on it from the match callback.
> 
Ok, ok, that was the problem.


> 1. I create a MatchFinder.
> 2. Then, I add two different matchers to the same callback.
> 3. I call run. 
>
> My questions are:
> 1. How many times is called the run method? Every time the MatchFinder finds a node to match?


Yes, each time either matcher accepts a node.

> 2. The object callback I add to MatchFinder is created more than once? When it is deleted?


You (the user) are the one responsible for creating and destroying the callback object. Usually what we do is allocate it on the stack (and pass it the appropriate arguments, of course), and then pass its address to MatchFinder. MatchFinder doesn't create, copy or delete your callback object, so it is safe to store state in the callback.

> Could you explain better "if the second matcher has no deps on the information collected by matcher1"?


deps = dependencies



Gabor
Thanks for your answers Gabor.

I have decided to follow up the advice of Manuel and I have created a MatchFinder object with all the matchers and a single MatchCallback object. So now, I'm in my run() method and I want to know which matcher was bound. So I think the only way to know this is to ask:

if (Result.Nodes.getNodeAs<clang::CXXConstructorDecl>("matcher1")){
...
}
else if(Result.Nodes.getNodeAs<clang::CXXRecordDecl>("matcher2")){
... 
}

Notice that I'm retrieving different classes of nodes (clang::CXXConstructorDecl, clang::CXXRecordDecl...) In spite of this fact, I want to process all the nodes in the same way except from a subtle difference and I was asking myself if there is a better way I can do something like this:

if (Result.Nodes.getNodeAs<clang::CXXConstructorDecl>("matcher1")){
 ...
 process(Result.Nodes.getNodeAs<clang::CXXConstructorDecl>("matcher1"))

}
else if(Result.Nodes.getNodeAs<clang::CXXRecordDecl>("matcher2")){
 ...
 process(Result.Nodes.getNodeAs<clang::CXXRecordDecl>("matcher2"))
}
...


template<typename T>
void process(const T* Node){
...
}

Regards,

Pedro.


El dia 10 may 2013 09:21, Gábor Kozár <kozargabor at gmail.com [ mailto:kozargabor at gmail.com ]> escribió:



Hi,
> 1. I create a MatchFinder.
> 2. Then, I add two different matchers to the same callback.
> 3. I call run. 
>
> My questions are:
> 1. How many times is called the run method? Every time the MatchFinder finds a node to match?


Yes, each time either matcher accepts a node.

> 2. The object callback I add to MatchFinder is created more than once? When it is deleted?


You (the user) are the one responsible for creating and destroying the callback object. Usually what we do is allocate it on the stack (and pass it the appropriate arguments, of course), and then pass its address to MatchFinder. MatchFinder doesn't create, copy or delete your callback object, so it is safe to store state in the callback.

> Could you explain better "if the second matcher has no deps on the information collected by matcher1"?


deps = dependencies



Gabor



2013/5/8 Pedro Delgado Perez <pedro.delgadoperez at mail.uca.es [ mailto:pedro.delgadoperez at mail.uca.es ]>


Hi,

Thanks for all your support. I'm going to respond all your contributions little by little:


You can execute Matcher1 first, and when it returns, you start executing Matcher2. I'm not sure what your problem is exactly. (Of course, this will also mean that you will traverse the whole AST twice, which can potentially be very expensive.)

Ok. What I don't understand quite well is the order in the execution.

1. I create a MatchFinder.
2. Then, I add two different matchers to the same callback.
3. I call run. 

My questions are:
1. How many times is called the run method? Every time the MatchFinder finds a node to match?
2. The object callback I add to MatchFinder is created more than once? When it is deleted? The problem is that I create the object callback with an argument (a constructor method with an argument, not the default constructor) and store it in a variable member. However, when the run method executes, the value of the variable member is lost and I can't understand the problem.



Sorry, missed the comment in one of your more recent emails. So you basically want all matches by matcher1 to be found before all matchers by matcher2. I think the easiest way to do this is just have two MatchFinder classes. Add matcher1 and its callback to the first MatchFinder and call run(). Add matcher2 and its callback to the second MatchFinder and call run() after the first run() returns. You won't have to reparse the file but you will have to traverse the AST twice. There's not much you can do about this given the order you want. However, if the second matcher has no deps on the information collected by matcher1 you could just get away with a single MatchFinder and two callbacks that just store their finds in two vectors which you process after run returns

I'm not sure what order you're interested in but you can create two match callbacks one to use for each matcher. The callbacks will be called in the order the nodes are found in the tree by doing a depth-first pre-order traversal as is usual for RecursiveASTVisitor. If both matchers match a given node, the callbacks are called in the order they are registered. 

What sort of order are you looking for? 

I understand what you have explained until you say "two callbacks". Why could I want to have two different callbacks? Could you explain better "if the second matcher has no deps on the information collected by matcher1"?



What is often faster is to produce a superset of the information needed during the traversal / match phase, and then combine / filter the results after everything's done...
That is a good solution, I suppose. I will have to study my case more in detail to determine whether this fit my problem or not. But, do you store all that information in global variables or in the callback object?



.

You can just count how many times your match callback is called and then print that number after run() returns. 

Again, my problem is that I don't $when the match callback is called. 

I hope I can end up getting the usage of all this with your help.

Pedro.


El dia 08 may 2013 12:10, Gábor Kozár <kozargabor at gmail.com [ mailto:kozargabor at gmail.com ]> escribió:

Hi,


> Ok, this also helps, but the problem is that I need to print something when all the nodes have been matched. So, is there a way I can check within onStartOfTranslationUnit() method if we have reached the end? I have just found this:


onStartOfTranslationUnit() is - obviously - called at the start of each TU. I.e. this will be called after the previous TU has been completed, i.e. all nodes in it have been matched.

> clang::ast_matchers::MatchFinder::ParsingDoneTestCallback


This is for testing purposes only, it's not recommended to use it.

> So, what you are saying is that I can't control the order the nodes are matched?


No, you can't. You'll have to implement a RecursiveASTVisitor manually if you need this fine control.

> What I'm trying to explain is that I need all the nodes bound with Matcher1 are treated before the execution of nodes bound with Matcher2 starts.


You can execute Matcher1 first, and when it returns, you start executing Matcher2. I'm not sure what your problem is exactly. (Of course, this will also mean that you will traverse the whole AST twice, which can potentially be very expensive.)


Gabor






2013/5/8 Pedro Delgado Perez <pedro.delgadoperez at mail.uca.es [ mailto:pedro.delgadoperez at mail.uca.es ]>



Hi,


You collect information about the matches into e.g. a vector, and you can use the MatchCallback's onStartOfTranslationUnit to process the previous TU's matches.

Ok, this also helps, but the problem is that I need to print something when all the nodes have been matched. So, is there a way I can check within onStartOfTranslationUnit() method if we have reached the end? I have just found this:

clang::ast_matchers::MatchFinder::ParsingDoneTestCallback

This is the implementation of the class:

class ParsingDoneTestCallback [ http://fossies.org/dox/clang-3.2.src/classclang_1_1ast__matchers_1_1MatchFinder_1_1ParsingDoneTestCallback.html ] {

 public:
 virtual ~ParsingDoneTestCallback [ http://fossies.org/dox/clang-3.2.src/classclang_1_1ast__matchers_1_1MatchFinder_1_1ParsingDoneTestCallback.html#ab6018406e1835b0dcc7fa982e0836a55 ]();
 virtual void run [ http://fossies.org/dox/clang-3.2.src/classclang_1_1ast__matchers_1_1MatchFinder_1_1ParsingDoneTestCallback.html#a818b643aaad117bb86bcda80bba32ceb ]() = 0;
};

Maybe I could redefine the run method. Do you think this may be my solution?


So the matchers run regardless of whether you ever access the bound nodes. What this means is that your run() method will be called for every match, with the appropriate nodes bound to the names you defined. So a MatchResult only contains information about one single match (i.e. a subtree of the AST, if you will). Hope this clears things up.
Um... I'm a bit mixed up at this moment. So, what you are saying is that I can't control the order the nodes are matched? For me that would be a problem because I need to keep an order in the execution. What I'm trying to explain is that I need all the nodes bound with Matcher1 are treated before the execution of nodes bound with Matcher2 starts.

Thanks,

Pedro.

El dia 07 may 2013 22:10, Gábor Kozár <kozargabor at gmail.com [ mailto:kozargabor at gmail.com ]> escribió:



Hi,

>1. Imagine that I need to print something after all the nodes have been matched. For example, the number of nodes matched. How can I do that? 



You collect information about the matches into e.g. a vector, and you can use the MatchCallback's onStartOfTranslationUnit to process the previous TU's matches.


> How does the 'run' method behave in this case? I mean that I don't know if it retrieves, one by one, all the nodes in applyMatch1 and ,after that, all the nodes in applyMatch2 or it matches one in applyMatch1 and then other in applyMatch2 in each iteration.



So the matchers run regardless of whether you ever access the bound nodes. What this means is that your run() method will be called for every match, with the appropriate nodes bound to the names you defined. So a MatchResult only contains information about one single match (i.e. a subtree of the AST, if you will). Hope this clears things up.
Gabor




2013/5/7 Pedro Delgado Perez <pedro.delgadoperez at mail.uca.es [ mailto:pedro.delgadoperez at mail.uca.es ]>


Hi again,

Sorry to make so much questions, but I hope a good structure of my tool will save me a lot of time in future.

I'm using something like this in http://clang.llvm.org/docs/LibASTMatchersTutorial.html [ http://clang.llvm.org/docs/LibASTMatchersTutorial.html ]:

class LoopPrinter : public MatchFinder::MatchCallback {public :  virtual void run(const MatchFinder::MatchResult &Result) {    if (const ForStmt *FS = Result.Nodes.getNodeAs<clang::ForStmt>("forLoop"))      FS->dump();  }};

So, now I have two questions:

1. Imagine that I need to print something after all the nodes have been matched. For example, the number of nodes matched. How can I do that? 

2. Imagine that I have two methods within the run method to separate two kind of nodes I want to bind. Something like this:




 class LoopPrinter : public MatchFinder::MatchCallback {public :  virtual void run(const MatchFinder::MatchResult &Result) {    applyMatch1();
     applyMatch2();  }

void applyMatch1(){
 if (const ForStmt *FS = Result.Nodes.getNodeAs<clang::ForStmt>("forLoop__1"))
 }

void apply2(){if (const ForStmt *FS = Result.Nodes.getNodeAs<clang::ForStmt>("forLoop_2"))}
};



How does the 'run' method behave in this case? I mean that I don't know if it retrieves, one by one, all the nodes in applyMatch1 and ,after that, all the nodes in applyMatch2 or it matches one in applyMatch1 and then other in applyMatch2 in each iteration. I hope you can understand me because this is very important in my case.

Thanks in advance.

Pedro






El dia 06 may 2013 22:32, "Vane, Edwin" <edwin.vane at intel.com [ mailto:edwin.vane at intel.com ]> escribió:

Given your description, I'm not sure matchers are what you want. If you just want to print information on certain types of nodes, you could use a RecursiveASTVisitor for that.

However, if what you're looking for is a little more complex then matchers may be what you want after all.

As for the two classes, you want to use tooling::MatchFinder as shown in the tutorial. The other is just an implementation detail of the match finding code.

newASTConsumer() is a function that's required to be defined for objects passed to newFrontendActionFactory(). You don't need to implement it. It's implemented by MatchFinder. Again, it's an implementation detail you don't need to worry about at this point.

The use of ASTConsumers is not necessary if you're using MatchFinder and ClangTool as described in the tutorial. MatchFinder is an abstraction around RecursiveASTVisitor so all that stuff in RecursiveASTVisitor you'd normally have to use is actually hidden away.

I think you should first decide which route you want to go: MatchFinder or RecursiveASTVisitor. The first question I'd ask is: how hard is it to find the nodes I want to print info on in the AST. If all I want is every for loop that's easy. If I want for loops within member functions of a specific class, that's hard and an excellent use case for ASTMatchers.



-----Original Message-----
From: cfe-dev-bounces at cs.uiuc.edu [ mailto:cfe-dev-bounces at cs.uiuc.edu ] [mailto:cfe-dev-bounces at cs.uiuc.edu [ mailto:cfe-dev-bounces at cs.uiuc.edu ]] On
Behalf Of Pedro Delgado Perez
Sent: Monday, May 06, 2013 12:58 PM
To: klimek at google.com [ mailto:klimek at google.com ]; kozargabor at gmail.com [ mailto:kozargabor at gmail.com ]
Cc: cfe-dev at cs.uiuc.edu [ mailto:cfe-dev at cs.uiuc.edu ]
Subject: Re: [cfe-dev] ASTMatchers: isVirtual and isOverride

Hi,

I need your help again. Look, in my tool I was trying to use the syntax that's
shown here:

http://clang.llvm.org/docs/LibASTMatchersTutorial.html [ http://clang.llvm.org/docs/LibASTMatchersTutorial.html ]

Namely I'm referring to this part:

int main(int argc, const char **argv) {
CommonOptionsParser OptionsParser(argc, argv);
ClangTool Tool(OptionsParser.getCompilations(),
OptionsParser.getSourcePathList());

LoopPrinter Printer;
MatchFinder Finder;
Finder.addMatcher(LoopMatcher, &Printer);

return Tool.run(newFrontendActionFactory(&Finder));
}

However, now I want to create a object "Printer" with different features
depending on the arguments provided in command line. So I was thinking on
implement a factory method pattern to create a different LoopPrinter object:
class OptionsFactory {
public:
LoopPrinter getOption() {
if(...)
return LoopPrinter(attribute1, attribute2);
else
return LoopPrinter(attribute1);
}
};

I was searching for a better solution and there are some things that I can't
completely understand.

- Why are there two classes MatchFinder:
http://clang.llvm.org/doxygen/classclang_1_1tooling_1_1MatchFinder.html [ http://clang.llvm.org/doxygen/classclang_1_1tooling_1_1MatchFinder.html ]
http://clang.llvm.org/doxygen/classclang_1_1ast__matchers_1_1MatchFinder [ http://clang.llvm.org/doxygen/classclang_1_1ast__matchers_1_1MatchFinder ].
html

- What the method in ast_matchers:MatchFinder

clang::ASTConsumer
<http://clang.llvm.org/doxygen/classclang_1_1ASTConsumer.html> [ http://clang.llvm.org/doxygen/classclang_1_1ASTConsumer.html%3E ] *

newASTConsumer
<http://clang.llvm.org/doxygen/classclang_1_1ast__matchers_1_1MatchFinder [ http://clang.llvm.org/doxygen/classclang_1_1ast__matchers_1_1MatchFinder ]
.html#a4807049e6e39572d19ff127406df3d81> ()

is used for? I see we can 'associate' an ASTConsumer to the Frontend as in:
http://clang.llvm.org/docs/RAVFrontendAction.html [ http://clang.llvm.org/docs/RAVFrontendAction.html ]

but, is this possible using Matchers? Would it have any sense to create an
ASTConsumer in my class OptionsFactory?

I have improved a lot since you last helped me, but clang is too big!

By the way, do you know how to use CommandLine? I posted a new thread

http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-May/029473.html [ http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-May/029473.html ]

If you know how to solve that problem, please, let me know.

Thanks in advance,

Pedro.


El dia 27 abr 2013 18:39, Manuel Klimek <klimek at google.com [ mailto:klimek at google.com ]> escribió:

On Sat, Apr 27, 2013 at 6:36 PM, Gábor Kozár
<kozargabor at gmail.com [ mailto:kozargabor at gmail.com ]> wrote:



2013/4/27 Manuel Klimek <klimek at google.com [ mailto:klimek at google.com ]>


Just use the empty string for binding and getNodeAs :)


That would potentially lead to confusion when there are more
nodes bound, but the programmer forgot to supply the proper name. In my
suggestion, the parameterless getNodeAs would have an assert to check there is
exactly one node bound (and whose name is the default name).

If you put everything behind constants, I think it'll be easy enough to see
what's happening - and that's a generally good strategy anyway, as you get a
compile error if you mistype...
Thus, I think it'd not add enough value to special case the interface.




2013/4/27 Manuel Klimek <klimek at google.com [ mailto:klimek at google.com ]>


On Sat, Apr 27, 2013 at 6:28 PM, Gábor Kozár
<kozargabor at gmail.com [ mailto:kozargabor at gmail.com ]> wrote:

Hi,
2013/4/26 Pedro Delgado Perez
<pedro.delgadoperez at mail.uca.es [ mailto:pedro.delgadoperez at mail.uca.es ]>


Hehehe... I found the problem with
this. I was binding wrongly the matcher! I used a id in the matcher thas was
different from the id in the function that retrieves the nodes... I think this will be
a typical mistake for newbies...

Ah, yes, that happens a lot to me as well. Now
that I think about it, it might be worthwhile adding a parameterless .bind() and
.getNodeAs<T>() for situations where only one node is bound. Should be fairly
trivial, but also not all that useful...

Just use the empty string for binding and getNodeAs :)



2013/4/26 Pedro Delgado Perez
<pedro.delgadoperez at mail.uca.es [ mailto:pedro.delgadoperez at mail.uca.es ]>


Thanks both! Now I can see all this
much clearer and I have the enough knowledge to start out with clang.


You're welcome. Good luck!


2013/4/25 Manuel Klimek <klimek at google.com [ mailto:klimek at google.com ]>


And btw thanks a lot for all the great user support you're giving
here!


Thank you, I'm happy to help. Clang is a great project!

As soon as the university term is over, I'm also planning on trying to
contribute code-wise, mainly to the Static Analyzer but I guess also on just about
anything that catches my attention. :) I'm quite excited - this is going to be the
first open source project I contribute to.
Gabor





-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130516/afdcd028/attachment.html>


More information about the cfe-dev mailing list