[cfe-dev] Adding more analysis algorithms to Clang?

Anna Zaks ganna at apple.com
Fri Oct 14 10:06:11 PDT 2011


Hi Guoping,

Yes, you should include a test with your patch. The test should be a part of the clang regression tests - it should be automatically tested when we run $make test. As for how to approach testing here, see the suggestion made by Ted in one of the previous emails to use testing approach similar to how we test LiveVariables analysis. This is not going to be trivial but it is absolutely necessary.

As for the format of the patch, please see "Making a Patch" in
http://llvm.org/docs/DeveloperPolicy.html

Thanks!
Anna.

On Oct 13, 2011, at 9:03 PM, Guoping Long wrote:

> Hello
> 
>    I re-engineered the dominance algorithm. Attached are the cpp/h files and a test input file. There are still several things:
> A. This implementation is done based on various comments pointed out by Ted, Anna, Arjun, et al. While I tried my best to provide a much better version this time, it still may not meet the quality to be a part of Clang. Please let me know if there are any issues on the code. I shall continue to improve it until it's good enough.
> 
> B. The dominance algorithm need two basic things. One is the post order iterator of a general graph. As Anna pointed out, Clang already has this facility. So I use it directly to traverse the CFG nodes in reverse post order. The other one is the LLVM's implementation of DominatorTree. This implementation is different from the one in VMCore on two aspects: 1 The dominator implementation in LLVM VMCore is closely coupled with the LLVM IR infrastructure (basic blocks, instructions, etc). The attached implementation mainly suits the CFG graph of Clang. 2. The LLVM implementation used the classical Lengauer-Tarjan algorithm, published in 1979. The attached implementation is based on an improved 2001 paper. For more details please refer to the comments of the code.
> 
> C. Shall I also provide the test code for the algorithm? While I know I need to test code, but I am not sure the standard procedures. Please let me know.
> Thanks.
> 
> --------
> Guoping
> 
> 2011/10/12 Anna Zaks <ganna at apple.com>
> 
> On Oct 12, 2011, at 3:05 PM, Guoping Long wrote:
> 
>> Hi, Anna
>> 
>>    You are right. Since LLVM already has dominatorTree implementation, it's not very interesting to build the wheel again.
>> 
> LLVM CFG and Clang front end CFG are different. I am not sure if the LLVM dominator tree can be integrated to work with Clang CFG (most likely not). Though, it could be worthwhile to check how it is implemented, both from algorithmic as well as API design points of view.
> 
> I think bringing those algorithms into Clang would be useful.
> 
> Cheers,
> Anna.
>> ------
>> Guoping
>> 
>> 2011/10/6 Anna Zaks <ganna at apple.com>
>> Hi Guoping,
>> 
>> Here are some comments mostly based on review of buildDominanceTree().
>> 
>> This should be extensively tested. I think unit tests(clang/test/unittests) could be a good fit in this case. Please, make tests a part of your next patch.
>> 
>> It would be nice if you could provide more expressive API for users of the algorithms. For example, provide reverse post order iterator which can be used by multiple clients instead of using the RPOQueue local variable to store the results.  Similarly for the dominator info, it would be nice to add some API which allows querying if one block dominates another one.. LLVM already has these algorithms implemented. I am not sure if some reuse is possible or not:
>> http://llvm.org/docs/doxygen/html/PostOrderIterator_8h_source.html
>> http://llvm.org/docs/doxygen/html/classllvm_1_1DominatorTree.html
>> 
>> Other comments:
>> - A single patch file would be easier to review.
>> 
>>  - If the following if statement needed? (Can a block have null successors?)
>> +      if(CFGBlock *B = *I) {
>> 
>>  - Is VisitVector needed? Maybe you could use the BlockRPO of the successors instead.
>> 
>>  - Shouldn't Dominator fields be initialized to 0?
>> 
>>  - 2 components/functions should be pulled out of buildDominanceTree(): buildReversePostorder(), intersect().
>> 
>>  - I think it would be cleaner to include the paper references as part of method comments/not fields comments.
>> 
>>  - "That is, dead code CFG blocks are ignored by this algorithm" - I'd remove this comment. The first sentence is clear enough. (I think you are talking about unreachable code here, and dominance analysis are not powerful enough to find all unreachable code.)
>> 
>> Microscopic style issues:
>>    - Comments should use proper punctuation.
>> 
>>    - You need to insert a space after a comma and before {. For example:  
>>         RPOQueue.push_back(item,BlkBVC); ->  RPOQueue.push_back(item, BlkBVC);
>>         if(!VisitVector[BlockID]){ -> if (!VisitVector[BlockID]) {
>> 
>>    - CFGBlock * getSCCRoot() -> CFGBlock *getSCCRoot()
>> 
>> Cheers,
>> Anna.
>> 
>> On Oct 5, 2011, at 3:44 PM, Guoping Long wrote:
>> 
>>> Thanks Arjun for your feedback. I updated these files (attached).
>>> Ted, I am looking forward to your comments.
>>> 
>>> ------
>>> Guoping
>>> 
>>> 2011/10/5 Arjun Singri <arjunsingri at gmail.com>
>>> 1. Use "unsigned int" instead of just "unsigned"
>>> 2. Shouldnt       "CFGStack.pop_back_val(); " on line 45 be outside the "if (AllVisited)" statement?
>>> 3. The check on line 76 is unnecessary. If you are at line 76 then the check at line 71 was false which means that the check at 76 is true. So I guess replace "else if" with "else".
>>> 4. typo on line 52
>>> 
>>> Arjun
>>> 
>>> On Tue, Oct 4, 2011 at 8:34 PM, Guoping Long <longguoping at gmail.com> wrote:
>>> Dear Ted
>>> 
>>>    I added two control flow analysis algorithms: One builds the dominance tree for CFG. The other one finds all SCC components within the CFG. Attached is the patch. I integrated these algorithms within the CFG and CFGBlock classes. This is the first time I submit something, so I have some questions:
>>> 
>>> 1 While I would really love to contribute, I am not sure if the code quality is enough. Please let me know any problems with the code so I shall fix them.
>>> 2 These algorithms shall work on any CFG. I am not sure what kind of special test cases should be added. I do wrote an example (in tools/clang/examples) to illustrate how to use these two algorithms. Is it necessary to submit the example into the examples/ directory?
>>> 3 I tried my best to follow the code standards and comments. Also, in the comments, I listed the key reference papers which described thorough details of these algorithms. They are the state of the art algorithms as far as I know, and very efficient, although I am not sure my implementation is good or not.
>>> 
>>> I am looking forward to feedback. I shall fix problems until it meets the quality for actual commit.
>>> Thanks.
>>> 
>>> ------------
>>> Guoping
>>> 
>>> 
>>> 2011/10/3 Ted Kremenek <kremenek at apple.com>
>>> Hi Guoping,
>>> 
>>> Additions to libAnalysis have largely been demand-driven.  I think we are fine with general contributions to this library that meeting the following criteria:
>>> 
>>> (1) Meet the Clang coding conventions and quality expectations.
>>> 
>>> (2) Are testable ("make test"), and tests are included in clang/test.
>>> 
>>> (3) Performance is acceptable and measurable, or at least documented when algorithms are known to be expensive.
>>> 
>>> (4) APIs are well-documented, which can come in the form of good doxygen comments.
>>> 
>>> Source-to-source transformations are welcome, and the Clang codebase does contains tools that are source-to-source rewriters.  For example, libRewrite provides low-level functionality for doing textual rewrites of source files using Clang SourceLocations.  Other analysis algorithms are useful as well; we just prefer that they can be regression tested, and the APIs are well-documented (especially if they don't have any real clients in the codebase at present).
>>> 
>>> Cheers,
>>> Ted
>>> 
>>> On Oct 3, 2011, at 2:40 PM, Guoping Long wrote:
>>> 
>>> > Hi,
>>> >
>>> >    There are some program analysis algorithms in clang/lib/Analysis directory. I am wandering if there are more algorithms to come or not. Are there some people working on this? What kind of algorithms are welcome to be part of Clang?
>>> >
>>> >    I am asking this because I want to help on this part. But I am not sure if source-to-source transformation algorithms are welcome in this community, since many optimization work is done in the LLVM backend. Currently I have implemented to CFG analysis algorithms (building the dominance tree and finding the strong connected components in CFG). I plan to implement more data flow analysis or even pointer analysis algorithms. If these algorithms are useful to people in this community, I would love to submit a patch.
>>> >
>>> >   I am new to Clang and this community. Hopefully these questions are not naive to you.
>>> >
>>> >   Thanks.
>>> > --------
>>> > Guoping
>>> > _______________________________________________
>>> > cfe-dev mailing list
>>> > cfe-dev at cs.uiuc.edu
>>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>> 
>>> 
>>> 
>>> <CFG.cpp.patch><CFG.h.patch>_______________________________________________
>>> 
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>> 
>> 
> 
> 
> <Dominators.cpp><Dominators.h><domtest.c>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111014/ce92a6af/attachment.html>


More information about the cfe-dev mailing list