[cfe-dev] Adding more analysis algorithms to Clang?

Guoping Long longguoping at gmail.com
Thu Oct 13 21:03:59 PDT 2011


Hello

   I re-engineered the dominance algorithm. Attached are the cpp/h files and
a test input file. There are still several things:
A. This implementation is done based on various comments pointed out by Ted,
Anna, Arjun, et al. While I tried my best to provide a much better version
this time, it still may not meet the quality to be a part of Clang. Please
let me know if there are any issues on the code. I shall continue to improve
it until it's good enough.

B. The dominance algorithm need two basic things. One is the post order
iterator of a general graph. As Anna pointed out, Clang already has this
facility. So I use it directly to traverse the CFG nodes in reverse post
order. The other one is the LLVM's implementation of DominatorTree. This
implementation is different from the one in VMCore on two aspects: 1 The
dominator implementation in LLVM VMCore is closely coupled with the LLVM IR
infrastructure (basic blocks, instructions, etc). The attached
implementation mainly suits the CFG graph of Clang. 2. The LLVM
implementation used the classical Lengauer-Tarjan algorithm, published in
1979. The attached implementation is based on an improved 2001 paper. For
more details please refer to the comments of the code.

C. Shall I also provide the test code for the algorithm? While I know I need
to test code, but I am not sure the standard procedures. Please let me know.
Thanks.

--------
Guoping

2011/10/12 Anna Zaks <ganna at apple.com>

>
> On Oct 12, 2011, at 3:05 PM, Guoping Long wrote:
>
> Hi, Anna
>
>    You are right. Since LLVM already has dominatorTree implementation, it's
> not very interesting to build the wheel again.
>
> LLVM CFG and Clang front end CFG are different. I am not sure if the LLVM
> dominator tree can be integrated to work with Clang CFG (most likely not).
> Though, it could be worthwhile to check how it is implemented, both from
> algorithmic as well as API design points of view.
>
> I think bringing those algorithms into Clang would be useful.
>
> Cheers,
> Anna.
>
> ------
> Guoping
>
> 2011/10/6 Anna Zaks <ganna at apple.com>
>
>> Hi Guoping,
>>
>> Here are some comments mostly based on review of buildDominanceTree().
>>
>> This should be extensively tested. I think unit
>> tests(clang/test/unittests) could be a good fit in this case. Please, make
>> tests a part of your next patch.
>>
>> It would be nice if you could provide more expressive API for users of the
>> algorithms. For example, provide reverse post order iterator which can be
>> used by multiple clients instead of using the RPOQueue local variable to
>> store the results.  Similarly for the dominator info, it would be nice to
>> add some API which allows querying if one block dominates another one.. LLVM
>> already has these algorithms implemented. I am not sure if some reuse is
>> possible or not:
>> http://llvm.org/docs/doxygen/html/PostOrderIterator_8h_source.html
>> http://llvm.org/docs/doxygen/html/classllvm_1_1DominatorTree.html
>>
>> Other comments:
>> - A single patch file would be easier to review.
>>
>>  - If the following if statement needed? (Can a block have null
>> successors?)
>> +      if(CFGBlock *B = *I) {
>>
>>  - Is VisitVector needed? Maybe you could use the BlockRPO of the
>> successors instead.
>>
>>  - Shouldn't Dominator fields be initialized to 0?
>>
>>  - 2 components/functions should be pulled out of buildDominanceTree():
>> buildReversePostorder(), intersect().
>>
>>  - I think it would be cleaner to include the paper references as part of
>> method comments/not fields comments.
>>
>>  - "That is, dead code CFG blocks are ignored by this algorithm" - I'd
>> remove this comment. The first sentence is clear enough. (I think you are
>> talking about unreachable code here, and dominance analysis are not powerful
>> enough to find all unreachable code.)
>>
>> Microscopic style issues:
>>    - Comments should use proper punctuation.
>>
>>    - You need to insert a space after a comma and before {. For example:
>>         RPOQueue.push_back(item,BlkBVC); ->  RPOQueue.push_back(item,
>> BlkBVC);
>>         if(!VisitVector[BlockID]){ -> if (!VisitVector[BlockID]) {
>>
>>    - CFGBlock * getSCCRoot() -> CFGBlock *getSCCRoot()
>>
>> Cheers,
>> Anna.
>>
>> On Oct 5, 2011, at 3:44 PM, Guoping Long wrote:
>>
>> Thanks Arjun for your feedback. I updated these files (attached).
>> Ted, I am looking forward to your comments.
>>
>> ------
>> Guoping
>>
>> 2011/10/5 Arjun Singri <arjunsingri at gmail.com>
>>
>>> 1. Use "unsigned int" instead of just "unsigned"
>>> 2. Shouldnt       "CFGStack.pop_back_val(); " on line 45 be outside the
>>> "if (AllVisited)" statement?
>>> 3. The check on line 76 is unnecessary. If you are at line 76 then the
>>> check at line 71 was false which means that the check at 76 is true. So I
>>> guess replace "else if" with "else".
>>> 4. typo on line 52
>>>
>>> Arjun
>>>
>>> On Tue, Oct 4, 2011 at 8:34 PM, Guoping Long <longguoping at gmail.com>wrote:
>>>
>>>> Dear Ted
>>>>
>>>>    I added two control flow analysis algorithms: One builds the
>>>> dominance tree for CFG. The other one finds all SCC components within the
>>>> CFG. Attached is the patch. I integrated these algorithms within the CFG and
>>>> CFGBlock classes. This is the first time I submit something, so I have some
>>>> questions:
>>>>
>>>> 1 While I would really love to contribute, I am not sure if the code
>>>> quality is enough. Please let me know any problems with the code so I shall
>>>> fix them.
>>>> 2 These algorithms shall work on any CFG. I am not sure what kind of
>>>> special test cases should be added. I do wrote an example (in
>>>> tools/clang/examples) to illustrate how to use these two algorithms. Is it
>>>> necessary to submit the example into the examples/ directory?
>>>> 3 I tried my best to follow the code standards and comments. Also, in
>>>> the comments, I listed the key reference papers which described thorough
>>>> details of these algorithms. They are the state of the art algorithms as far
>>>> as I know, and very efficient, although I am not sure my implementation is
>>>> good or not.
>>>>
>>>> I am looking forward to feedback. I shall fix problems until it meets
>>>> the quality for actual commit.
>>>> Thanks.
>>>>
>>>> ------------
>>>> Guoping
>>>>
>>>>
>>>> 2011/10/3 Ted Kremenek <kremenek at apple.com>
>>>>
>>>>> Hi Guoping,
>>>>>
>>>>> Additions to libAnalysis have largely been demand-driven.  I think we
>>>>> are fine with general contributions to this library that meeting the
>>>>> following criteria:
>>>>>
>>>>> (1) Meet the Clang coding conventions and quality expectations.
>>>>>
>>>>> (2) Are testable ("make test"), and tests are included in clang/test.
>>>>>
>>>>> (3) Performance is acceptable and measurable, or at least documented
>>>>> when algorithms are known to be expensive.
>>>>>
>>>>> (4) APIs are well-documented, which can come in the form of good
>>>>> doxygen comments.
>>>>>
>>>>> Source-to-source transformations are welcome, and the Clang codebase
>>>>> does contains tools that are source-to-source rewriters.  For example,
>>>>> libRewrite provides low-level functionality for doing textual rewrites of
>>>>> source files using Clang SourceLocations.  Other analysis algorithms are
>>>>> useful as well; we just prefer that they can be regression tested, and the
>>>>> APIs are well-documented (especially if they don't have any real clients in
>>>>> the codebase at present).
>>>>>
>>>>> Cheers,
>>>>> Ted
>>>>>
>>>>> On Oct 3, 2011, at 2:40 PM, Guoping Long wrote:
>>>>>
>>>>> > Hi,
>>>>> >
>>>>> >    There are some program analysis algorithms in clang/lib/Analysis
>>>>> directory. I am wandering if there are more algorithms to come or not. Are
>>>>> there some people working on this? What kind of algorithms are welcome to be
>>>>> part of Clang?
>>>>> >
>>>>> >    I am asking this because I want to help on this part. But I am not
>>>>> sure if source-to-source transformation algorithms are welcome in this
>>>>> community, since many optimization work is done in the LLVM backend.
>>>>> Currently I have implemented to CFG analysis algorithms (building the
>>>>> dominance tree and finding the strong connected components in CFG). I plan
>>>>> to implement more data flow analysis or even pointer analysis algorithms. If
>>>>> these algorithms are useful to people in this community, I would love to
>>>>> submit a patch.
>>>>> >
>>>>> >   I am new to Clang and this community. Hopefully these questions are
>>>>> not naive to you.
>>>>> >
>>>>> >   Thanks.
>>>>> > --------
>>>>> > Guoping
>>>>> > _______________________________________________
>>>>> > cfe-dev mailing list
>>>>> > cfe-dev at cs.uiuc.edu
>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>
>>>>
>>>
>> <CFG.cpp.patch><CFG.h.patch>
>> _______________________________________________
>>
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111013/ce2d8d72/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Dominators.cpp
Type: text/x-c++src
Size: 7763 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111013/ce2d8d72/attachment.cpp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Dominators.h
Type: text/x-chdr
Size: 2473 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111013/ce2d8d72/attachment.h>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: domtest.c
Type: text/x-csrc
Size: 662 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111013/ce2d8d72/attachment.c>


More information about the cfe-dev mailing list