[cfe-dev] Adding more analysis algorithms to Clang?

Guoping Long longguoping at gmail.com
Mon Oct 17 22:59:27 PDT 2011


Hi,

   Attached is the revised patch for dominance tree analysis. I added
necessary test code. To run the test: clang -cc1 -analyze
-analyzer-checker=debug.DumpDominators input-file.c
Two things:
1 The patch file is generated by running svn diff under the llvm/tools/clang
directory. I do not know why this patch only contains information for
existing file modifications, not including new added files (dominators.cpp
(should be in clang/lib/Analysis/), dominators.h (should be in
clang/include/clang/Analysis/Analyses/), domtest.c (should be in
clang/test/Analysis)). So I included the patch in together with the three
new files as attached.

2. While I tried my best to test the algorithm, I am not sure to what extent
can this process be satisfactory enough. Do I need to add more test inputs
in domtest.c? Or what else should I do to ensure more thorough testing?

   Please let me know if there are any issues. I shall fix them. Thanks.

----
Guoping

2011/10/14 Anna Zaks <ganna at apple.com>

> Hi Guoping,
>
> Yes, you should include a test with your patch. The test should be a part
> of the clang regression tests - it should be automatically tested when we
> run $make test. As for how to approach testing here, see the suggestion made
> by Ted in one of the previous emails to use testing approach similar to how
> we test LiveVariables analysis. This is not going to be trivial but it is
> absolutely necessary.
>
> As for the format of the patch, please see "Making a Patch" in
> http://llvm.org/docs/DeveloperPolicy.html
>
> Thanks!
> Anna.
>
> On Oct 13, 2011, at 9:03 PM, Guoping Long wrote:
>
> Hello
>
>    I re-engineered the dominance algorithm. Attached are the cpp/h files
> and a test input file. There are still several things:
> A. This implementation is done based on various comments pointed out by
> Ted, Anna, Arjun, et al. While I tried my best to provide a much better
> version this time, it still may not meet the quality to be a part of Clang.
> Please let me know if there are any issues on the code. I shall continue to
> improve it until it's good enough.
>
> B. The dominance algorithm need two basic things. One is the post order
> iterator of a general graph. As Anna pointed out, Clang already has this
> facility. So I use it directly to traverse the CFG nodes in reverse post
> order. The other one is the LLVM's implementation of DominatorTree. This
> implementation is different from the one in VMCore on two aspects: 1 The
> dominator implementation in LLVM VMCore is closely coupled with the LLVM IR
> infrastructure (basic blocks, instructions, etc). The attached
> implementation mainly suits the CFG graph of Clang. 2. The LLVM
> implementation used the classical Lengauer-Tarjan algorithm, published in
> 1979. The attached implementation is based on an improved 2001 paper. For
> more details please refer to the comments of the code.
>
> C. Shall I also provide the test code for the algorithm? While I know I
> need to test code, but I am not sure the standard procedures. Please let me
> know.
> Thanks.
>
> --------
> Guoping
>
> 2011/10/12 Anna Zaks <ganna at apple.com>
>
>>
>> On Oct 12, 2011, at 3:05 PM, Guoping Long wrote:
>>
>> Hi, Anna
>>
>>    You are right. Since LLVM already has dominatorTree implementation,
>> it's not very interesting to build the wheel again.
>>
>> LLVM CFG and Clang front end CFG are different. I am not sure if the LLVM
>> dominator tree can be integrated to work with Clang CFG (most likely not).
>> Though, it could be worthwhile to check how it is implemented, both from
>> algorithmic as well as API design points of view.
>>
>> I think bringing those algorithms into Clang would be useful.
>>
>> Cheers,
>> Anna.
>>
>> ------
>> Guoping
>>
>> 2011/10/6 Anna Zaks <ganna at apple.com>
>>
>>>  Hi Guoping,
>>>
>>> Here are some comments mostly based on review of buildDominanceTree().
>>>
>>> This should be extensively tested. I think unit
>>> tests(clang/test/unittests) could be a good fit in this case. Please, make
>>> tests a part of your next patch.
>>>
>>> It would be nice if you could provide more expressive API for users of
>>> the algorithms. For example, provide reverse post order iterator which can
>>> be used by multiple clients instead of using the RPOQueue local variable to
>>> store the results.  Similarly for the dominator info, it would be nice to
>>> add some API which allows querying if one block dominates another one.. LLVM
>>> already has these algorithms implemented. I am not sure if some reuse is
>>> possible or not:
>>> http://llvm.org/docs/doxygen/html/PostOrderIterator_8h_source.html
>>> http://llvm.org/docs/doxygen/html/classllvm_1_1DominatorTree.html
>>>
>>> Other comments:
>>> - A single patch file would be easier to review.
>>>
>>>  - If the following if statement needed? (Can a block have null
>>> successors?)
>>> +      if(CFGBlock *B = *I) {
>>>
>>>  - Is VisitVector needed? Maybe you could use the BlockRPO of the
>>> successors instead.
>>>
>>>  - Shouldn't Dominator fields be initialized to 0?
>>>
>>>  - 2 components/functions should be pulled out of buildDominanceTree():
>>> buildReversePostorder(), intersect().
>>>
>>>  - I think it would be cleaner to include the paper references as part of
>>> method comments/not fields comments.
>>>
>>>  - "That is, dead code CFG blocks are ignored by this algorithm" - I'd
>>> remove this comment. The first sentence is clear enough. (I think you are
>>> talking about unreachable code here, and dominance analysis are not powerful
>>> enough to find all unreachable code.)
>>>
>>> Microscopic style issues:
>>>    - Comments should use proper punctuation.
>>>
>>>    - You need to insert a space after a comma and before {. For example:
>>>
>>>         RPOQueue.push_back(item,BlkBVC); ->  RPOQueue.push_back(item,
>>> BlkBVC);
>>>         if(!VisitVector[BlockID]){ -> if (!VisitVector[BlockID]) {
>>>
>>>    - CFGBlock * getSCCRoot() -> CFGBlock *getSCCRoot()
>>>
>>> Cheers,
>>> Anna.
>>>
>>> On Oct 5, 2011, at 3:44 PM, Guoping Long wrote:
>>>
>>> Thanks Arjun for your feedback. I updated these files (attached).
>>> Ted, I am looking forward to your comments.
>>>
>>> ------
>>> Guoping
>>>
>>> 2011/10/5 Arjun Singri <arjunsingri at gmail.com>
>>>
>>>> 1. Use "unsigned int" instead of just "unsigned"
>>>> 2. Shouldnt       "CFGStack.pop_back_val(); " on line 45 be outside the
>>>> "if (AllVisited)" statement?
>>>> 3. The check on line 76 is unnecessary. If you are at line 76 then the
>>>> check at line 71 was false which means that the check at 76 is true. So I
>>>> guess replace "else if" with "else".
>>>> 4. typo on line 52
>>>>
>>>> Arjun
>>>>
>>>> On Tue, Oct 4, 2011 at 8:34 PM, Guoping Long <longguoping at gmail.com>wrote:
>>>>
>>>>> Dear Ted
>>>>>
>>>>>    I added two control flow analysis algorithms: One builds the
>>>>> dominance tree for CFG. The other one finds all SCC components within the
>>>>> CFG. Attached is the patch. I integrated these algorithms within the CFG and
>>>>> CFGBlock classes. This is the first time I submit something, so I have some
>>>>> questions:
>>>>>
>>>>> 1 While I would really love to contribute, I am not sure if the code
>>>>> quality is enough. Please let me know any problems with the code so I shall
>>>>> fix them.
>>>>> 2 These algorithms shall work on any CFG. I am not sure what kind of
>>>>> special test cases should be added. I do wrote an example (in
>>>>> tools/clang/examples) to illustrate how to use these two algorithms. Is it
>>>>> necessary to submit the example into the examples/ directory?
>>>>> 3 I tried my best to follow the code standards and comments. Also, in
>>>>> the comments, I listed the key reference papers which described thorough
>>>>> details of these algorithms. They are the state of the art algorithms as far
>>>>> as I know, and very efficient, although I am not sure my implementation is
>>>>> good or not.
>>>>>
>>>>> I am looking forward to feedback. I shall fix problems until it meets
>>>>> the quality for actual commit.
>>>>> Thanks.
>>>>>
>>>>> ------------
>>>>> Guoping
>>>>>
>>>>>
>>>>> 2011/10/3 Ted Kremenek <kremenek at apple.com>
>>>>>
>>>>>> Hi Guoping,
>>>>>>
>>>>>> Additions to libAnalysis have largely been demand-driven.  I think we
>>>>>> are fine with general contributions to this library that meeting the
>>>>>> following criteria:
>>>>>>
>>>>>> (1) Meet the Clang coding conventions and quality expectations.
>>>>>>
>>>>>> (2) Are testable ("make test"), and tests are included in clang/test.
>>>>>>
>>>>>> (3) Performance is acceptable and measurable, or at least documented
>>>>>> when algorithms are known to be expensive.
>>>>>>
>>>>>> (4) APIs are well-documented, which can come in the form of good
>>>>>> doxygen comments.
>>>>>>
>>>>>> Source-to-source transformations are welcome, and the Clang codebase
>>>>>> does contains tools that are source-to-source rewriters.  For example,
>>>>>> libRewrite provides low-level functionality for doing textual rewrites of
>>>>>> source files using Clang SourceLocations.  Other analysis algorithms are
>>>>>> useful as well; we just prefer that they can be regression tested, and the
>>>>>> APIs are well-documented (especially if they don't have any real clients in
>>>>>> the codebase at present).
>>>>>>
>>>>>> Cheers,
>>>>>> Ted
>>>>>>
>>>>>> On Oct 3, 2011, at 2:40 PM, Guoping Long wrote:
>>>>>>
>>>>>> > Hi,
>>>>>> >
>>>>>> >    There are some program analysis algorithms in clang/lib/Analysis
>>>>>> directory. I am wandering if there are more algorithms to come or not. Are
>>>>>> there some people working on this? What kind of algorithms are welcome to be
>>>>>> part of Clang?
>>>>>> >
>>>>>> >    I am asking this because I want to help on this part. But I am
>>>>>> not sure if source-to-source transformation algorithms are welcome in this
>>>>>> community, since many optimization work is done in the LLVM backend.
>>>>>> Currently I have implemented to CFG analysis algorithms (building the
>>>>>> dominance tree and finding the strong connected components in CFG). I plan
>>>>>> to implement more data flow analysis or even pointer analysis algorithms. If
>>>>>> these algorithms are useful to people in this community, I would love to
>>>>>> submit a patch.
>>>>>> >
>>>>>> >   I am new to Clang and this community. Hopefully these questions
>>>>>> are not naive to you.
>>>>>> >
>>>>>> >   Thanks.
>>>>>> > --------
>>>>>> > Guoping
>>>>>> > _______________________________________________
>>>>>> > cfe-dev mailing list
>>>>>> > cfe-dev at cs.uiuc.edu
>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> cfe-dev mailing list
>>>>> cfe-dev at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>>
>>>>>
>>>>
>>> <CFG.cpp.patch><CFG.h.patch>
>>> _______________________________________________
>>>
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>
>>>
>>>
>>
>>
> <Dominators.cpp><Dominators.h><domtest.c>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111017/bc99970e/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Dominators.cpp
Type: text/x-c++src
Size: 7763 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111017/bc99970e/attachment.cpp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Dominators.h
Type: text/x-chdr
Size: 2473 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111017/bc99970e/attachment.h>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: domtest.c
Type: text/x-csrc
Size: 662 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111017/bc99970e/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dominators.patch
Type: application/octet-stream
Size: 1990 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20111017/bc99970e/attachment.obj>


More information about the cfe-dev mailing list