[cfe-dev] Adding more analysis algorithms to Clang?
longguoping at gmail.com
Thu Oct 6 15:23:44 PDT 2011
Hi, Ted, Anna, Arjun, etc
Thanks you all for your comments. Now I think at least I know the
direction on how to improve the quality of the code. I shall fix these
issues in my next patch.
2011/10/6 Ted Kremenek <kremenek at apple.com>
> Hi Guoping,
> First, I appreciate your interest in pushing this code back to the open
> source tree. I want to start with some high-level issues, as I think there
> needs to be some significant restructuring before we can really get into the
> My main concern right now is the factoring of the code. None of this
> analysis logic should be added to the CFG class itself, but instead built on
> top of it. For example, the LiveVariables analysis builds on top of the
> CFG; it doesn't embed information inside it. Instead, we should see a
> separate API entry point that, given a CFG, returns the analysis results.
> Put another way, I should never expect to see analysis logic, like
> dominators, in CFG.h and CFG.cpp.
> There are a variety of reasons for this. The first is modularity and
> purity. The CFG is just in the business of modeling control-flow. No more,
> and no less. That makes it API contract easy (or easier) to understand.
> The second is just good software engineering. If we add 10 new analyses,
> do they need to get embedded into the CFG? Besides cluttering the API, that
> potentially means that some clients of the CFG need to pay the cost of
> having that logic their even if they don't use. For example, the compiler
> itself won't use the dominance analysis in the CFG, but if the analysis gets
> buried in the CFG API then it potentially must get linked in. Moreover, I
> should be able to extend the set of analyses possibly by providing new
> libraries or plugins; requiring that they are buried in the CFG doesn't
> allow this flexibility.
> Before I am really going to review this patch in detail, you will need to
> split your logic out of the CFG, and provide new classes and entry points
> for clients to use. That's absolutely a prerequisite before it can get
> checked in.
> My other concern is that there is no testing logic. We absolutely *must*
> have a way to test these analyses. We have very high quality standards in
> Clang, both in terms of code quality but also in terms of functionality.
> Without any tests, there is no reason to assume that these analyses really
> work at all.
> What I propose is that you have similar testing to what we do with the
> LiveVariables analysis. There is a static analysis check,
> LiveVariablesDumper (lib/StaticAnalyzer/Checkers/DebugCheckers.cpp) that
> simply dumps out the live variables information for a CFG. That information
> can then be regression tested using FileCheck (there are many examples in
> the "test" directory of using FileCheck to examine output from a tool).
> While I'm not proposing that your analyses be part of the static analyzer
> proper, we can use the analyzer to test your analyses in this way. What I
> would then expect would be a set of tests for each of your analyses that
> regression tested the expected results of the analysis. Providing a way to
> dump these analysis results is not only good for regression testing, but
> also important just for debugging the behavior of these analyses in general.
> The regression testing is also a prerequisite for this code getting checked
> into the Clang repository.
> On Oct 5, 2011, at 3:44 PM, Guoping Long wrote:
> Thanks Arjun for your feedback. I updated these files (attached).
> Ted, I am looking forward to your comments.
> 2011/10/5 Arjun Singri <arjunsingri at gmail.com>
>> 1. Use "unsigned int" instead of just "unsigned"
>> 2. Shouldnt "CFGStack.pop_back_val(); " on line 45 be outside the
>> "if (AllVisited)" statement?
>> 3. The check on line 76 is unnecessary. If you are at line 76 then the
>> check at line 71 was false which means that the check at 76 is true. So I
>> guess replace "else if" with "else".
>> 4. typo on line 52
>> On Tue, Oct 4, 2011 at 8:34 PM, Guoping Long <longguoping at gmail.com>wrote:
>>> Dear Ted
>>> I added two control flow analysis algorithms: One builds the dominance
>>> tree for CFG. The other one finds all SCC components within the CFG.
>>> Attached is the patch. I integrated these algorithms within the CFG and
>>> CFGBlock classes. This is the first time I submit something, so I have some
>>> 1 While I would really love to contribute, I am not sure if the code
>>> quality is enough. Please let me know any problems with the code so I shall
>>> fix them.
>>> 2 These algorithms shall work on any CFG. I am not sure what kind of
>>> special test cases should be added. I do wrote an example (in
>>> tools/clang/examples) to illustrate how to use these two algorithms. Is it
>>> necessary to submit the example into the examples/ directory?
>>> 3 I tried my best to follow the code standards and comments. Also, in the
>>> comments, I listed the key reference papers which described thorough details
>>> of these algorithms. They are the state of the art algorithms as far as I
>>> know, and very efficient, although I am not sure my implementation is good
>>> or not.
>>> I am looking forward to feedback. I shall fix problems until it meets the
>>> quality for actual commit.
>>> 2011/10/3 Ted Kremenek <kremenek at apple.com>
>>>> Hi Guoping,
>>>> Additions to libAnalysis have largely been demand-driven. I think we
>>>> are fine with general contributions to this library that meeting the
>>>> following criteria:
>>>> (1) Meet the Clang coding conventions and quality expectations.
>>>> (2) Are testable ("make test"), and tests are included in clang/test.
>>>> (3) Performance is acceptable and measurable, or at least documented
>>>> when algorithms are known to be expensive.
>>>> (4) APIs are well-documented, which can come in the form of good doxygen
>>>> Source-to-source transformations are welcome, and the Clang codebase
>>>> does contains tools that are source-to-source rewriters. For example,
>>>> libRewrite provides low-level functionality for doing textual rewrites of
>>>> source files using Clang SourceLocations. Other analysis algorithms are
>>>> useful as well; we just prefer that they can be regression tested, and the
>>>> APIs are well-documented (especially if they don't have any real clients in
>>>> the codebase at present).
>>>> On Oct 3, 2011, at 2:40 PM, Guoping Long wrote:
>>>> > Hi,
>>>> > There are some program analysis algorithms in clang/lib/Analysis
>>>> directory. I am wandering if there are more algorithms to come or not. Are
>>>> there some people working on this? What kind of algorithms are welcome to be
>>>> part of Clang?
>>>> > I am asking this because I want to help on this part. But I am not
>>>> sure if source-to-source transformation algorithms are welcome in this
>>>> community, since many optimization work is done in the LLVM backend.
>>>> Currently I have implemented to CFG analysis algorithms (building the
>>>> dominance tree and finding the strong connected components in CFG). I plan
>>>> to implement more data flow analysis or even pointer analysis algorithms. If
>>>> these algorithms are useful to people in this community, I would love to
>>>> submit a patch.
>>>> > I am new to Clang and this community. Hopefully these questions are
>>>> not naive to you.
>>>> > Thanks.
>>>> > --------
>>>> > Guoping
>>>> > _______________________________________________
>>>> > cfe-dev mailing list
>>>> > cfe-dev at cs.uiuc.edu
>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the cfe-dev