[cfe-dev] Constructing use-def chains

Simone Pellegrini spellegrini at dps.uibk.ac.at
Wed Sep 9 04:37:17 PDT 2009


Hi Ted,
Thanks for going through all that messy code, I read the code reviews, I 
definitely agree with them, I will fix it before the end of the week and 
I will send you a better version hopefully next Monday, so we can go 
through another iteration.
Sorry for the 80 columns formatting and white spaces issues, I didn't 
have time to fix it last week and I wanted to send you this code before 
the end of the week... so I try to get lucky :)

I have a couple of questions: Should I provide patch files? or we can 
continue with sending C++ files at this stage? Should I also provide 
some test cases?
cheers, Simone

Ted Kremenek wrote:
>
> On Sep 4, 2009, at 8:43 AM, Simone Pellegrini wrote:
>
>> Hi again,
>> Here it is my current implementation of DefUse chain analysis. I 
>> adjust the code to be compliant with some of the clang rules (I had 
>> some problem in switching from std::map to llvm::DenseMap and I will 
>> try to better investigate this issue in the future).
>>
>> So as this is not a patch, but a complete new feature I don't know 
>> where it should be placed in the source tree (clang/Analysis seems 
>> fine), so I will let u decide. The analysis is composed by 2 files 
>> DefUse.h and DefUse.cpp which represent respectively the interface 
>> and the implementation of the DefUse analysis. The usage of DefUse is 
>> very similar to CFG. You build the all DefUse chains for a specific 
>> piece of code (usually a function body) and then u start querying the 
>> object.
>>
>> #include "DefUse.h"
>> using namespace defuse;
>>
>> clang::FunctionDecl* fd = ...
>> DefUse* DU = DefUse::BuildDefUseChains(fd->getBody(), astCtx);
>>
>> now if we have a reference to a variable inside the function body 
>> (usually a DeclRefExpr) we can query the DefUse object to know if the 
>> VarRef is a "definition" or an "use":
>>
>> DeclRefExpr* DR = ...
>> DU->isDef(DR);
>> DU->isUse(DR);
>>
>> Now if our variable is a "definition" for example we can iterate 
>> among all the uses referred to the definition in the following way:
>> for(DefUse::uses_iterator I = DU->uses_begin(DR), E = DU->uses_end(); 
>> I != E, I++){
>>   DeclRefExpr* U = *I;
>>   ...
>> }
>>
>> If the variable is a use, we can iterate among all the definitions:
>> for(DefUse::defs_iterator I = DU->defs_begin(DR), E = DU->defs_end(); 
>> I != E, I++){
>>  ...
>> }
>> here the problem is that a definition can be either a DeclRefExpr in 
>> situations like:
>> -> a = s; // this is a definition for var. a
>>
>> or a VarDecl in situations like:
>> -> int a; // this is a definition for variable a
>>
>> For these reasons I have introduced a wrapper class called DefUseNode 
>> (the name can be changed) that can contains VarDecls or DeclRefExprs.
>>
>> And that's the basic usage of DefUse.
>>
>> A more advanced use is also possible, I don't want to go into the 
>> details of the algorithm used to compute the defuse chains, but you 
>> can imagine it divided into 2 steps. In the first steps all the 
>> variable references are collected and marked as definitions or uses 
>> depending on the semantic of the instruction and then, in the second 
>> step, this informations are elaborated in order to build the graph.
>> We can discuss about it but It comes out that marking the use of a 
>> variable can depends from the context, and the semantics can be 
>> different. For example if we are calling a function and the variable 
>> is passed as a reference or pointer we have to assume that this is a 
>> definition for the variable, but if we know the semantics of the 
>> function and we are sure the variable is not modified in the function 
>> body we can "risk" a little and mark the variable as an use. For this 
>> reason different semantics can be defined by the user in order to 
>> cover specific cases.
>>
>> This is possible by extending the DefUseBasicHelper. The basic helper 
>> already implements the default semantics (a very conservative one) 
>> but it exposes some virtual methods that can be extended by the user 
>> who wants to implement a different behavior. The helper should be 
>> passed to the BuildDefUseChain method in this way:
>>
>> DefUseCustomHelper helper;
>> DefUse* DU = DefUse::BuildDefUseChains(fd->getBody(), astCtx, &helper);
>> ...
>>
>> At last, I added a class to test the algorithm DefUseChainTest. I 
>> produces a lot of output (badly formatted) but you can easily see how 
>> ( and if :) ) the whole analysis works just by:
>> defuse::DefUseChainTest Consumer(llvm::outs());
>> ...
>> ParseAST(PP, Consumer, *ContextOwner, false);
>>
>> The test class is also meant to be an example of how to use the DefUse.
>>
>> N.B.: As I already stated in my previous emails it is important (when 
>> and if the patch will be committed in the CLANG source tree) the 
>> institution where I am working, the "University of Innsbruck", and if 
>> possible my name (Simone Pellegrini) appear in the CREDITS.TXT file 
>> as contributors for this type of analysis.
>>
>> have fun with DefUses, cheers Simone
>>
>> P.S.: Unfortunately next week I will be on a business trip so not 
>> reachable. I will reply and address your comments/feedbacks as soon 
>> as I am back in the office.
>
> Hi Simone,
>
> Sorry for the delay in replying!  The overall interface for this class 
> looks pretty good.  I have a bunch of comments with regards to the 
> implementation and the coding style that we should iterate on before 
> committing back to the Clang tree.  My comments are inline with the 
> code below.  Feel free to reply whenever you have time.
>
> As for where this code should live, I suggest:
>
> include/clang/Analysis/Analyses/DefUse.h
> lib/Analysis/DefUse.cpp
>
> My comments reflect a first pass over this patch.  There is a lot 
> going on here, so I have focused only on a subset of the details.
>
>>
>> //===-- clang/Analysis/DefUse.h - DefUse analysis -------*- C++ -*-===//
>> //
>> //                     The LLVM Compiler Infrastructure
>> //
>> // This file is distributed under the University of Illinois Open Source
>> // License. See LICENSE.TXT for details.
>> //
>> //===----------------------------------------------------------------------===// 
>>
>> //
>> // This file contains the declaration of DefUse analysis headers
>> //
>> //===----------------------------------------------------------------------===// 
>>
>>
>> #ifndef DEF_USE_HPP_
>> #define DEF_USE_HPP_
>
> Please use "LLVM_CLANG_DEFUSE_H" to avoid any conflict with other 
> defuse implementations.
>
>> #include "llvm/ADT/DenseMap.h"
>>
>> #include "clang/AST/ASTConsumer.h"
>> #include "clang/AST/ParentMap.h"
>> #include "clang/AST/StmtVisitor.h"
>
> Looks fine.
>
>>
>> #include <exception>
>> #include <stdexcept>
>
> We don't use C++ exceptions at all in LLVM, largely for performance 
> reasons.
>
>> namespace clang {
>>
>> // forward definitions
>> class CFG;
>> class CFGBlock;
>>
>> namespace defuse {
>>
>> //===------------------------- Exceptions -------------------------===//
>>
>> struct NotDefExcpetion {
>>    Stmt const* stmt;
>>    NotDefExcpetion(Stmt const* stmt_): stmt(stmt_){}
>> };
>> struct NotUseExcpetion {
>>    DeclRefExpr const* expr;
>>    NotUseExcpetion(DeclRefExpr const* expr_): expr(expr_){}
>> };
>
> These will need to be replaced with an alternate mechanism.
>
>>
>> //===------------------------- DefUseNode -------------------------===//
>> // Keeps the information for a particular 'definition' or 'use' of a 
>> variable
>> // The structure is needed because a defuse chain can contains 
>> variable declarations
>> // as well as variable reference. As a DeclStmt can contains several 
>> declarations
>> // to address a particular variable we need to store its VarDecl or 
>> DeclRefExp
>> struct DefUseNode{
>>     enum NodeKind{ VarDecl, VarRef };
>>     enum UseKind{ Use, Def, UseDef };
>>
>>     DefUseNode(clang::VarDecl const* decl): var_decl(decl), 
>> kind(VarDecl), usage(Def){}
>>     DefUseNode(DeclRefExpr const* ref, UseKind u = Use): 
>> var_ref(ref), kind(VarRef), usage(u){}
>>
>>     clang::VarDecl const* getDecl() const;
>>
>>     NodeKind const& getKind()const { return kind; }
>>     UseKind const&  getUse() const { return usage; }
>>
>>     clang::VarDecl const* getVarDecl() const {
>>         assert(kind==VarDecl); return var_decl;
>>     }
>>     DeclRefExpr const* getVarRef() const {
>>         assert(kind==VarRef); return var_ref;
>>     }
>>
>>     bool operator==(DefUseNode const& n) const;
>>
>> private:
>>     // a def-use node can be either a VarDecl or a DeclRefExpr
>>     union{
>>         clang::VarDecl    const*     var_decl;
>>         DeclRefExpr const*         var_ref;
>>     };
>>     NodeKind     kind;
>>     UseKind     usage;
>> };
>
> Since this 'struct' has private data, please make it a class instead:
>
> http://llvm.org/docs/CodingStandards.html#ci_class_struct
>
>
>>
>> //===------------------------- Typedefs -------------------------===//
>> class DefUseBlock;
>> class VarDeclMap;
>> typedef std::vector<DefUseBlock>                DefUseData;
>> typedef llvm::DenseMap<Stmt const*, unsigned>     VarRefBlockMap;
>> typedef std::vector<DefUseNode>                    DefUseVect;
>> typedef std::vector<DefUseNode const*>             VarRefsVect;
>>
>> class DefUseBasicHelper;
>>
>> //===------------------------- DefUse -------------------------===//
>>
>> class DefUse{
>>     ASTContext const&        ctx;
>>     DefUseData const*         analysis_data;
>>     VarDeclMap const*        decl_map;
>>     VarRefBlockMap const*     block_map;
>>     unsigned const            num_cfg_blocks;
>>
>>     DefUse(ASTContext const& ctx_, DefUseData const* analysis_data_, 
>> VarDeclMap const* decl_map_,
>>             VarRefBlockMap const* block_map_, unsigned num_blocks):
>>         ctx(ctx_), analysis_data(analysis_data_), 
>> decl_map(decl_map_), block_map(block_map_),
>>         num_cfg_blocks(num_blocks){ }
>>
>>     bool isDef(DefUseNode const& n) const;
>>
>>     struct iterator_impl{
>
> To make MSVC happy, this should be a 'class'.
>
>>         DefUse const*         du;
>>         DefUseNode const*    node;
>>
>>         struct iter{
>>             int                             block_id;
>>             DefUseVect::const_iterator     block_it;
>>             iter(int blk_id): block_id(blk_id){}
>>         };
>>
>>         iterator_impl(): du(NULL){ }
>>         iterator_impl(DefUse const* du_): du(du_){ }
>>         iterator_impl& operator++() { return inc(false); }
>>         iterator_impl& operator++(int) { return inc(false); }
>>         virtual iterator_impl& inc(bool) = 0;
>>     };
>> public:
>>     class uses_iterator: public 
>> std::iterator<std::input_iterator_tag, DeclRefExpr, std::ptrdiff_t, 
>> DeclRefExpr const*>,
>>                         public iterator_impl{
>>         iter iter_ptr;
>>         bool inDefBlock;
>>
>>         uses_iterator(): iterator_impl(), iter_ptr(-1), 
>> inDefBlock(true){}
>>         uses_iterator(DefUse const* du, DefUseNode const& n);
>>         uses_iterator& inc(bool first);
>>         friend class DefUse;
>>     public:
>>         virtual bool operator!=(uses_iterator const& iter);
>
> Does this need to be virtual?
>
>>         virtual DeclRefExpr const* operator*();
>
> Does this need to be virtual?
>
>>     };
>>
>>     class defs_iterator: public 
>> std::iterator<std::input_iterator_tag, DefUseNode, std::ptrdiff_t, 
>> DefUseNode const*>,
>>                         public iterator_impl{
>>         struct iter_: public iter{
>>             VarRefsVect::const_iterator     reaches_it;
>>             iter_(int blk_id): iter(blk_id){}
>>         } iter_ptr;
>>         bool blockDef;
>>
>>         defs_iterator(): iterator_impl(), iter_ptr(-1), 
>> blockDef(false) {}
>>         defs_iterator(DefUse const* du, DeclRefExpr const& n);
>>         defs_iterator& inc(bool first);
>>         friend class DefUse;
>>     public:
>>         bool operator!=(defs_iterator const& iter);
>>         DefUseNode const* operator*();
>>     };
>>
>>     // USES //
>>     defs_iterator defs_begin(DeclRefExpr const* var) const;
>>     defs_iterator defs_end() const;
>
> Please add doxygen comments for these methods.
>
>>
>>     // DEFS //
>>     uses_iterator uses_begin(DeclRefExpr const* var) const;
>>     uses_iterator uses_begin(VarDecl const* var) const;
>>     uses_iterator uses_end() const;
>
> Please add doxygen comments for these methods.
>
>>
>>     bool isUse(DeclRefExpr const* var) const;
>>     bool isDef(DeclRefExpr const* var) const { return 
>> isDef(DefUseNode(var, DefUseNode::Def)); }
>>     bool isDef(VarDecl const* var) const { return 
>> isDef(DefUseNode(var)); }
>>
>>     ~DefUse();
>>
>>     static DefUse* BuildDefUseChains(Stmt* body, ASTContext *ctx, 
>> DefUseBasicHelper* helper = 0,
>>                                         CFG* cfg = 0, ParentMap* pm = 
>> 0, bool verbose = false,
>>                                         llvm::raw_ostream& out = 
>> llvm::outs());
>> };
>
> I think having a "defuse" namespace is fine (for all the special 
> typedefs, etc.), but it should only contain the implementation details 
> for DefUse.  The DefUse class should be in the clang namespace.
>
>>
>> //===------------------------- DefUseBasicHelper 
>> -------------------------===//
>
> It seems like DefUseBasicHelper allows a bunch of hooks for subclasses 
> via the 'HandleXXX' methods, but this level of hooks is not going to 
> fair well from a performance standpoint.  I believe that one base 
> class can implement most of the HandleXXX logic without making those 
> methods virtual (and thus dispense with the HandleXXX methods entirely 
> and just have VisitXXX methods).  I think most subclasses will need 
> very few hooks, if any at all.  I think it's worth identifying the one 
> or two methods that need to be virtual and then leave the rest of them 
> as non-virtual methods.
>
> Also, if this truly is the base class, how about calling it 
> 'DefUseHelper'?  That seems more succinct and clear from an interface 
> perspective.
>
>>
>> class DefUseBasicHelper: public StmtVisitor<DefUseBasicHelper>{
>>     class DefUseBasicHelperImpl;
>>     DefUseBasicHelperImpl*     pimpl;
>>
>>     void InitializeValues(DefUseData* data, VarRefBlockMap* bm, 
>> VarDeclMap* dm);
>>
>>     friend class DefUse;
>> public:
>>     DefUseNode::UseKind current_use;
>>     DefUseBasicHelper();
>>
>>     virtual void HandleDeclRefExpr(DeclRefExpr *DR); // remember to 
>> call the super class implementation of the method
>>     void HandleDeclStmt(DeclStmt *DS);
>>
>>     virtual void HandleBinaryOperator(BinaryOperator* B);
>>     virtual void HandleConditionalOperator(ConditionalOperator* C);
>>     virtual void HandleCallExpr(CallExpr* C);
>>     virtual void HandleUnaryOperator(UnaryOperator* U);
>>     virtual void HandleArraySubscriptExpr(ArraySubscriptExpr* AS);
>>     // ...
>
> Is there a need for the '// ..' line?
>
>>
>>     void VisitDeclRefExpr(DeclRefExpr *DR){ return 
>> HandleDeclRefExpr(DR); }
>>     void VisitDeclStmt(DeclStmt *DS){ return HandleDeclStmt(DS); }
>>     void VisitBinaryOperator(BinaryOperator* B){ return 
>> HandleBinaryOperator(B); }
>>     void VisitConditionalOperator(ConditionalOperator* C){ return 
>> HandleConditionalOperator(C); }
>>     void VisitCallExpr(CallExpr* C){ return HandleCallExpr(C); }
>>     void VisitUnaryOperator(UnaryOperator* U){ return 
>> HandleUnaryOperator(U); }
>>     void VisitArraySubscriptExpr(ArraySubscriptExpr* AS){ return 
>> HandleArraySubscriptExpr(AS); }
>>     // ...
>
> Is there a need for the '// ..' line?
>
>>
>>     void VisitCFGBlock(clang::CFGBlock const& blk, int root);
>
> The 'root' argument is not necessary and should  be removed (see 
> comment below).
>
>>     void VisitStmt(Stmt* S);
>>
>>     virtual ~DefUseBasicHelper();
>> };
>>
>> //===------------------------- DefUseChainTest 
>> -------------------------===//
>>
>> class DefUseChainTest: public StmtVisitor<DefUseChainTest>, public 
>> ASTConsumer {
>>     llvm::raw_ostream&        out;
>>     ASTContext*             ctx;
>>     DefUse const*            du;
>>
>> public:
>>     DefUseChainTest(llvm::raw_ostream& out_): out(out_), ctx(NULL), 
>> du(NULL) {    }
>>
>>     void Initialize(ASTContext& context) { ctx = &context; }
>>     void HandleTopLevelDecl(DeclGroupRef declGroupRef);
>>     // void HandleTranslationUnit(clang::ASTContext& context);
>>
>>     void VisitDeclRefExpr(DeclRefExpr *DR);
>>     void VisitDeclStmt(DeclStmt *DS);
>>     void VisitStmt(Stmt* S);
>> };
>>
>> void PrintVarDefs(DefUse const* DU, DeclRefExpr const* DR, 
>> ASTContext& ctx, llvm::raw_ostream& out);
>> void PrintVarUses(DefUse const* DU, DeclRefExpr const* DR, 
>> ASTContext& ctx, llvm::raw_ostream& out);
>> void PrintVarUses(DefUse const* DU, VarDecl const* VD, ASTContext& 
>> ctx, llvm::raw_ostream& out);
>>
>> } // end namespace defuse
>
> The DefUseChainTest can be made entirely private in DefUse.cpp, and 
> have a static function create it (and return an ASTConsumer).  There 
> is no need to have it in the public API interface.
>
>> } // end namespace clang
>>
>> #endif /* DEF_USE_HPP_ */
>
> Some compilers complain about the extra tokens after #endif.  Please 
> remove this comment.
>
>
>> //===-- clang/Analysis/DefUse.cpp - DefUse analysis -------*- C++ 
>> -*-===//
>> //
>> //                     The LLVM Compiler Infrastructure
>> //
>> // This file is distributed under the University of Illinois Open Source
>> // License. See LICENSE.TXT for details.
>> //
>> //===----------------------------------------------------------------------===// 
>>
>> //
>> // This file contains the implementation of DefUse analysis
>> //
>> //===----------------------------------------------------------------------===// 
>>
>>
>> #define __STDC_LIMIT_MACROS
>> #define __STDC_CONSTANT_MACROS
>
> What are these two defines used for?
>
>>
>> #include "DefUse.h"
>>
>> #include "clang/Basic/SourceManager.h"
>> #include "clang/Analysis/CFG.h"
>>
>> #include <llvm/Support/raw_ostream.h>
>>
>> #include <set>
>> #include <map>
>
> Eventually the code should probably use DenseSet and DenseMap for 
> performance reasons, as std::map and std::set are very slow and make 
> heavy use of malloc().  I understand that a significant amount of code 
> may need to be tweaked, but the performance gain can easily be 2-3x.
>
>>
>> using namespace clang;
>> using namespace defuse;
>>
>> // ASSERT MACRO
>> #define ASSERT(COND, MSG) do{ \
>>     std::string str; \
>>     llvm::raw_string_ostream ss(str); \
>>     ss << MSG; \
>>     assert((COND) && ss.str().c_str()); }while(false)
>
> I don't think this will do what you intend.  If I did:
>
> ASSERT(false, 10)
>
> I'd see:
>
>   "false && ss.str().c_str()"
>
> as the failed assertion condition in the terminal output.
>
> One option to get better error reporting is to use the facilities in 
> 'clang/Basic/PrettyStackTrace.h', which are used in various places in 
> Clang.  These can be used to print the location within the analyzed 
> source that the assertion fails.  In general we don't like specialized 
> error reporting machinery for just one file (as in the case of 
> 'ASSERT'), but like adding to the general error reporting machinery 
> that all of LLVM/Clang can use.
>
> I also don't think this ASSERTION macro adds much value and causes 
> some of the client code below to be structured in some slightly 
> suboptimal ways.
>
>>
>> // utility functions
>> unsigned Line(SourceLocation const& l, clang::SourceManager const& sm) {
>>     PresumedLoc pl = sm.getPresumedLoc(l);
>>     return pl.getLine();
>> }
>>
>> unsigned Column(SourceLocation const& l, SourceManager const& sm) {
>>     PresumedLoc pl = sm.getPresumedLoc(l);
>>     return pl.getColumn();
>> }
>
> Instead of adding these functions, we can add 'getPresumedLoc' to 
> FullSourceLoc and then use the 'getColumn()' and 'getLine()' methods 
> from that class.
>
>>
>> std::string PrintClangStmt(Stmt const* stmt, ASTContext const& ctx){
>>     std::string str;
>>     llvm::raw_string_ostream pp(str);
>>     stmt->printPretty(pp, 0, PrintingPolicy(ctx.getLangOptions()), 0);
>>     return pp.str();
>> }
>
> I don't think this function is needed.  If you need it, please declare 
> it static to keep it private to this file.
>
>
>>
>> //===------------------------- DefUseNode -------------------------===//
>> inline bool defuse::DefUseNode::operator==(DefUseNode const& n) const{
>>     if(n.usage != usage || n.kind != kind)
>>         return false;
>
> Please add a space between 'if' and the '('.  This is the coding style 
> used throughout LLVM.
>
> Also, please use 2 space indentation like the rest of LLVM and Clang.
>
>
>>     if(kind == VarDecl)
>>         return var_decl == n.var_decl;
>>     else
>>         return var_ref == n.var_ref;
>>     return false;
>> }
>
> The 'return false' is dead code.  Please remove it.
>
> This can be more succinctly written as:
>
>   return kind == VarDecl ? var_decl == n.var_decl : var_Ref == n.var_ref;
>
> Alternatively:
>
>   if (kind == VarDecl)
>     return var_decl == n.var_decl;
>
>   return var_ref == n.var_ref;
>
>> inline clang::VarDecl const* DefUseNode::getDecl() const{
>>     if(kind == VarRef)
>>         return cast<clang::VarDecl>(var_ref->getDecl());
>>     return cast<clang::VarDecl>(var_decl);
>> }
>
> Same as the previous comment.
>
>   if (kind == VarRef)
>     return cast<VarDecl>(var_ref->getDecl());
>
>   return cast<VarDecl>(var_decl);
>
> or:
>
>   return kind == VarRef ? cast<VarDecl>(var_ref->getDecl()) : 
> cast<VarDecl>(var_decl);
>
>>
>> typedef std::set<DefUseNode const*>             VarRefsSet;
>> typedef std::set<VarDecl const*>                 VarDeclsSet;
>> typedef std::map<VarDecl const*, DefUseVect>     DefUseChain;
>
> Please use llvm::DenseSet and llvm::DenseMap.  You'll need to use 
> pointers to DefUseVects instead (and then destroy them), but the speed 
> improvement will be significant.
>
>>
>> //===------------------------- DefUseBlock 
>> -------------------------===//
>>
>> struct defuse::DefUseBlock: public DefUseChain{
>
> This should be 'class' instead of 'struct'.  You declared it earlier 
> as a 'class'.  While gcc will eat this, MSVC will not.  Also, since 
> this isn't POD, it should be a class anyway.
>
>
>>     VarDeclsSet     uses;
>>     VarRefsVect     defsout;
>>     VarDeclsSet        killed;
>>     VarRefsVect     reaches;
>>
>>     class not_killed_set_iterator: public 
>> std::iterator<std::input_iterator_tag, DefUseNode, std::ptrdiff_t, 
>> DefUseNode const*>{
>>         DefUseBlock&                     block;
>>         VarRefsVect::const_iterator     ptr_it;
>>
>>         not_killed_set_iterator& inc(bool first){
>>             if(ptr_it != block.reaches.end() && !first)
>>                 ptr_it++;
>
> Please use '++ptr_it' instead of 'ptr_it++'.  While it makes no 
> difference in this case, general speaking it performs better and there 
> is no reason to make any assumptions about the underlying data 
> structure and how the iterators are implemented.
>
>>             while(ptr_it != block.reaches.end() &&
>>                     (block.killed.find((*ptr_it)->getDecl()) != 
>> block.killed.end() &&
>>                             std::find(block.defsout.begin(), 
>> block.defsout.end(), *ptr_it) == block.defsout.end()))
>>                 ptr_it++;
>>             return *this;
>>         }
>>     public:
>>         not_killed_set_iterator(DefUseBlock& block_, 
>> VarRefsVect::const_iterator ptr_it_): block(block_), ptr_it(ptr_it_){ 
>> inc(true); }
>>         not_killed_set_iterator& operator++() { return inc(false); }
>>         not_killed_set_iterator& operator++(int) { return inc(false); }
>
> Do we even need an operator++(int)?  By disallowing it people won't 
> accidentally use it.  This also isn't the standard semantics of this 
> operator, so by implementing it this way clients will silently be 
> doing the wrong thing instead of getting a compilation error.
>
>>         bool operator!=(not_killed_set_iterator const& iter) {    
>> return !((&iter.block) == &(block) && iter.ptr_it == ptr_it);    }
>
> This method can (and should) be declared const.
>
>>         const DefUseNode* const& operator*() const { return *ptr_it; }
>>     };
>>
>>     not_killed_set_iterator begin_not_killed_set(){    return 
>> DefUseBlock::not_killed_set_iterator(*this, reaches.begin()); }
>>     not_killed_set_iterator end_not_killed_set(){ return 
>> DefUseBlock::not_killed_set_iterator(*this, reaches.end()); }
>
> Please add a space between () and {
>
>> };
>>
>> //===------------------------- VarDeclMap -------------------------===//
>>
>> struct defuse::VarDeclMap: public StmtVisitor<VarDeclMap>, public 
>> llvm::DenseMap<VarDecl const*, DeclStmt const*> {
>
> LLVM has an 80 column coding convention:
>
> http://llvm.org/docs/CodingStandards.html#scf_codewidth
>
> Please wrap lines that exceed this length.
>
>>
>>     VarDeclMap(Stmt const* body){ VisitStmt(const_cast<Stmt*>(body)); }
>>
>>     void VisitDeclStmt(DeclStmt *DS){
>>         for(DeclStmt::decl_iterator it = DS->decl_begin(); it != 
>> DS->decl_end(); it++){
>>             if((*it)->getKind() != Decl::Var)
>>                 continue;
>
> Use dyn_cast instead of getKind.
>
>>             insert(std::make_pair(cast<VarDecl>(*it), DS));
>>         }
>>     }
>>     void VisitStmt(Stmt* S){
>>         for(Stmt::child_iterator I = S->child_begin(), E = 
>> S->child_end(); I != E; ++I)
>>             if(*I) Visit(*I);
>>     }
>> };
>
> Please make this is a class, as the earlier declaration was also a class.
>
> Please also add a space between 'for' and '('.  This is the LLVM 
> coding convention.
>
>>
>> //===------------------------- DefUseBasicHelper 
>> -------------------------===//
>>
>> struct defuse::DefUseBasicHelper::DefUseBasicHelperImpl{
>>     DefUseData*            data;
>>     VarRefBlockMap*        bm;
>>     VarDeclMap*            dm;
>>     unsigned             blockID;
>>
>>     DefUseBasicHelperImpl(): data(NULL), bm(NULL), dm(NULL){}
>> };
>> defuse::DefUseBasicHelper::DefUseBasicHelper(): pimpl(new 
>> DefUseBasicHelperImpl), current_use(DefUseNode::Use){}
>> defuse::DefUseBasicHelper::~DefUseBasicHelper(){ delete pimpl; }
>>
>> void defuse::DefUseBasicHelper::InitializeValues(DefUseData* data, 
>> VarRefBlockMap* bm, VarDeclMap* dm){
>>     pimpl->data = data;    pimpl->bm = bm;    pimpl->dm = dm;
>> }
>>
>> void defuse::DefUseBasicHelper::HandleDeclRefExpr(DeclRefExpr *DR){
>>     VarRefBlockMap& bm = *pimpl->bm;
>>     unsigned blockID = pimpl->blockID;
>>     DefUseBlock& block_data = (*pimpl->data)[blockID];
>>     if(DR->getDecl()->getKind() == Decl::Var || 
>> DR->getDecl()->getKind() == Decl::ParmVar){
>
> The condition is overly verbose.  We also discourage the use of 
> 'getKind' unless you are using it in a switch statement.
>
> Suggestion:
>
>   Decl *D = DR->getDecl();
>   if (isa<VarDecl>(D)) {
>    ...
>   }
>
> isa<> uses the value returned from VarDecl::classof() to determine if 
> 'D' is a VarDecl.
>
>>         VarDecl* VD = cast<VarDecl>(DR->getDecl());
>
> Actually, instead of using isa, here is a better style:
>
> if (VarDecl *VD = dyn_cast<VarDecl>(DR->getDecl()) {
>   ...
> }
>
> The dyn_cast combines 'isa' with 'cast', and returns a NULL pointer if 
> isa would return false.  This is the encouraged style.
>
>
>>         // map this variable to the current block
>>         bm[DR] = blockID;
>>         if(block_data.find(VD) == block_data.end()){
>>             block_data.insert(std::make_pair(VD, DefUseVect()));
>>             block_data.uses.insert(VD); // this variable is used in 
>> the current block and it has
>>                                           // no prior definition 
>> within the block
>
> If you end up using a DenseMap, this code can be made faster by 
> avoiding the double lookup (the first for block_data.find() and the 
> second for the insertion):
>
>    DefUseBlock *&B = block_data[VD];
>    if (!B) {
>     B = new DefUseBlock();
>     B->users.insert(VD);
>   }
>
> Using a DenseMap (which involves changing the data value to a 
> pointer), the first line involves a quick hashtable lookup and creates 
> a NULL pointer entry if the key could not be found.  The code that 
> follows conjures up the new DefUseBlock without having to do a second 
> search of the map.
>
>>         }
>>         if(current_use == DefUseNode::UseDef){ // in case of compound 
>> operators
>>             block_data[VD].push_back(DefUseNode(DR, DefUseNode::Use));
>>             block_data[VD].push_back(DefUseNode(DR, DefUseNode::Def));
>
> These lines cause two identical (and redundant) lookups to 
> block_data.  Consider:
>
>   DefUseBlock &B = block_data[VD];
>   B.push_back(...);
>   B.push_back(...);
>
>>         }else
>>             block_data[VD].push_back(DefUseNode(DR, current_use));
>>     }
>> }
>
> Please add spaces between 'if' and '(' and 'else' and '{'.
>
>>
>> void defuse::DefUseBasicHelper::HandleDeclStmt(DeclStmt *DS){
>>     for(DeclStmt::decl_iterator I = DS->decl_begin(), E = 
>> DS->decl_end(); I != E; I++){
>>         if((*I)->getKind() != Decl::Var)
>>             continue;
>>         VarRefBlockMap& bm = *pimpl->bm;
>>         VarDeclMap& dm = *pimpl->dm;
>>         unsigned blockID = pimpl->blockID;
>>         DefUseBlock& block_data = (*pimpl->data)[blockID];
>>
>>         VarDecl* VD = cast<VarDecl>(*I);
>>         if(block_data.find(VD) != block_data.end())
>>             ASSERT(true, "Variable '" << VD->getNameAsString() << "' 
>> redefined in the same block");
>
> Why not do:
>
>    assert(block_data.find(VD) != block_data.end() && "Variable 
> redefined in the same block");
>
> It is far more succinct, doesn't add more code in a Release build (as 
> ASSERT does), and if we really care what VD is we can run this code in 
> the debugger (which we would likely need to do anyway).  It also 
> reduces the number of lookups in the map in a Release build from two 
> to one.
>
>> void defuse::DefUseBasicHelper::HandleBinaryOperator(BinaryOperator* B){
>>     DefUseNode::UseKind backup = current_use; // backup the usage
>>     if(B->isAssignmentOp()){
>>         current_use = DefUseNode::Use;
>>         Visit(B->getRHS());
>>         current_use = DefUseNode::Def;
>>         if(B->isCompoundAssignmentOp())
>>             current_use = DefUseNode::UseDef;
>>         Visit(B->getLHS());
>>     }else{
>>         Visit(B->getLHS());
>>         current_use = DefUseNode::Use;
>>         Visit(B->getRHS());
>>     }
>>     current_use = backup; // write back the usage to the current usage
>> }
>>
>> void 
>> defuse::DefUseBasicHelper::HandleConditionalOperator(ConditionalOperator* 
>> C){    }
>>
>> void defuse::DefUseBasicHelper::HandleCallExpr(CallExpr* C) {
>>     DefUseNode::UseKind backup = current_use; // backup the usage
>>     for(CallExpr::arg_iterator I = C->arg_begin(), E = C->arg_end(); 
>> I != E; ++I){
>>         if((*I)->getType()->isPointerType() || 
>> (*I)->getType()->isReferenceType())
>>             current_use = DefUseNode::Def;
>>         Visit(*I);
>>         current_use = backup;
>>     }
>> }
>>
>> void defuse::DefUseBasicHelper::HandleUnaryOperator(UnaryOperator* U){
>>     DefUseNode::UseKind backup = current_use; // backup the usage
>>     switch(U->getOpcode()){
>>     case UnaryOperator::PostInc:
>>     case UnaryOperator::PostDec:
>>     case UnaryOperator::PreInc:
>>     case UnaryOperator::PreDec:
>>         current_use = DefUseNode::UseDef;
>>         break;
>>     case UnaryOperator::Plus:
>>     case UnaryOperator::Minus:
>>     case UnaryOperator::Not:
>>     case UnaryOperator::LNot:
>>         current_use = DefUseNode::Use;
>>         break;
>>     case UnaryOperator::AddrOf:
>>     case UnaryOperator::Deref:
>>         // use the current_use
>>         break;
>>     default:
>>         // DEBUG("Operator " << 
>> UnaryOperator::getOpcodeStr(U->getOpcode()) << " not supported in 
>> def-use analysis");
>>         break;
>>     }
>>     Visit(U->getSubExpr());
>>     current_use = backup; // write back the usage to the current usage
>> }
>>
>> void 
>> defuse::DefUseBasicHelper::HandleArraySubscriptExpr(ArraySubscriptExpr* 
>> AS){
>>     DefUseNode::UseKind backup = current_use; // backup the usage
>>     Visit(AS->getBase());
>>     current_use = DefUseNode::Use;
>>     Visit(AS->getIdx());
>>     current_use = backup; // write back the usage to the current usage
>> }
>>
>> void defuse::DefUseBasicHelper::VisitCFGBlock(CFGBlock const& blk, 
>> int root){
>>     pimpl->blockID = blk.getBlockID();
>>     for(CFGBlock::const_iterator I = blk.begin(), E = blk.end(); I != 
>> E; I++)
>>         Visit(*I);
>>
>>     unsigned blockID = pimpl->blockID;
>>     DefUseBlock& block_data = (*pimpl->data)[blockID];
>>
>>     // Post processing
>>     // build up the uses(b), defsout(b) and killed(b)
>>
>>     // for each variable
>>     for(DefUseBlock::iterator I = block_data.begin(), E = 
>> block_data.end(); I != E; ++I){
>>         // in case of references to ParmVar what we do is insert the 
>> variable definition in the first block of the CFG
>>         DefUseBlock& root_data = (*pimpl->data)[root];
>>        if(I->first->getKind() == Decl::ParmVar &&
>>                std::find(root_data[I->first].begin(), 
>> root_data[I->first].end(), DefUseNode(I->first)) == 
>> root_data[I->first].end()){
>>            root_data[I->first].push_back(DefUseNode(I->first));
>>            root_data.defsout.push_back(&root_data[I->first].back());
>>        }
>>         // if this is a local variable, should't be in the 
>> defsout,killed sets
>>         // defsout is calculated by iterating backwards to find the 
>> last use of the variable
>>         for(DefUseVect::const_reverse_iterator UI = 
>> I->second.rbegin(), UE = I->second.rend(); UI != UE; UI++)
>>             if(UI->getUse() == DefUseNode::Def && 
>> (block_data.defsout.empty() || block_data.defsout.back()->getDecl() 
>> != I->first)){
>>                 block_data.defsout.push_back(&(*UI));
>>                 block_data.killed.insert(UI->getDecl());
>>                 break;
>>             }
>>     }
>> }
>>
>> void defuse::DefUseBasicHelper::VisitStmt(Stmt* S){
>>     for(Stmt::child_iterator I = S->child_begin(), E = 
>> S->child_end(); I != E; ++I)
>>         if(*I) Visit(*I);
>> }
>>
>> bool ContainsStmt(Stmt const* block, Stmt const* stmt){
>>     if(block == stmt)
>>         return true;
>>     for(Stmt::const_child_iterator I = block->child_begin(), E = 
>> block->child_end(); I != E; ++I)
>>         if(*I){
>>             if(ContainsStmt(*I, stmt))    return true;
>>         }
>
> How about:
>
>   if (*I && ContainsStmt(*I, stmt))
>     return true;
>
>>     return false;
>> }
>>
>> Stmt const* GetEnclosingBlock(ParentMap const& pm, Stmt const* stmt){
>>     Stmt const* E = stmt;
>>     while(E){
>>         Stmt const* cond = NULL;
>>         switch(E->getStmtClass()){
>>         case Stmt::IfStmtClass:
>>             cond = cast<IfStmt>(E)->getCond();
>>             break;
>>         case Stmt::WhileStmtClass:
>>             cond = cast<WhileStmt>(E)->getCond();
>>             break;
>>         case Stmt::DoStmtClass:
>>             cond = cast<DoStmt>(E)->getCond();
>>             break;
>>         case Stmt::ForStmtClass:
>>             cond = cast<ForStmt>(E)->getCond();
>>             break;
>>         case Stmt::SwitchStmtClass:
>>             cond = cast<SwitchStmt>(E)->getCond();
>>             break;
>>         case Stmt::CompoundStmtClass:
>>             return E;
>>         default:
>>             break;
>>         }
>>         if(cond && !ContainsStmt(cond, stmt))
>>             return E;
>>         E = pm.getParent(E);
>>     }
>>     return NULL;
>> }
>>
>> void RemoveLocalDefs(CFG const& cfg, ParentMap const& pm, VarDeclMap& 
>> decl_vars, DefUseData& data){
>>     for(CFG::const_reverse_iterator it = cfg.rbegin(), cend = 
>> cfg.rend(); it != cend; it++) {
>>         // every time we find a node with more than 1 predecessor it 
>> means
>>         // we are at a junction point and the local variable defined 
>> in the previuos
>>         // blocks must be removed from the defsout set
>>         if(it->pred_size() <= 1)
>>             continue;
>>         // for each predecessor
>>         for(CFGBlock::const_pred_iterator pit = it->pred_begin(), end 
>> = it->pred_end(); pit != end && (*pit)->size(); pit++){
>>             DefUseBlock& block_data = data[(*pit)->getBlockID()];
>>             for(VarRefsVect::iterator dit = 
>> block_data.defsout.begin(); dit != block_data.defsout.end(); dit++){
>>                 // we have to be sure the element we are using to see 
>> if the Decl is in the same block of block statements
>>                 // is also present in the AST, so if the first 
>> statement of the CFG block is a VarDecl, we use the VarDeclMap
>>                 // to retrieve the corresponding AST node
>>                 Stmt const* testStmt = (*pit)->front();
>>                 if(testStmt->getStmtClass() == Stmt::DeclStmtClass)
>>                     testStmt = 
>> decl_vars[cast<VarDecl>(cast<DeclStmt>(testStmt)->getSingleDecl())];
>
> Use 'dyn_cast' or 'isa' instead of 'getStmtClass()'
>
>
>>                 // if the variable is local to one of the prec blocks 
>> and we are at a junction point
>>                 // it means the variable must be removed from the 
>> defsout set
>>                 if(GetEnclosingBlock(pm, 
>> decl_vars[(*dit)->getDecl()]) == GetEnclosingBlock(pm, testStmt)){
>>                     // it can be the case the following block is in 
>> the same scope of the previous block (for example if stmt without else)
>>                     Stmt const* testStmt = it->front();
>>                     if(testStmt->getStmtClass() == Stmt::DeclStmtClass)
>>                         testStmt = 
>> decl_vars[cast<VarDecl>(cast<DeclStmt>(testStmt)->getSingleDecl())];
>
> Use 'dyn_cast' or 'isa' instead of 'getStmtClass()'
>
>
>>                     if(GetEnclosingBlock(pm, 
>> decl_vars[(*dit)->getDecl()]) != GetEnclosingBlock(pm, testStmt)){
>>                         block_data.defsout.erase(dit--);
>>                     }
>>                 }
>>             }
>>         }
>>     }
>> }
>>
>> // implements reaching definitions according to Kennedy's book 
>> iterative algorithm
>> void ComputeReaches(CFG const& cfg, DefUseData& data){
>>     bool changed = true;
>>     while(changed){
>>         changed = false;
>>         VarRefsSet newreaches;
>>         for(CFG::const_reverse_iterator it = cfg.rbegin(); it != 
>> cfg.rend(); it++) {
>>             DefUseBlock& BlockData = data[it->getBlockID()];
>>             newreaches = VarRefsSet();
>>             for(CFGBlock::const_pred_iterator pit = it->pred_begin(), 
>> end = it->pred_end(); pit != end; pit++){
>>                 unsigned p = (*pit)->getBlockID();
>>                 // s(p) = reaches(p) - killed(p)
>>                 VarRefsSet s, temp_set, temp_set_1;
>>                 copy(data[p].begin_not_killed_set(), 
>> data[p].end_not_killed_set(), inserter(s, s.begin()));
>>                 // tmp_set(p) = defsout(p) U s(p)
>>                 set_union(s.begin(), s.end(), 
>> data[p].defsout.begin(), data[p].defsout.end(), inserter(temp_set, 
>> temp_set.begin()));
>>                 // tmp_set_1 = newreaches U tmp_set(p)
>>                 set_union(newreaches.begin(), newreaches.end(), 
>> temp_set.begin(), temp_set.end(), inserter(temp_set_1, 
>> temp_set_1.begin()));
>>                 newreaches.swap(temp_set_1);
>>             }
>>             if(BlockData.reaches.size() != newreaches.size()){
>>                 BlockData.reaches = VarRefsVect(newreaches.begin(), 
>> newreaches.end());
>>                 changed = true;
>>             }
>>         }
>>     }
>> }
>>
>> //===------------------------- Printing utilities 
>> -------------------------===//
>>
>> struct printer: public std::iterator<std::input_iterator_tag, 
>> printer, std::ptrdiff_t, printer>{
>>     printer(llvm::raw_ostream& x, ASTContext const& ctx_, const char* 
>> s = "") : o(x), ctx(ctx_), delim(s) { }
>>     template<typename T>
>>     printer& operator=(const T& x);
>>     printer& operator*() { return *this; }
>>     printer& operator++() { return *this; }
>>     printer& operator++(int) { return *this; }
>>
>>     mutable llvm::raw_ostream& o;
>>     ASTContext const& ctx;
>>     const char* delim;
>> };
>>
>> template<>
>> printer& printer::operator=<VarDecl const*>(const VarDecl* const& x){
>>     o << x->getNameAsString() << delim;
>>     return *this;
>> }
>>
>> template<>
>> printer& printer::operator=<DefUseNode const*>(const DefUseNode* 
>> const& x){
>>     SourceLocation loc;
>>     if(x->getKind() == DefUseNode::VarRef)
>>         loc = x->getVarRef()->getLocStart();
>>     else
>>         loc = x->getVarDecl()->getTypeSpecStartLoc();
>>     o << x->getDecl()->getNameAsString() << " " << "(" << Line(loc, 
>> ctx.getSourceManager()) << ","
>>        << Column(loc, ctx.getSourceManager()) << " "
>>        << ((x->getUse() == DefUseNode::Use)?"USE":"DEF") << ")" << 
>> delim;
>>     return *this;
>> }
>>
>> template<>
>> printer& printer::operator=<DefUseNode>(DefUseNode const& x){ return 
>> this->operator=(&x); }
>>
>> void printDefUseData(llvm::raw_ostream& out, ASTContext const& ctx, 
>> DefUseData const& data){
>>     out << "--------------- PRINTING ANALYSIS DATA 
>> ----------------------\n";
>>     unsigned blockID = data.size() - 1;
>>     for(DefUseData::const_reverse_iterator it = data.rbegin(); it != 
>> data.rend(); it++, blockID--){
>>         out << "* BlockID '" << blockID << "' uses vars:\n";
>>         for(DefUseBlock::const_iterator pit = it->begin(); pit != 
>> it->end(); pit++){
>>             out << "'" << pit->first->getNameAsString() << "':\n\t";
>>             std::copy( pit->second.begin(), pit->second.end(), 
>> printer( out, ctx, ", " ) );
>>             out << "\n";
>>         }
>>     }
>>
>>     blockID = data.size() - 1;
>>     for(DefUseData::const_reverse_iterator it = data.rbegin(); it != 
>> data.rend(); it++, blockID--) {
>>         DefUseBlock const& BlockData = *it;
>>         out << "---------------- Block: " << blockID << " 
>> ----------------\n";
>>         out << "# uses(" << blockID << "): {";
>>         copy( BlockData.uses.begin(), BlockData.uses.end(), printer( 
>> out, ctx, ", " ) );
>>         out << "}\n";
>>         out << "# defsout(" << blockID << "): {";
>>         copy( BlockData.defsout.begin(), BlockData.defsout.end(), 
>> printer( out, ctx, ", " ) );
>>         out << "}\n";
>>         out << "# killed(" << blockID << "): {";
>>         copy( BlockData.killed.begin(), BlockData.killed.end(), 
>> printer( out, ctx, ", " ) );
>>         out << "}\n";
>>         out << "# reaches(" << blockID << "): {";
>>         copy( BlockData.reaches.begin(), BlockData.reaches.end(), 
>> printer( out, ctx, ", " ) );
>>         out << "}\n";
>>     }
>> }
>>
>> //===------------------------- DefUse -------------------------===//
>>
>> DefUse::~DefUse(){ delete analysis_data; delete decl_map; delete 
>> block_map; }
>>
>> bool isDef(ASTContext const& ctx, DefUseNode const& n, int* blockID, 
>> DefUseVect::const_iterator* iter,
>>         DefUseData const* analysis_data, VarDeclMap const* decl_map, 
>> VarRefBlockMap const*  block_map, Stmt const** stmt = NULL){
>>     // retrieve the blockID of the CFG where the statement N (which 
>> can be wither a VarDecl or a DeclRefExpr) is placed
>>     Stmt const* _stmt_;
>>     if(n.getKind() == DefUseNode::VarDecl){
>>         VarDecl const* vd = n.getVarDecl();
>>         VarDeclMap::const_iterator it = decl_map->find(vd);
>>         // we use the decl_map to translate the VarDecl to the 
>> corresponding DeclStmt
>>         // this is useful when DeclStmts from the CFG (which usually 
>> are different from the original ones) are used
>>         ASSERT(it != decl_map->end(), "Variable declaration for '" << 
>> vd->getNameAsString() << "' not mapped to any statement in the syntax 
>> tree");
>>         _stmt_ = it->second;
>>     }else
>>         _stmt_ = n.getVarRef();
>>     assert(_stmt_ != NULL);
>>     VarRefBlockMap::const_iterator it = block_map->find(_stmt_);
>>     ASSERT(it != block_map->end(), "Statement '" << 
>> PrintClangStmt(_stmt_, ctx) << "' not mapped to any block of the CFG");
>>
>>     // We have to find the definition inside the block
>>     DefUseBlock::const_iterator data_it = 
>> (*analysis_data)[it->second].find(n.getDecl());
>>     // if we don't find the variable among the variable used or 
>> defined in the block it means something went wrong
>>     ASSERT(data_it != (*analysis_data)[it->second].end(), "Block ID: 
>> " << it->second << " doesn't contains usage or definitions of 
>> variable " << n.getDecl()->getNameAsString());
>>
>>     // we find a Variable definition, now we have to iterate through 
>> the DefUseNodes in order to find the
>>     // the node corresponding to the n variable passed by the user
>>     DefUseVect::const_iterator _iter_ = 
>> std::find(data_it->second.begin(), data_it->second.end(), n);
>>     if(blockID)        *blockID = it->second;
>>     if(iter)        *iter = _iter_;
>>     if(stmt)        *stmt = _stmt_;
>>     return _iter_ != data_it->second.end();
>> }
>>
>> bool isUse(ASTContext const& ctx, clang::DeclRefExpr const* var, int* 
>> blockID, DefUseVect::const_iterator* iter,
>>         DefUseData const* analysis_data, VarRefBlockMap const*  
>> block_map){
>>     Stmt const* stmt = var;
>>     VarRefBlockMap::const_iterator it = block_map->find(stmt);
>>     ASSERT(it != block_map->end(), "Statement " << 
>> PrintClangStmt(stmt, ctx) << " not mapped to any block of the CFG!");
>>
>>     DefUseBlock::const_iterator data_it = 
>> (*analysis_data)[it->second].find(cast<VarDecl>(var->getDecl()));
>>     ASSERT(data_it != (*analysis_data)[it->second].end(),
>>         "Block ID: " << it->second << " doesn't contains usage or 
>> definitions of variable " << var->getDecl()->getNameAsString());
>>
>>     DefUseVect::const_iterator _iter_ = 
>> std::find(data_it->second.begin(), data_it->second.end(), 
>> DefUseNode(var, DefUseNode::Use));
>>     if(blockID)        *blockID = it->second;
>>     if(iter)        *iter = _iter_;
>>     return _iter_ != data_it->second.end();
>> }
>>
>> //===------------------------- uses_iterator 
>> -------------------------===//
>>
>> DefUse::uses_iterator::uses_iterator(DefUse const* du, DefUseNode 
>> const& n):
>>     DefUse::iterator_impl(du), iter_ptr(-1), inDefBlock(true)
>> {
>>     Stmt const* stmt = NULL;
>>     if(!::isDef(du->ctx, n, &iter_ptr.block_id, &iter_ptr.block_it, 
>> du->analysis_data, du->decl_map, du->block_map, &stmt))
>>         throw NotDefExcpetion(stmt);     // there is no definition 
>> for the DeclRefExpr passed by the user
>>                                         // this means he trying to 
>> list the uses of a declrefexpr which is a use!
>>                                         // an excpetion is thrown.
>>     node = &(*(iter_ptr.block_it));
>>     inc(true);
>> }
>>
>> bool DefUse::uses_iterator::operator!=(uses_iterator const& iter){
>>     return !(iter_ptr.block_id == iter.iter_ptr.block_id && 
>> iter.iter_ptr.block_it ==  iter.iter_ptr.block_it);
>> }
>>
>> DeclRefExpr const* DefUse::uses_iterator::operator*() {
>>     assert(iter_ptr.block_it->getKind() == DefUseNode::VarRef);
>>     return iter_ptr.block_it->getVarRef();
>> }
>>
>> DefUse::uses_iterator& DefUse::uses_iterator::inc(bool first){
>>     if(iter_ptr.block_id == -1)        return *this;
>>     DefUseBlock const& m = (*du->analysis_data)[iter_ptr.block_id];
>>     DefUseBlock::const_iterator it = m.find(node->getDecl());
>>     assert(it != m.end());
>>     if(iter_ptr.block_it != it->second.end())
>>         iter_ptr.block_it++;
>>
>>     if(iter_ptr.block_it != it->second.end() && 
>> iter_ptr.block_it->getDecl() == node->getDecl()){
>>         if(iter_ptr.block_it->getUse() == DefUseNode::Use)
>>             return *this; // this is a use of the variable
>>         else if(inDefBlock){
>>             // found a new definition, stop iterating
>>             iter_ptr.block_it = DefUseVect::const_iterator();
>>             iter_ptr.block_id = -1;
>>             return *this;
>>         }
>>     }
>>     // if variable in defsout we have to check successor blocks
>>     if(inDefBlock && std::find(m.defsout.begin(), m.defsout.end(), 
>> node) == m.defsout.end()){
>>         iter_ptr.block_it = DefUseVect::const_iterator();
>>         iter_ptr.block_id = -1; // it means there are no other uses 
>> of this variable
>>         return *this;
>>     }
>>     // We have to change block
>>     // the following code has the precondition the blocks are 
>> numbered in a decrescent order
>>     // we lock into the block_id and see if the decl is in the 
>> reaches() set or not
>>     int new_block = iter_ptr.block_id;
>>     while(--new_block >= 0){
>>         DefUseBlock const& m = (*du->analysis_data)[new_block];
>>         if(std::find(m.reaches.begin(), m.reaches.end(), node) != 
>> m.reaches.end()){
>>             inDefBlock = false;
>>             DefUseBlock::const_iterator data_it = 
>> (*du->analysis_data)[new_block].find(node->getDecl());
>>             if(data_it != (*du->analysis_data)[new_block].end())
>>                 if(data_it->second.begin()->getUse() == 
>> DefUseNode::Use){
>>                     iter_ptr.block_id = new_block;
>>                     iter_ptr.block_it = data_it->second.begin();
>>                     return *this;
>>                 }
>>         }
>>     }
>>     iter_ptr.block_it = DefUseVect::const_iterator();
>>     iter_ptr.block_id = -1;
>>     return *this;
>> }
>>
>> //===------------------------- defs_iterator 
>> -------------------------===//
>>
>> DefUse::defs_iterator::defs_iterator(DefUse const* du, DeclRefExpr 
>> const& n):
>>     iterator_impl(du), iter_ptr(-1), blockDef(false)
>> {
>>     if(!::isUse(du->ctx, &n, &iter_ptr.block_id, &iter_ptr.block_it, 
>> du->analysis_data, du->block_map))
>>         throw NotUseExcpetion(&n); // the DeclRefExpr passed by the 
>> user is never used in this block but only defined,
>>                                    // the user is trying to iterate 
>> through the definitions of a definition... which doesn't make sense
>>                                    // an exception must be thrown
>>     iter_ptr.reaches_it = 
>> (*du->analysis_data)[iter_ptr.block_id].reaches.begin();
>>     node = &(*(iter_ptr.block_it));
>>     inc(true);
>> }
>> bool DefUse::defs_iterator::operator!=(defs_iterator const& iter){
>>     return !(iter_ptr.block_id == iter.iter_ptr.block_id &&
>>             iter.iter_ptr.block_it ==  iter.iter_ptr.block_it &&
>>             iter.iter_ptr.reaches_it == iter_ptr.reaches_it);
>> }
>>
>> DefUseNode const* DefUse::defs_iterator::operator*() {
>>     DefUseBlock::const_iterator it = 
>> (*du->analysis_data)[iter_ptr.block_id].find(node->getDecl());
>>     assert(it != (*du->analysis_data)[iter_ptr.block_id].end());
>>     if(blockDef)
>>         return &(*iter_ptr.block_it);
>>     return *iter_ptr.reaches_it;
>> }
>>
>> DefUse::defs_iterator::defs_iterator& DefUse::defs_iterator::inc(bool 
>> first){
>>     if(iter_ptr.block_id == -1)        return *this;
>>     // look inside reaches vector
>>     DefUseBlock::const_iterator it = 
>> (*du->analysis_data)[iter_ptr.block_id].find(node->getDecl());
>>     assert(it != (*du->analysis_data)[iter_ptr.block_id].end());
>>     if(first){
>>         // we have to find a def in the block, if exist
>>         for(; iter_ptr.block_it >= it->second.begin(); 
>> iter_ptr.block_it--)
>>             if(iter_ptr.block_it->getUse() == DefUseNode::Def){
>>                 blockDef = true;
>>                 break;
>>             }
>>         // the def is in the block, so it means we don't have to look 
>> any further
>>         // next time the iterator is incremented we return an empty one
>>         if(blockDef)
>>             return *this;
>>
>>     }else if(blockDef){
>>         iter_ptr.block_it = DefUseVect::const_iterator();
>>         iter_ptr.reaches_it = VarRefsVect::const_iterator();
>>         iter_ptr.block_id = -1;
>>         return *this;
>>     }
>>     // we didn't find a def in the block, so we have to look at the 
>> reaches set
>>     VarRefsVect const& reaches_v = 
>> (*du->analysis_data)[iter_ptr.block_id].reaches;
>>     if(!first && iter_ptr.reaches_it != reaches_v.end())
>>         iter_ptr.reaches_it++;
>>     for(; iter_ptr.reaches_it != reaches_v.end(); iter_ptr.reaches_it++)
>>         if((*iter_ptr.reaches_it)->getDecl() == node->getDecl())
>>             return *this;
>>
>>     iter_ptr.block_it = DefUseVect::const_iterator();
>>     iter_ptr.reaches_it = VarRefsVect::const_iterator();
>>     iter_ptr.block_id = -1;
>>     return *this;
>> }
>>
>> DefUse::uses_iterator defuse::DefUse::uses_begin(clang::DeclRefExpr 
>> const* var) const{
>>     return DefUse::uses_iterator(this, DefUseNode(var, 
>> DefUseNode::Def));
>> }
>> DefUse::uses_iterator defuse::DefUse::uses_begin(clang::VarDecl 
>> const* var) const{
>>     return DefUse::uses_iterator(this, DefUseNode(var));
>> }
>> DefUse::uses_iterator defuse::DefUse::uses_end() const{ return 
>> DefUse::uses_iterator(); }
>>
>> DefUse::defs_iterator defuse::DefUse::defs_begin(clang::DeclRefExpr 
>> const* var) const{
>>     return DefUse::defs_iterator(this, *var);
>> }
>> DefUse::defs_iterator defuse::DefUse::defs_end() const{ return 
>> DefUse::defs_iterator(); }
>>
>> bool defuse::DefUse::isUse(clang::DeclRefExpr const* var) const { 
>> return ::isUse(ctx, var, NULL, NULL, analysis_data, block_map); }
>>
>> bool defuse::DefUse::isDef(DefUseNode const& n) const { return 
>> ::isDef(ctx, n, NULL, NULL, analysis_data, decl_map, block_map); }
>>
>> //===------------------------- BuildDefUseChains 
>> -------------------------===//
>>
>> DefUse* defuse::DefUse::BuildDefUseChains(Stmt* body, ASTContext 
>> *ctx, DefUseBasicHelper* helper,
>>        CFG* cfg, ParentMap* pm, bool verbose, llvm::raw_ostream& out)
>> {
>>     bool cleanCFG = false, cleanParentMap = false, cleanHelper = false;
>>     if(!cfg){
>>         cfg = CFG::buildCFG(body, ctx);
>>         cleanCFG = true;
>>     }
>>     if(!pm){
>>         pm = new ParentMap(body);
>>         cleanParentMap = true;
>>     }
>>     DefUseData*    data = new DefUseData(cfg->getNumBlockIDs());
>>
>>     VarDeclMap*    dm = new VarDeclMap(body); // map VarDecl to AST 
>> nodes - this is needed as when the CFG is created by CLANG
>>                                            // DeclStmt with several 
>> declarations are rewritten into separate DeclStmts
>>
>>     VarRefBlockMap* bm = new VarRefBlockMap;      // map each VarRef 
>> to the CFG block where it has been used
>>
>>     if(verbose)
>>         cfg->dump(ctx->getLangOptions());
>>
>>     if(!helper){
>>         helper = new DefUseBasicHelper;
>>         cleanHelper = true;
>>     }
>>     helper->InitializeValues(data, bm, dm);
>>
>>     for(CFG::const_iterator it = cfg->begin(); it != cfg->end(); it++)
>>         helper->VisitCFGBlock(*it, cfg->getNumBlockIDs()-1);
>
> You should not rely on the root being number 'getNumBlockIDs() - 1'.  
> This actually is not an invariant you should depend on.  If you want 
> to identify the entry block, use 'CFG::getEntry()'.
>
>>
>>     // remove from defsout local variables
>>     RemoveLocalDefs(*cfg, *pm, *dm, *data);
>>     // Calculates the reaches set for each block of the CFG
>>     ComputeReaches(*cfg, *data);
>>
>>     if(verbose)
>>         printDefUseData(out, *ctx, *data);
>>
>>     unsigned block_id = cfg->getNumBlockIDs();
>>     if(cfg && cleanCFG)
>>         delete cfg;
>>     if(pm && cleanParentMap)
>>         delete pm;
>>     if(helper && cleanHelper)
>>         delete helper;
>>     return new DefUse(*ctx, data, dm, bm, block_id);
>> }
>
> Instead of BuildDefUseChains building its own CFG and ParentMap, have 
> these be passed in from the client.  This allows reuse of existing 
> data that has likely been constructed for other uses.  You can also 
> make an alternate version of BuildDefUseChains that is passed an 
> AnalysisManager and uses those to create the CFGs and ParentMap.
>
> Also, consider using llvm::OwningPtr to auto-reclaim 'helper' and 
> friends.  For example:
>
>  llvm::OwningPtr<DefUseBasicHelper> helperReclaim;
>  if (!helper) {
>    helper = new DefUseBasicHelper();
>    helperReclaim.reset(helper);
>  }
>
> You can then forget about the later cleanup because it will 
> automatically happen when the function returns.  This idiom is really 
> powerful when you have multiple return sites, as you can easily forget 
> to do the proper cleanup.  It also cleans up the code, as all the 
> cleanup is just boilerplate that is easily handled by the OwningPtr.
>
>
>>
>> //===------------------------- DefUseChainTest 
>> -------------------------===//
>>
>> void DefUseChainTest::HandleTopLevelDecl(DeclGroupRef declRef) {
>>     for(DeclGroupRef::iterator it = declRef.begin(); it < 
>> declRef.end(); it++){
>>         if((*it)->getKind() != Decl::Function)
>>             continue;
>
> Use 'dyn_cast' or 'isa' instead of 'getKind()'
>
>
>>         // Top-level Function declaration
>>         FunctionDecl* func_decl = cast<FunctionDecl> (*it);
>>         Stmt* func_body = func_decl->getBodyIfAvailable();
>>         if(!func_body)
>>             continue;
>>
>>         out << "--- Building Def-Use Chains for function '" << 
>> func_decl->getNameAsString() << ":" << func_decl->getNumParams() << 
>> "' ---\n";
>>         du = DefUse::BuildDefUseChains(func_body, ctx, 0, 0, 0, true);
>>         out << "------ Testing Def-Use iterators ------\n";
>>         VisitStmt(func_body);
>>         out << "**************************************************\n";
>>         delete du;
>>     }
>> }
>>
>> void defuse::PrintVarDefs(DefUse const* DU, DeclRefExpr const* DR, 
>> ASTContext& ctx, llvm::raw_ostream& out){
>>     if(DR->getDecl()->getKind() != Decl::Var && 
>> DR->getDecl()->getKind() != Decl::ParmVar)
>>         return;
>
> Use 'dyn_cast' or 'isa' instead of 'getKind()'
>
>
>>
>>     VarDecl const* VD = cast<VarDecl>(DR->getDecl());
>>     SourceLocation loc =  DR->getLocStart();
>>     out << "Variable ref '" << VD->getNameAsCString() << "' in LOC: "
>>                 << "(" << Line(loc, ctx.getSourceManager()) << "," << 
>> Column(loc, ctx.getSourceManager()) << ") -> defs_iterator:\n";
>>     try{
>>         for(DefUse::defs_iterator it = DU->defs_begin(DR); it != 
>> DU->defs_end(); it++){
>>             SourceLocation loc;
>>             if((*it)->getKind() == DefUseNode::VarRef)
>>                 loc = (*it)->getVarRef()->getLocStart();
>>             else
>>                 loc = (*it)->getVarDecl()->getTypeSpecStartLoc();
>>             out << "\t* Found DEF in loc: " << "(" << Line(loc, 
>> ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager()) 
>> << ")\n";
>>         }
>>     }catch(NotUseExcpetion e){
>>         SourceLocation loc = e.expr->getLocStart();
>>         out << "ERROR: Variable " << PrintClangStmt(e.expr, ctx) << " 
>> in location : "
>>                 << "(" << Line(loc, ctx.getSourceManager()) << "," << 
>> Column(loc, ctx.getSourceManager()) << ") is not a use for variable: 
>> " <<  VD->getNameAsCString() << "\n";
>
> Again, LLVM doesn't use C++ exceptions.  You'll need to come up with 
> an alternate mechanism.
>
>
>>     }
>> }
>> void PrintVarUses(DefUse const* DU, DefUseNode const& n, ASTContext& 
>> ctx, llvm::raw_ostream& out){
>>     VarDecl const* VD = n.getDecl();
>>     SourceLocation loc =  VD ->getTypeSpecStartLoc();
>>     if(n.getKind() == DefUseNode::VarRef)
>>         loc = n.getVarRef()->getLocStart();
>>
>>     out << "Variable ref '" << VD->getNameAsCString() << "' in LOC: "
>>             << "(" << Line(loc, ctx.getSourceManager()) << "," << 
>> Column(loc, ctx.getSourceManager()) << ") -> uses_iterator:\n";
>>     try{
>>         DefUse::uses_iterator it = DU->uses_end();
>>         if(n.getKind() == DefUseNode::VarRef)
>>             it = DU->uses_begin(n.getVarRef());
>>         else
>>             it = DU->uses_begin(VD);
>>         for(; it != DU->uses_end(); it++){
>>             SourceLocation loc =  (*it)->getLocStart();
>>             out << "\t* Found USE in loc: " << "(" << Line(loc, 
>> ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager()) 
>> << ")\n";
>>         }
>>     }catch(NotDefExcpetion e){
>>         SourceLocation loc = e.stmt->getLocStart();
>>         out << "ERROR: Variable " << PrintClangStmt(e.stmt, ctx) << " 
>> in location : "
>>                 << "(" << Line(loc, ctx.getSourceManager()) << "," << 
>> Column(loc, ctx.getSourceManager()) << ") is not a definition for 
>> variable: " << VD->getNameAsCString() << "\n";
>>     }
>
> Again, LLVM doesn't use C++ exceptions.  You'll need to come up with 
> an alternate mechanism.
>
>> }
>>
>> inline void defuse::PrintVarUses(DefUse const* DU, DeclRefExpr const* 
>> DR, ASTContext& ctx, llvm::raw_ostream& out){ ::PrintVarUses(DU, 
>> DefUseNode(DR), ctx, out); }
>> inline void defuse::PrintVarUses(DefUse const* DU, VarDecl const* VD, 
>> ASTContext& ctx, llvm::raw_ostream& out){ ::PrintVarUses(DU, 
>> DefUseNode(VD), ctx, out); }
>>
>> void defuse::DefUseChainTest::VisitDeclRefExpr(DeclRefExpr *DR){
>>     if(DR->getDecl()->getKind() == Decl::Var || 
>> DR->getDecl()->getKind() == Decl::ParmVar){
>
> Use 'dyn_cast' or 'isa' instead of 'getKind()'
>
>>         PrintVarUses(du, DR, *ctx, out);
>>         PrintVarDefs(du, DR, *ctx, out);
>>     }
>> }
>>
>> void defuse::DefUseChainTest::VisitDeclStmt(DeclStmt *DS){
>>     for(DeclStmt::decl_iterator it = DS->decl_begin(); it != 
>> DS->decl_end(); it++){
>>         if((*it)->getKind() != Decl::Var)
>>             continue;
>>
>>         VarDecl* VD = cast<VarDecl>(*it);
>>         PrintVarUses(du, VD, *ctx, out);
>>
>>         if(VD->getInit())
>>             Visit(VD->getInit());
>
> How about:
>
>   if (Stmt *S = VD->getInit())
>     Visit(S);
>
>
>>     }
>> }
>>
>> void defuse::DefUseChainTest::VisitStmt(Stmt* S){
>>     for(Stmt::child_iterator I = S->child_begin(), E = 
>> S->child_end(); I != E; ++I)
>>         if(*I) Visit(*I);
>> }
>




More information about the cfe-dev mailing list