[cfe-dev] Constructing use-def chains
Simone Pellegrini
spellegrini at dps.uibk.ac.at
Wed Sep 9 04:37:17 PDT 2009
Hi Ted,
Thanks for going through all that messy code, I read the code reviews, I
definitely agree with them, I will fix it before the end of the week and
I will send you a better version hopefully next Monday, so we can go
through another iteration.
Sorry for the 80 columns formatting and white spaces issues, I didn't
have time to fix it last week and I wanted to send you this code before
the end of the week... so I try to get lucky :)
I have a couple of questions: Should I provide patch files? or we can
continue with sending C++ files at this stage? Should I also provide
some test cases?
cheers, Simone
Ted Kremenek wrote:
>
> On Sep 4, 2009, at 8:43 AM, Simone Pellegrini wrote:
>
>> Hi again,
>> Here it is my current implementation of DefUse chain analysis. I
>> adjust the code to be compliant with some of the clang rules (I had
>> some problem in switching from std::map to llvm::DenseMap and I will
>> try to better investigate this issue in the future).
>>
>> So as this is not a patch, but a complete new feature I don't know
>> where it should be placed in the source tree (clang/Analysis seems
>> fine), so I will let u decide. The analysis is composed by 2 files
>> DefUse.h and DefUse.cpp which represent respectively the interface
>> and the implementation of the DefUse analysis. The usage of DefUse is
>> very similar to CFG. You build the all DefUse chains for a specific
>> piece of code (usually a function body) and then u start querying the
>> object.
>>
>> #include "DefUse.h"
>> using namespace defuse;
>>
>> clang::FunctionDecl* fd = ...
>> DefUse* DU = DefUse::BuildDefUseChains(fd->getBody(), astCtx);
>>
>> now if we have a reference to a variable inside the function body
>> (usually a DeclRefExpr) we can query the DefUse object to know if the
>> VarRef is a "definition" or an "use":
>>
>> DeclRefExpr* DR = ...
>> DU->isDef(DR);
>> DU->isUse(DR);
>>
>> Now if our variable is a "definition" for example we can iterate
>> among all the uses referred to the definition in the following way:
>> for(DefUse::uses_iterator I = DU->uses_begin(DR), E = DU->uses_end();
>> I != E, I++){
>> DeclRefExpr* U = *I;
>> ...
>> }
>>
>> If the variable is a use, we can iterate among all the definitions:
>> for(DefUse::defs_iterator I = DU->defs_begin(DR), E = DU->defs_end();
>> I != E, I++){
>> ...
>> }
>> here the problem is that a definition can be either a DeclRefExpr in
>> situations like:
>> -> a = s; // this is a definition for var. a
>>
>> or a VarDecl in situations like:
>> -> int a; // this is a definition for variable a
>>
>> For these reasons I have introduced a wrapper class called DefUseNode
>> (the name can be changed) that can contains VarDecls or DeclRefExprs.
>>
>> And that's the basic usage of DefUse.
>>
>> A more advanced use is also possible, I don't want to go into the
>> details of the algorithm used to compute the defuse chains, but you
>> can imagine it divided into 2 steps. In the first steps all the
>> variable references are collected and marked as definitions or uses
>> depending on the semantic of the instruction and then, in the second
>> step, this informations are elaborated in order to build the graph.
>> We can discuss about it but It comes out that marking the use of a
>> variable can depends from the context, and the semantics can be
>> different. For example if we are calling a function and the variable
>> is passed as a reference or pointer we have to assume that this is a
>> definition for the variable, but if we know the semantics of the
>> function and we are sure the variable is not modified in the function
>> body we can "risk" a little and mark the variable as an use. For this
>> reason different semantics can be defined by the user in order to
>> cover specific cases.
>>
>> This is possible by extending the DefUseBasicHelper. The basic helper
>> already implements the default semantics (a very conservative one)
>> but it exposes some virtual methods that can be extended by the user
>> who wants to implement a different behavior. The helper should be
>> passed to the BuildDefUseChain method in this way:
>>
>> DefUseCustomHelper helper;
>> DefUse* DU = DefUse::BuildDefUseChains(fd->getBody(), astCtx, &helper);
>> ...
>>
>> At last, I added a class to test the algorithm DefUseChainTest. I
>> produces a lot of output (badly formatted) but you can easily see how
>> ( and if :) ) the whole analysis works just by:
>> defuse::DefUseChainTest Consumer(llvm::outs());
>> ...
>> ParseAST(PP, Consumer, *ContextOwner, false);
>>
>> The test class is also meant to be an example of how to use the DefUse.
>>
>> N.B.: As I already stated in my previous emails it is important (when
>> and if the patch will be committed in the CLANG source tree) the
>> institution where I am working, the "University of Innsbruck", and if
>> possible my name (Simone Pellegrini) appear in the CREDITS.TXT file
>> as contributors for this type of analysis.
>>
>> have fun with DefUses, cheers Simone
>>
>> P.S.: Unfortunately next week I will be on a business trip so not
>> reachable. I will reply and address your comments/feedbacks as soon
>> as I am back in the office.
>
> Hi Simone,
>
> Sorry for the delay in replying! The overall interface for this class
> looks pretty good. I have a bunch of comments with regards to the
> implementation and the coding style that we should iterate on before
> committing back to the Clang tree. My comments are inline with the
> code below. Feel free to reply whenever you have time.
>
> As for where this code should live, I suggest:
>
> include/clang/Analysis/Analyses/DefUse.h
> lib/Analysis/DefUse.cpp
>
> My comments reflect a first pass over this patch. There is a lot
> going on here, so I have focused only on a subset of the details.
>
>>
>> //===-- clang/Analysis/DefUse.h - DefUse analysis -------*- C++ -*-===//
>> //
>> // The LLVM Compiler Infrastructure
>> //
>> // This file is distributed under the University of Illinois Open Source
>> // License. See LICENSE.TXT for details.
>> //
>> //===----------------------------------------------------------------------===//
>>
>> //
>> // This file contains the declaration of DefUse analysis headers
>> //
>> //===----------------------------------------------------------------------===//
>>
>>
>> #ifndef DEF_USE_HPP_
>> #define DEF_USE_HPP_
>
> Please use "LLVM_CLANG_DEFUSE_H" to avoid any conflict with other
> defuse implementations.
>
>> #include "llvm/ADT/DenseMap.h"
>>
>> #include "clang/AST/ASTConsumer.h"
>> #include "clang/AST/ParentMap.h"
>> #include "clang/AST/StmtVisitor.h"
>
> Looks fine.
>
>>
>> #include <exception>
>> #include <stdexcept>
>
> We don't use C++ exceptions at all in LLVM, largely for performance
> reasons.
>
>> namespace clang {
>>
>> // forward definitions
>> class CFG;
>> class CFGBlock;
>>
>> namespace defuse {
>>
>> //===------------------------- Exceptions -------------------------===//
>>
>> struct NotDefExcpetion {
>> Stmt const* stmt;
>> NotDefExcpetion(Stmt const* stmt_): stmt(stmt_){}
>> };
>> struct NotUseExcpetion {
>> DeclRefExpr const* expr;
>> NotUseExcpetion(DeclRefExpr const* expr_): expr(expr_){}
>> };
>
> These will need to be replaced with an alternate mechanism.
>
>>
>> //===------------------------- DefUseNode -------------------------===//
>> // Keeps the information for a particular 'definition' or 'use' of a
>> variable
>> // The structure is needed because a defuse chain can contains
>> variable declarations
>> // as well as variable reference. As a DeclStmt can contains several
>> declarations
>> // to address a particular variable we need to store its VarDecl or
>> DeclRefExp
>> struct DefUseNode{
>> enum NodeKind{ VarDecl, VarRef };
>> enum UseKind{ Use, Def, UseDef };
>>
>> DefUseNode(clang::VarDecl const* decl): var_decl(decl),
>> kind(VarDecl), usage(Def){}
>> DefUseNode(DeclRefExpr const* ref, UseKind u = Use):
>> var_ref(ref), kind(VarRef), usage(u){}
>>
>> clang::VarDecl const* getDecl() const;
>>
>> NodeKind const& getKind()const { return kind; }
>> UseKind const& getUse() const { return usage; }
>>
>> clang::VarDecl const* getVarDecl() const {
>> assert(kind==VarDecl); return var_decl;
>> }
>> DeclRefExpr const* getVarRef() const {
>> assert(kind==VarRef); return var_ref;
>> }
>>
>> bool operator==(DefUseNode const& n) const;
>>
>> private:
>> // a def-use node can be either a VarDecl or a DeclRefExpr
>> union{
>> clang::VarDecl const* var_decl;
>> DeclRefExpr const* var_ref;
>> };
>> NodeKind kind;
>> UseKind usage;
>> };
>
> Since this 'struct' has private data, please make it a class instead:
>
> http://llvm.org/docs/CodingStandards.html#ci_class_struct
>
>
>>
>> //===------------------------- Typedefs -------------------------===//
>> class DefUseBlock;
>> class VarDeclMap;
>> typedef std::vector<DefUseBlock> DefUseData;
>> typedef llvm::DenseMap<Stmt const*, unsigned> VarRefBlockMap;
>> typedef std::vector<DefUseNode> DefUseVect;
>> typedef std::vector<DefUseNode const*> VarRefsVect;
>>
>> class DefUseBasicHelper;
>>
>> //===------------------------- DefUse -------------------------===//
>>
>> class DefUse{
>> ASTContext const& ctx;
>> DefUseData const* analysis_data;
>> VarDeclMap const* decl_map;
>> VarRefBlockMap const* block_map;
>> unsigned const num_cfg_blocks;
>>
>> DefUse(ASTContext const& ctx_, DefUseData const* analysis_data_,
>> VarDeclMap const* decl_map_,
>> VarRefBlockMap const* block_map_, unsigned num_blocks):
>> ctx(ctx_), analysis_data(analysis_data_),
>> decl_map(decl_map_), block_map(block_map_),
>> num_cfg_blocks(num_blocks){ }
>>
>> bool isDef(DefUseNode const& n) const;
>>
>> struct iterator_impl{
>
> To make MSVC happy, this should be a 'class'.
>
>> DefUse const* du;
>> DefUseNode const* node;
>>
>> struct iter{
>> int block_id;
>> DefUseVect::const_iterator block_it;
>> iter(int blk_id): block_id(blk_id){}
>> };
>>
>> iterator_impl(): du(NULL){ }
>> iterator_impl(DefUse const* du_): du(du_){ }
>> iterator_impl& operator++() { return inc(false); }
>> iterator_impl& operator++(int) { return inc(false); }
>> virtual iterator_impl& inc(bool) = 0;
>> };
>> public:
>> class uses_iterator: public
>> std::iterator<std::input_iterator_tag, DeclRefExpr, std::ptrdiff_t,
>> DeclRefExpr const*>,
>> public iterator_impl{
>> iter iter_ptr;
>> bool inDefBlock;
>>
>> uses_iterator(): iterator_impl(), iter_ptr(-1),
>> inDefBlock(true){}
>> uses_iterator(DefUse const* du, DefUseNode const& n);
>> uses_iterator& inc(bool first);
>> friend class DefUse;
>> public:
>> virtual bool operator!=(uses_iterator const& iter);
>
> Does this need to be virtual?
>
>> virtual DeclRefExpr const* operator*();
>
> Does this need to be virtual?
>
>> };
>>
>> class defs_iterator: public
>> std::iterator<std::input_iterator_tag, DefUseNode, std::ptrdiff_t,
>> DefUseNode const*>,
>> public iterator_impl{
>> struct iter_: public iter{
>> VarRefsVect::const_iterator reaches_it;
>> iter_(int blk_id): iter(blk_id){}
>> } iter_ptr;
>> bool blockDef;
>>
>> defs_iterator(): iterator_impl(), iter_ptr(-1),
>> blockDef(false) {}
>> defs_iterator(DefUse const* du, DeclRefExpr const& n);
>> defs_iterator& inc(bool first);
>> friend class DefUse;
>> public:
>> bool operator!=(defs_iterator const& iter);
>> DefUseNode const* operator*();
>> };
>>
>> // USES //
>> defs_iterator defs_begin(DeclRefExpr const* var) const;
>> defs_iterator defs_end() const;
>
> Please add doxygen comments for these methods.
>
>>
>> // DEFS //
>> uses_iterator uses_begin(DeclRefExpr const* var) const;
>> uses_iterator uses_begin(VarDecl const* var) const;
>> uses_iterator uses_end() const;
>
> Please add doxygen comments for these methods.
>
>>
>> bool isUse(DeclRefExpr const* var) const;
>> bool isDef(DeclRefExpr const* var) const { return
>> isDef(DefUseNode(var, DefUseNode::Def)); }
>> bool isDef(VarDecl const* var) const { return
>> isDef(DefUseNode(var)); }
>>
>> ~DefUse();
>>
>> static DefUse* BuildDefUseChains(Stmt* body, ASTContext *ctx,
>> DefUseBasicHelper* helper = 0,
>> CFG* cfg = 0, ParentMap* pm =
>> 0, bool verbose = false,
>> llvm::raw_ostream& out =
>> llvm::outs());
>> };
>
> I think having a "defuse" namespace is fine (for all the special
> typedefs, etc.), but it should only contain the implementation details
> for DefUse. The DefUse class should be in the clang namespace.
>
>>
>> //===------------------------- DefUseBasicHelper
>> -------------------------===//
>
> It seems like DefUseBasicHelper allows a bunch of hooks for subclasses
> via the 'HandleXXX' methods, but this level of hooks is not going to
> fair well from a performance standpoint. I believe that one base
> class can implement most of the HandleXXX logic without making those
> methods virtual (and thus dispense with the HandleXXX methods entirely
> and just have VisitXXX methods). I think most subclasses will need
> very few hooks, if any at all. I think it's worth identifying the one
> or two methods that need to be virtual and then leave the rest of them
> as non-virtual methods.
>
> Also, if this truly is the base class, how about calling it
> 'DefUseHelper'? That seems more succinct and clear from an interface
> perspective.
>
>>
>> class DefUseBasicHelper: public StmtVisitor<DefUseBasicHelper>{
>> class DefUseBasicHelperImpl;
>> DefUseBasicHelperImpl* pimpl;
>>
>> void InitializeValues(DefUseData* data, VarRefBlockMap* bm,
>> VarDeclMap* dm);
>>
>> friend class DefUse;
>> public:
>> DefUseNode::UseKind current_use;
>> DefUseBasicHelper();
>>
>> virtual void HandleDeclRefExpr(DeclRefExpr *DR); // remember to
>> call the super class implementation of the method
>> void HandleDeclStmt(DeclStmt *DS);
>>
>> virtual void HandleBinaryOperator(BinaryOperator* B);
>> virtual void HandleConditionalOperator(ConditionalOperator* C);
>> virtual void HandleCallExpr(CallExpr* C);
>> virtual void HandleUnaryOperator(UnaryOperator* U);
>> virtual void HandleArraySubscriptExpr(ArraySubscriptExpr* AS);
>> // ...
>
> Is there a need for the '// ..' line?
>
>>
>> void VisitDeclRefExpr(DeclRefExpr *DR){ return
>> HandleDeclRefExpr(DR); }
>> void VisitDeclStmt(DeclStmt *DS){ return HandleDeclStmt(DS); }
>> void VisitBinaryOperator(BinaryOperator* B){ return
>> HandleBinaryOperator(B); }
>> void VisitConditionalOperator(ConditionalOperator* C){ return
>> HandleConditionalOperator(C); }
>> void VisitCallExpr(CallExpr* C){ return HandleCallExpr(C); }
>> void VisitUnaryOperator(UnaryOperator* U){ return
>> HandleUnaryOperator(U); }
>> void VisitArraySubscriptExpr(ArraySubscriptExpr* AS){ return
>> HandleArraySubscriptExpr(AS); }
>> // ...
>
> Is there a need for the '// ..' line?
>
>>
>> void VisitCFGBlock(clang::CFGBlock const& blk, int root);
>
> The 'root' argument is not necessary and should be removed (see
> comment below).
>
>> void VisitStmt(Stmt* S);
>>
>> virtual ~DefUseBasicHelper();
>> };
>>
>> //===------------------------- DefUseChainTest
>> -------------------------===//
>>
>> class DefUseChainTest: public StmtVisitor<DefUseChainTest>, public
>> ASTConsumer {
>> llvm::raw_ostream& out;
>> ASTContext* ctx;
>> DefUse const* du;
>>
>> public:
>> DefUseChainTest(llvm::raw_ostream& out_): out(out_), ctx(NULL),
>> du(NULL) { }
>>
>> void Initialize(ASTContext& context) { ctx = &context; }
>> void HandleTopLevelDecl(DeclGroupRef declGroupRef);
>> // void HandleTranslationUnit(clang::ASTContext& context);
>>
>> void VisitDeclRefExpr(DeclRefExpr *DR);
>> void VisitDeclStmt(DeclStmt *DS);
>> void VisitStmt(Stmt* S);
>> };
>>
>> void PrintVarDefs(DefUse const* DU, DeclRefExpr const* DR,
>> ASTContext& ctx, llvm::raw_ostream& out);
>> void PrintVarUses(DefUse const* DU, DeclRefExpr const* DR,
>> ASTContext& ctx, llvm::raw_ostream& out);
>> void PrintVarUses(DefUse const* DU, VarDecl const* VD, ASTContext&
>> ctx, llvm::raw_ostream& out);
>>
>> } // end namespace defuse
>
> The DefUseChainTest can be made entirely private in DefUse.cpp, and
> have a static function create it (and return an ASTConsumer). There
> is no need to have it in the public API interface.
>
>> } // end namespace clang
>>
>> #endif /* DEF_USE_HPP_ */
>
> Some compilers complain about the extra tokens after #endif. Please
> remove this comment.
>
>
>> //===-- clang/Analysis/DefUse.cpp - DefUse analysis -------*- C++
>> -*-===//
>> //
>> // The LLVM Compiler Infrastructure
>> //
>> // This file is distributed under the University of Illinois Open Source
>> // License. See LICENSE.TXT for details.
>> //
>> //===----------------------------------------------------------------------===//
>>
>> //
>> // This file contains the implementation of DefUse analysis
>> //
>> //===----------------------------------------------------------------------===//
>>
>>
>> #define __STDC_LIMIT_MACROS
>> #define __STDC_CONSTANT_MACROS
>
> What are these two defines used for?
>
>>
>> #include "DefUse.h"
>>
>> #include "clang/Basic/SourceManager.h"
>> #include "clang/Analysis/CFG.h"
>>
>> #include <llvm/Support/raw_ostream.h>
>>
>> #include <set>
>> #include <map>
>
> Eventually the code should probably use DenseSet and DenseMap for
> performance reasons, as std::map and std::set are very slow and make
> heavy use of malloc(). I understand that a significant amount of code
> may need to be tweaked, but the performance gain can easily be 2-3x.
>
>>
>> using namespace clang;
>> using namespace defuse;
>>
>> // ASSERT MACRO
>> #define ASSERT(COND, MSG) do{ \
>> std::string str; \
>> llvm::raw_string_ostream ss(str); \
>> ss << MSG; \
>> assert((COND) && ss.str().c_str()); }while(false)
>
> I don't think this will do what you intend. If I did:
>
> ASSERT(false, 10)
>
> I'd see:
>
> "false && ss.str().c_str()"
>
> as the failed assertion condition in the terminal output.
>
> One option to get better error reporting is to use the facilities in
> 'clang/Basic/PrettyStackTrace.h', which are used in various places in
> Clang. These can be used to print the location within the analyzed
> source that the assertion fails. In general we don't like specialized
> error reporting machinery for just one file (as in the case of
> 'ASSERT'), but like adding to the general error reporting machinery
> that all of LLVM/Clang can use.
>
> I also don't think this ASSERTION macro adds much value and causes
> some of the client code below to be structured in some slightly
> suboptimal ways.
>
>>
>> // utility functions
>> unsigned Line(SourceLocation const& l, clang::SourceManager const& sm) {
>> PresumedLoc pl = sm.getPresumedLoc(l);
>> return pl.getLine();
>> }
>>
>> unsigned Column(SourceLocation const& l, SourceManager const& sm) {
>> PresumedLoc pl = sm.getPresumedLoc(l);
>> return pl.getColumn();
>> }
>
> Instead of adding these functions, we can add 'getPresumedLoc' to
> FullSourceLoc and then use the 'getColumn()' and 'getLine()' methods
> from that class.
>
>>
>> std::string PrintClangStmt(Stmt const* stmt, ASTContext const& ctx){
>> std::string str;
>> llvm::raw_string_ostream pp(str);
>> stmt->printPretty(pp, 0, PrintingPolicy(ctx.getLangOptions()), 0);
>> return pp.str();
>> }
>
> I don't think this function is needed. If you need it, please declare
> it static to keep it private to this file.
>
>
>>
>> //===------------------------- DefUseNode -------------------------===//
>> inline bool defuse::DefUseNode::operator==(DefUseNode const& n) const{
>> if(n.usage != usage || n.kind != kind)
>> return false;
>
> Please add a space between 'if' and the '('. This is the coding style
> used throughout LLVM.
>
> Also, please use 2 space indentation like the rest of LLVM and Clang.
>
>
>> if(kind == VarDecl)
>> return var_decl == n.var_decl;
>> else
>> return var_ref == n.var_ref;
>> return false;
>> }
>
> The 'return false' is dead code. Please remove it.
>
> This can be more succinctly written as:
>
> return kind == VarDecl ? var_decl == n.var_decl : var_Ref == n.var_ref;
>
> Alternatively:
>
> if (kind == VarDecl)
> return var_decl == n.var_decl;
>
> return var_ref == n.var_ref;
>
>> inline clang::VarDecl const* DefUseNode::getDecl() const{
>> if(kind == VarRef)
>> return cast<clang::VarDecl>(var_ref->getDecl());
>> return cast<clang::VarDecl>(var_decl);
>> }
>
> Same as the previous comment.
>
> if (kind == VarRef)
> return cast<VarDecl>(var_ref->getDecl());
>
> return cast<VarDecl>(var_decl);
>
> or:
>
> return kind == VarRef ? cast<VarDecl>(var_ref->getDecl()) :
> cast<VarDecl>(var_decl);
>
>>
>> typedef std::set<DefUseNode const*> VarRefsSet;
>> typedef std::set<VarDecl const*> VarDeclsSet;
>> typedef std::map<VarDecl const*, DefUseVect> DefUseChain;
>
> Please use llvm::DenseSet and llvm::DenseMap. You'll need to use
> pointers to DefUseVects instead (and then destroy them), but the speed
> improvement will be significant.
>
>>
>> //===------------------------- DefUseBlock
>> -------------------------===//
>>
>> struct defuse::DefUseBlock: public DefUseChain{
>
> This should be 'class' instead of 'struct'. You declared it earlier
> as a 'class'. While gcc will eat this, MSVC will not. Also, since
> this isn't POD, it should be a class anyway.
>
>
>> VarDeclsSet uses;
>> VarRefsVect defsout;
>> VarDeclsSet killed;
>> VarRefsVect reaches;
>>
>> class not_killed_set_iterator: public
>> std::iterator<std::input_iterator_tag, DefUseNode, std::ptrdiff_t,
>> DefUseNode const*>{
>> DefUseBlock& block;
>> VarRefsVect::const_iterator ptr_it;
>>
>> not_killed_set_iterator& inc(bool first){
>> if(ptr_it != block.reaches.end() && !first)
>> ptr_it++;
>
> Please use '++ptr_it' instead of 'ptr_it++'. While it makes no
> difference in this case, general speaking it performs better and there
> is no reason to make any assumptions about the underlying data
> structure and how the iterators are implemented.
>
>> while(ptr_it != block.reaches.end() &&
>> (block.killed.find((*ptr_it)->getDecl()) !=
>> block.killed.end() &&
>> std::find(block.defsout.begin(),
>> block.defsout.end(), *ptr_it) == block.defsout.end()))
>> ptr_it++;
>> return *this;
>> }
>> public:
>> not_killed_set_iterator(DefUseBlock& block_,
>> VarRefsVect::const_iterator ptr_it_): block(block_), ptr_it(ptr_it_){
>> inc(true); }
>> not_killed_set_iterator& operator++() { return inc(false); }
>> not_killed_set_iterator& operator++(int) { return inc(false); }
>
> Do we even need an operator++(int)? By disallowing it people won't
> accidentally use it. This also isn't the standard semantics of this
> operator, so by implementing it this way clients will silently be
> doing the wrong thing instead of getting a compilation error.
>
>> bool operator!=(not_killed_set_iterator const& iter) {
>> return !((&iter.block) == &(block) && iter.ptr_it == ptr_it); }
>
> This method can (and should) be declared const.
>
>> const DefUseNode* const& operator*() const { return *ptr_it; }
>> };
>>
>> not_killed_set_iterator begin_not_killed_set(){ return
>> DefUseBlock::not_killed_set_iterator(*this, reaches.begin()); }
>> not_killed_set_iterator end_not_killed_set(){ return
>> DefUseBlock::not_killed_set_iterator(*this, reaches.end()); }
>
> Please add a space between () and {
>
>> };
>>
>> //===------------------------- VarDeclMap -------------------------===//
>>
>> struct defuse::VarDeclMap: public StmtVisitor<VarDeclMap>, public
>> llvm::DenseMap<VarDecl const*, DeclStmt const*> {
>
> LLVM has an 80 column coding convention:
>
> http://llvm.org/docs/CodingStandards.html#scf_codewidth
>
> Please wrap lines that exceed this length.
>
>>
>> VarDeclMap(Stmt const* body){ VisitStmt(const_cast<Stmt*>(body)); }
>>
>> void VisitDeclStmt(DeclStmt *DS){
>> for(DeclStmt::decl_iterator it = DS->decl_begin(); it !=
>> DS->decl_end(); it++){
>> if((*it)->getKind() != Decl::Var)
>> continue;
>
> Use dyn_cast instead of getKind.
>
>> insert(std::make_pair(cast<VarDecl>(*it), DS));
>> }
>> }
>> void VisitStmt(Stmt* S){
>> for(Stmt::child_iterator I = S->child_begin(), E =
>> S->child_end(); I != E; ++I)
>> if(*I) Visit(*I);
>> }
>> };
>
> Please make this is a class, as the earlier declaration was also a class.
>
> Please also add a space between 'for' and '('. This is the LLVM
> coding convention.
>
>>
>> //===------------------------- DefUseBasicHelper
>> -------------------------===//
>>
>> struct defuse::DefUseBasicHelper::DefUseBasicHelperImpl{
>> DefUseData* data;
>> VarRefBlockMap* bm;
>> VarDeclMap* dm;
>> unsigned blockID;
>>
>> DefUseBasicHelperImpl(): data(NULL), bm(NULL), dm(NULL){}
>> };
>> defuse::DefUseBasicHelper::DefUseBasicHelper(): pimpl(new
>> DefUseBasicHelperImpl), current_use(DefUseNode::Use){}
>> defuse::DefUseBasicHelper::~DefUseBasicHelper(){ delete pimpl; }
>>
>> void defuse::DefUseBasicHelper::InitializeValues(DefUseData* data,
>> VarRefBlockMap* bm, VarDeclMap* dm){
>> pimpl->data = data; pimpl->bm = bm; pimpl->dm = dm;
>> }
>>
>> void defuse::DefUseBasicHelper::HandleDeclRefExpr(DeclRefExpr *DR){
>> VarRefBlockMap& bm = *pimpl->bm;
>> unsigned blockID = pimpl->blockID;
>> DefUseBlock& block_data = (*pimpl->data)[blockID];
>> if(DR->getDecl()->getKind() == Decl::Var ||
>> DR->getDecl()->getKind() == Decl::ParmVar){
>
> The condition is overly verbose. We also discourage the use of
> 'getKind' unless you are using it in a switch statement.
>
> Suggestion:
>
> Decl *D = DR->getDecl();
> if (isa<VarDecl>(D)) {
> ...
> }
>
> isa<> uses the value returned from VarDecl::classof() to determine if
> 'D' is a VarDecl.
>
>> VarDecl* VD = cast<VarDecl>(DR->getDecl());
>
> Actually, instead of using isa, here is a better style:
>
> if (VarDecl *VD = dyn_cast<VarDecl>(DR->getDecl()) {
> ...
> }
>
> The dyn_cast combines 'isa' with 'cast', and returns a NULL pointer if
> isa would return false. This is the encouraged style.
>
>
>> // map this variable to the current block
>> bm[DR] = blockID;
>> if(block_data.find(VD) == block_data.end()){
>> block_data.insert(std::make_pair(VD, DefUseVect()));
>> block_data.uses.insert(VD); // this variable is used in
>> the current block and it has
>> // no prior definition
>> within the block
>
> If you end up using a DenseMap, this code can be made faster by
> avoiding the double lookup (the first for block_data.find() and the
> second for the insertion):
>
> DefUseBlock *&B = block_data[VD];
> if (!B) {
> B = new DefUseBlock();
> B->users.insert(VD);
> }
>
> Using a DenseMap (which involves changing the data value to a
> pointer), the first line involves a quick hashtable lookup and creates
> a NULL pointer entry if the key could not be found. The code that
> follows conjures up the new DefUseBlock without having to do a second
> search of the map.
>
>> }
>> if(current_use == DefUseNode::UseDef){ // in case of compound
>> operators
>> block_data[VD].push_back(DefUseNode(DR, DefUseNode::Use));
>> block_data[VD].push_back(DefUseNode(DR, DefUseNode::Def));
>
> These lines cause two identical (and redundant) lookups to
> block_data. Consider:
>
> DefUseBlock &B = block_data[VD];
> B.push_back(...);
> B.push_back(...);
>
>> }else
>> block_data[VD].push_back(DefUseNode(DR, current_use));
>> }
>> }
>
> Please add spaces between 'if' and '(' and 'else' and '{'.
>
>>
>> void defuse::DefUseBasicHelper::HandleDeclStmt(DeclStmt *DS){
>> for(DeclStmt::decl_iterator I = DS->decl_begin(), E =
>> DS->decl_end(); I != E; I++){
>> if((*I)->getKind() != Decl::Var)
>> continue;
>> VarRefBlockMap& bm = *pimpl->bm;
>> VarDeclMap& dm = *pimpl->dm;
>> unsigned blockID = pimpl->blockID;
>> DefUseBlock& block_data = (*pimpl->data)[blockID];
>>
>> VarDecl* VD = cast<VarDecl>(*I);
>> if(block_data.find(VD) != block_data.end())
>> ASSERT(true, "Variable '" << VD->getNameAsString() << "'
>> redefined in the same block");
>
> Why not do:
>
> assert(block_data.find(VD) != block_data.end() && "Variable
> redefined in the same block");
>
> It is far more succinct, doesn't add more code in a Release build (as
> ASSERT does), and if we really care what VD is we can run this code in
> the debugger (which we would likely need to do anyway). It also
> reduces the number of lookups in the map in a Release build from two
> to one.
>
>> void defuse::DefUseBasicHelper::HandleBinaryOperator(BinaryOperator* B){
>> DefUseNode::UseKind backup = current_use; // backup the usage
>> if(B->isAssignmentOp()){
>> current_use = DefUseNode::Use;
>> Visit(B->getRHS());
>> current_use = DefUseNode::Def;
>> if(B->isCompoundAssignmentOp())
>> current_use = DefUseNode::UseDef;
>> Visit(B->getLHS());
>> }else{
>> Visit(B->getLHS());
>> current_use = DefUseNode::Use;
>> Visit(B->getRHS());
>> }
>> current_use = backup; // write back the usage to the current usage
>> }
>>
>> void
>> defuse::DefUseBasicHelper::HandleConditionalOperator(ConditionalOperator*
>> C){ }
>>
>> void defuse::DefUseBasicHelper::HandleCallExpr(CallExpr* C) {
>> DefUseNode::UseKind backup = current_use; // backup the usage
>> for(CallExpr::arg_iterator I = C->arg_begin(), E = C->arg_end();
>> I != E; ++I){
>> if((*I)->getType()->isPointerType() ||
>> (*I)->getType()->isReferenceType())
>> current_use = DefUseNode::Def;
>> Visit(*I);
>> current_use = backup;
>> }
>> }
>>
>> void defuse::DefUseBasicHelper::HandleUnaryOperator(UnaryOperator* U){
>> DefUseNode::UseKind backup = current_use; // backup the usage
>> switch(U->getOpcode()){
>> case UnaryOperator::PostInc:
>> case UnaryOperator::PostDec:
>> case UnaryOperator::PreInc:
>> case UnaryOperator::PreDec:
>> current_use = DefUseNode::UseDef;
>> break;
>> case UnaryOperator::Plus:
>> case UnaryOperator::Minus:
>> case UnaryOperator::Not:
>> case UnaryOperator::LNot:
>> current_use = DefUseNode::Use;
>> break;
>> case UnaryOperator::AddrOf:
>> case UnaryOperator::Deref:
>> // use the current_use
>> break;
>> default:
>> // DEBUG("Operator " <<
>> UnaryOperator::getOpcodeStr(U->getOpcode()) << " not supported in
>> def-use analysis");
>> break;
>> }
>> Visit(U->getSubExpr());
>> current_use = backup; // write back the usage to the current usage
>> }
>>
>> void
>> defuse::DefUseBasicHelper::HandleArraySubscriptExpr(ArraySubscriptExpr*
>> AS){
>> DefUseNode::UseKind backup = current_use; // backup the usage
>> Visit(AS->getBase());
>> current_use = DefUseNode::Use;
>> Visit(AS->getIdx());
>> current_use = backup; // write back the usage to the current usage
>> }
>>
>> void defuse::DefUseBasicHelper::VisitCFGBlock(CFGBlock const& blk,
>> int root){
>> pimpl->blockID = blk.getBlockID();
>> for(CFGBlock::const_iterator I = blk.begin(), E = blk.end(); I !=
>> E; I++)
>> Visit(*I);
>>
>> unsigned blockID = pimpl->blockID;
>> DefUseBlock& block_data = (*pimpl->data)[blockID];
>>
>> // Post processing
>> // build up the uses(b), defsout(b) and killed(b)
>>
>> // for each variable
>> for(DefUseBlock::iterator I = block_data.begin(), E =
>> block_data.end(); I != E; ++I){
>> // in case of references to ParmVar what we do is insert the
>> variable definition in the first block of the CFG
>> DefUseBlock& root_data = (*pimpl->data)[root];
>> if(I->first->getKind() == Decl::ParmVar &&
>> std::find(root_data[I->first].begin(),
>> root_data[I->first].end(), DefUseNode(I->first)) ==
>> root_data[I->first].end()){
>> root_data[I->first].push_back(DefUseNode(I->first));
>> root_data.defsout.push_back(&root_data[I->first].back());
>> }
>> // if this is a local variable, should't be in the
>> defsout,killed sets
>> // defsout is calculated by iterating backwards to find the
>> last use of the variable
>> for(DefUseVect::const_reverse_iterator UI =
>> I->second.rbegin(), UE = I->second.rend(); UI != UE; UI++)
>> if(UI->getUse() == DefUseNode::Def &&
>> (block_data.defsout.empty() || block_data.defsout.back()->getDecl()
>> != I->first)){
>> block_data.defsout.push_back(&(*UI));
>> block_data.killed.insert(UI->getDecl());
>> break;
>> }
>> }
>> }
>>
>> void defuse::DefUseBasicHelper::VisitStmt(Stmt* S){
>> for(Stmt::child_iterator I = S->child_begin(), E =
>> S->child_end(); I != E; ++I)
>> if(*I) Visit(*I);
>> }
>>
>> bool ContainsStmt(Stmt const* block, Stmt const* stmt){
>> if(block == stmt)
>> return true;
>> for(Stmt::const_child_iterator I = block->child_begin(), E =
>> block->child_end(); I != E; ++I)
>> if(*I){
>> if(ContainsStmt(*I, stmt)) return true;
>> }
>
> How about:
>
> if (*I && ContainsStmt(*I, stmt))
> return true;
>
>> return false;
>> }
>>
>> Stmt const* GetEnclosingBlock(ParentMap const& pm, Stmt const* stmt){
>> Stmt const* E = stmt;
>> while(E){
>> Stmt const* cond = NULL;
>> switch(E->getStmtClass()){
>> case Stmt::IfStmtClass:
>> cond = cast<IfStmt>(E)->getCond();
>> break;
>> case Stmt::WhileStmtClass:
>> cond = cast<WhileStmt>(E)->getCond();
>> break;
>> case Stmt::DoStmtClass:
>> cond = cast<DoStmt>(E)->getCond();
>> break;
>> case Stmt::ForStmtClass:
>> cond = cast<ForStmt>(E)->getCond();
>> break;
>> case Stmt::SwitchStmtClass:
>> cond = cast<SwitchStmt>(E)->getCond();
>> break;
>> case Stmt::CompoundStmtClass:
>> return E;
>> default:
>> break;
>> }
>> if(cond && !ContainsStmt(cond, stmt))
>> return E;
>> E = pm.getParent(E);
>> }
>> return NULL;
>> }
>>
>> void RemoveLocalDefs(CFG const& cfg, ParentMap const& pm, VarDeclMap&
>> decl_vars, DefUseData& data){
>> for(CFG::const_reverse_iterator it = cfg.rbegin(), cend =
>> cfg.rend(); it != cend; it++) {
>> // every time we find a node with more than 1 predecessor it
>> means
>> // we are at a junction point and the local variable defined
>> in the previuos
>> // blocks must be removed from the defsout set
>> if(it->pred_size() <= 1)
>> continue;
>> // for each predecessor
>> for(CFGBlock::const_pred_iterator pit = it->pred_begin(), end
>> = it->pred_end(); pit != end && (*pit)->size(); pit++){
>> DefUseBlock& block_data = data[(*pit)->getBlockID()];
>> for(VarRefsVect::iterator dit =
>> block_data.defsout.begin(); dit != block_data.defsout.end(); dit++){
>> // we have to be sure the element we are using to see
>> if the Decl is in the same block of block statements
>> // is also present in the AST, so if the first
>> statement of the CFG block is a VarDecl, we use the VarDeclMap
>> // to retrieve the corresponding AST node
>> Stmt const* testStmt = (*pit)->front();
>> if(testStmt->getStmtClass() == Stmt::DeclStmtClass)
>> testStmt =
>> decl_vars[cast<VarDecl>(cast<DeclStmt>(testStmt)->getSingleDecl())];
>
> Use 'dyn_cast' or 'isa' instead of 'getStmtClass()'
>
>
>> // if the variable is local to one of the prec blocks
>> and we are at a junction point
>> // it means the variable must be removed from the
>> defsout set
>> if(GetEnclosingBlock(pm,
>> decl_vars[(*dit)->getDecl()]) == GetEnclosingBlock(pm, testStmt)){
>> // it can be the case the following block is in
>> the same scope of the previous block (for example if stmt without else)
>> Stmt const* testStmt = it->front();
>> if(testStmt->getStmtClass() == Stmt::DeclStmtClass)
>> testStmt =
>> decl_vars[cast<VarDecl>(cast<DeclStmt>(testStmt)->getSingleDecl())];
>
> Use 'dyn_cast' or 'isa' instead of 'getStmtClass()'
>
>
>> if(GetEnclosingBlock(pm,
>> decl_vars[(*dit)->getDecl()]) != GetEnclosingBlock(pm, testStmt)){
>> block_data.defsout.erase(dit--);
>> }
>> }
>> }
>> }
>> }
>> }
>>
>> // implements reaching definitions according to Kennedy's book
>> iterative algorithm
>> void ComputeReaches(CFG const& cfg, DefUseData& data){
>> bool changed = true;
>> while(changed){
>> changed = false;
>> VarRefsSet newreaches;
>> for(CFG::const_reverse_iterator it = cfg.rbegin(); it !=
>> cfg.rend(); it++) {
>> DefUseBlock& BlockData = data[it->getBlockID()];
>> newreaches = VarRefsSet();
>> for(CFGBlock::const_pred_iterator pit = it->pred_begin(),
>> end = it->pred_end(); pit != end; pit++){
>> unsigned p = (*pit)->getBlockID();
>> // s(p) = reaches(p) - killed(p)
>> VarRefsSet s, temp_set, temp_set_1;
>> copy(data[p].begin_not_killed_set(),
>> data[p].end_not_killed_set(), inserter(s, s.begin()));
>> // tmp_set(p) = defsout(p) U s(p)
>> set_union(s.begin(), s.end(),
>> data[p].defsout.begin(), data[p].defsout.end(), inserter(temp_set,
>> temp_set.begin()));
>> // tmp_set_1 = newreaches U tmp_set(p)
>> set_union(newreaches.begin(), newreaches.end(),
>> temp_set.begin(), temp_set.end(), inserter(temp_set_1,
>> temp_set_1.begin()));
>> newreaches.swap(temp_set_1);
>> }
>> if(BlockData.reaches.size() != newreaches.size()){
>> BlockData.reaches = VarRefsVect(newreaches.begin(),
>> newreaches.end());
>> changed = true;
>> }
>> }
>> }
>> }
>>
>> //===------------------------- Printing utilities
>> -------------------------===//
>>
>> struct printer: public std::iterator<std::input_iterator_tag,
>> printer, std::ptrdiff_t, printer>{
>> printer(llvm::raw_ostream& x, ASTContext const& ctx_, const char*
>> s = "") : o(x), ctx(ctx_), delim(s) { }
>> template<typename T>
>> printer& operator=(const T& x);
>> printer& operator*() { return *this; }
>> printer& operator++() { return *this; }
>> printer& operator++(int) { return *this; }
>>
>> mutable llvm::raw_ostream& o;
>> ASTContext const& ctx;
>> const char* delim;
>> };
>>
>> template<>
>> printer& printer::operator=<VarDecl const*>(const VarDecl* const& x){
>> o << x->getNameAsString() << delim;
>> return *this;
>> }
>>
>> template<>
>> printer& printer::operator=<DefUseNode const*>(const DefUseNode*
>> const& x){
>> SourceLocation loc;
>> if(x->getKind() == DefUseNode::VarRef)
>> loc = x->getVarRef()->getLocStart();
>> else
>> loc = x->getVarDecl()->getTypeSpecStartLoc();
>> o << x->getDecl()->getNameAsString() << " " << "(" << Line(loc,
>> ctx.getSourceManager()) << ","
>> << Column(loc, ctx.getSourceManager()) << " "
>> << ((x->getUse() == DefUseNode::Use)?"USE":"DEF") << ")" <<
>> delim;
>> return *this;
>> }
>>
>> template<>
>> printer& printer::operator=<DefUseNode>(DefUseNode const& x){ return
>> this->operator=(&x); }
>>
>> void printDefUseData(llvm::raw_ostream& out, ASTContext const& ctx,
>> DefUseData const& data){
>> out << "--------------- PRINTING ANALYSIS DATA
>> ----------------------\n";
>> unsigned blockID = data.size() - 1;
>> for(DefUseData::const_reverse_iterator it = data.rbegin(); it !=
>> data.rend(); it++, blockID--){
>> out << "* BlockID '" << blockID << "' uses vars:\n";
>> for(DefUseBlock::const_iterator pit = it->begin(); pit !=
>> it->end(); pit++){
>> out << "'" << pit->first->getNameAsString() << "':\n\t";
>> std::copy( pit->second.begin(), pit->second.end(),
>> printer( out, ctx, ", " ) );
>> out << "\n";
>> }
>> }
>>
>> blockID = data.size() - 1;
>> for(DefUseData::const_reverse_iterator it = data.rbegin(); it !=
>> data.rend(); it++, blockID--) {
>> DefUseBlock const& BlockData = *it;
>> out << "---------------- Block: " << blockID << "
>> ----------------\n";
>> out << "# uses(" << blockID << "): {";
>> copy( BlockData.uses.begin(), BlockData.uses.end(), printer(
>> out, ctx, ", " ) );
>> out << "}\n";
>> out << "# defsout(" << blockID << "): {";
>> copy( BlockData.defsout.begin(), BlockData.defsout.end(),
>> printer( out, ctx, ", " ) );
>> out << "}\n";
>> out << "# killed(" << blockID << "): {";
>> copy( BlockData.killed.begin(), BlockData.killed.end(),
>> printer( out, ctx, ", " ) );
>> out << "}\n";
>> out << "# reaches(" << blockID << "): {";
>> copy( BlockData.reaches.begin(), BlockData.reaches.end(),
>> printer( out, ctx, ", " ) );
>> out << "}\n";
>> }
>> }
>>
>> //===------------------------- DefUse -------------------------===//
>>
>> DefUse::~DefUse(){ delete analysis_data; delete decl_map; delete
>> block_map; }
>>
>> bool isDef(ASTContext const& ctx, DefUseNode const& n, int* blockID,
>> DefUseVect::const_iterator* iter,
>> DefUseData const* analysis_data, VarDeclMap const* decl_map,
>> VarRefBlockMap const* block_map, Stmt const** stmt = NULL){
>> // retrieve the blockID of the CFG where the statement N (which
>> can be wither a VarDecl or a DeclRefExpr) is placed
>> Stmt const* _stmt_;
>> if(n.getKind() == DefUseNode::VarDecl){
>> VarDecl const* vd = n.getVarDecl();
>> VarDeclMap::const_iterator it = decl_map->find(vd);
>> // we use the decl_map to translate the VarDecl to the
>> corresponding DeclStmt
>> // this is useful when DeclStmts from the CFG (which usually
>> are different from the original ones) are used
>> ASSERT(it != decl_map->end(), "Variable declaration for '" <<
>> vd->getNameAsString() << "' not mapped to any statement in the syntax
>> tree");
>> _stmt_ = it->second;
>> }else
>> _stmt_ = n.getVarRef();
>> assert(_stmt_ != NULL);
>> VarRefBlockMap::const_iterator it = block_map->find(_stmt_);
>> ASSERT(it != block_map->end(), "Statement '" <<
>> PrintClangStmt(_stmt_, ctx) << "' not mapped to any block of the CFG");
>>
>> // We have to find the definition inside the block
>> DefUseBlock::const_iterator data_it =
>> (*analysis_data)[it->second].find(n.getDecl());
>> // if we don't find the variable among the variable used or
>> defined in the block it means something went wrong
>> ASSERT(data_it != (*analysis_data)[it->second].end(), "Block ID:
>> " << it->second << " doesn't contains usage or definitions of
>> variable " << n.getDecl()->getNameAsString());
>>
>> // we find a Variable definition, now we have to iterate through
>> the DefUseNodes in order to find the
>> // the node corresponding to the n variable passed by the user
>> DefUseVect::const_iterator _iter_ =
>> std::find(data_it->second.begin(), data_it->second.end(), n);
>> if(blockID) *blockID = it->second;
>> if(iter) *iter = _iter_;
>> if(stmt) *stmt = _stmt_;
>> return _iter_ != data_it->second.end();
>> }
>>
>> bool isUse(ASTContext const& ctx, clang::DeclRefExpr const* var, int*
>> blockID, DefUseVect::const_iterator* iter,
>> DefUseData const* analysis_data, VarRefBlockMap const*
>> block_map){
>> Stmt const* stmt = var;
>> VarRefBlockMap::const_iterator it = block_map->find(stmt);
>> ASSERT(it != block_map->end(), "Statement " <<
>> PrintClangStmt(stmt, ctx) << " not mapped to any block of the CFG!");
>>
>> DefUseBlock::const_iterator data_it =
>> (*analysis_data)[it->second].find(cast<VarDecl>(var->getDecl()));
>> ASSERT(data_it != (*analysis_data)[it->second].end(),
>> "Block ID: " << it->second << " doesn't contains usage or
>> definitions of variable " << var->getDecl()->getNameAsString());
>>
>> DefUseVect::const_iterator _iter_ =
>> std::find(data_it->second.begin(), data_it->second.end(),
>> DefUseNode(var, DefUseNode::Use));
>> if(blockID) *blockID = it->second;
>> if(iter) *iter = _iter_;
>> return _iter_ != data_it->second.end();
>> }
>>
>> //===------------------------- uses_iterator
>> -------------------------===//
>>
>> DefUse::uses_iterator::uses_iterator(DefUse const* du, DefUseNode
>> const& n):
>> DefUse::iterator_impl(du), iter_ptr(-1), inDefBlock(true)
>> {
>> Stmt const* stmt = NULL;
>> if(!::isDef(du->ctx, n, &iter_ptr.block_id, &iter_ptr.block_it,
>> du->analysis_data, du->decl_map, du->block_map, &stmt))
>> throw NotDefExcpetion(stmt); // there is no definition
>> for the DeclRefExpr passed by the user
>> // this means he trying to
>> list the uses of a declrefexpr which is a use!
>> // an excpetion is thrown.
>> node = &(*(iter_ptr.block_it));
>> inc(true);
>> }
>>
>> bool DefUse::uses_iterator::operator!=(uses_iterator const& iter){
>> return !(iter_ptr.block_id == iter.iter_ptr.block_id &&
>> iter.iter_ptr.block_it == iter.iter_ptr.block_it);
>> }
>>
>> DeclRefExpr const* DefUse::uses_iterator::operator*() {
>> assert(iter_ptr.block_it->getKind() == DefUseNode::VarRef);
>> return iter_ptr.block_it->getVarRef();
>> }
>>
>> DefUse::uses_iterator& DefUse::uses_iterator::inc(bool first){
>> if(iter_ptr.block_id == -1) return *this;
>> DefUseBlock const& m = (*du->analysis_data)[iter_ptr.block_id];
>> DefUseBlock::const_iterator it = m.find(node->getDecl());
>> assert(it != m.end());
>> if(iter_ptr.block_it != it->second.end())
>> iter_ptr.block_it++;
>>
>> if(iter_ptr.block_it != it->second.end() &&
>> iter_ptr.block_it->getDecl() == node->getDecl()){
>> if(iter_ptr.block_it->getUse() == DefUseNode::Use)
>> return *this; // this is a use of the variable
>> else if(inDefBlock){
>> // found a new definition, stop iterating
>> iter_ptr.block_it = DefUseVect::const_iterator();
>> iter_ptr.block_id = -1;
>> return *this;
>> }
>> }
>> // if variable in defsout we have to check successor blocks
>> if(inDefBlock && std::find(m.defsout.begin(), m.defsout.end(),
>> node) == m.defsout.end()){
>> iter_ptr.block_it = DefUseVect::const_iterator();
>> iter_ptr.block_id = -1; // it means there are no other uses
>> of this variable
>> return *this;
>> }
>> // We have to change block
>> // the following code has the precondition the blocks are
>> numbered in a decrescent order
>> // we lock into the block_id and see if the decl is in the
>> reaches() set or not
>> int new_block = iter_ptr.block_id;
>> while(--new_block >= 0){
>> DefUseBlock const& m = (*du->analysis_data)[new_block];
>> if(std::find(m.reaches.begin(), m.reaches.end(), node) !=
>> m.reaches.end()){
>> inDefBlock = false;
>> DefUseBlock::const_iterator data_it =
>> (*du->analysis_data)[new_block].find(node->getDecl());
>> if(data_it != (*du->analysis_data)[new_block].end())
>> if(data_it->second.begin()->getUse() ==
>> DefUseNode::Use){
>> iter_ptr.block_id = new_block;
>> iter_ptr.block_it = data_it->second.begin();
>> return *this;
>> }
>> }
>> }
>> iter_ptr.block_it = DefUseVect::const_iterator();
>> iter_ptr.block_id = -1;
>> return *this;
>> }
>>
>> //===------------------------- defs_iterator
>> -------------------------===//
>>
>> DefUse::defs_iterator::defs_iterator(DefUse const* du, DeclRefExpr
>> const& n):
>> iterator_impl(du), iter_ptr(-1), blockDef(false)
>> {
>> if(!::isUse(du->ctx, &n, &iter_ptr.block_id, &iter_ptr.block_it,
>> du->analysis_data, du->block_map))
>> throw NotUseExcpetion(&n); // the DeclRefExpr passed by the
>> user is never used in this block but only defined,
>> // the user is trying to iterate
>> through the definitions of a definition... which doesn't make sense
>> // an exception must be thrown
>> iter_ptr.reaches_it =
>> (*du->analysis_data)[iter_ptr.block_id].reaches.begin();
>> node = &(*(iter_ptr.block_it));
>> inc(true);
>> }
>> bool DefUse::defs_iterator::operator!=(defs_iterator const& iter){
>> return !(iter_ptr.block_id == iter.iter_ptr.block_id &&
>> iter.iter_ptr.block_it == iter.iter_ptr.block_it &&
>> iter.iter_ptr.reaches_it == iter_ptr.reaches_it);
>> }
>>
>> DefUseNode const* DefUse::defs_iterator::operator*() {
>> DefUseBlock::const_iterator it =
>> (*du->analysis_data)[iter_ptr.block_id].find(node->getDecl());
>> assert(it != (*du->analysis_data)[iter_ptr.block_id].end());
>> if(blockDef)
>> return &(*iter_ptr.block_it);
>> return *iter_ptr.reaches_it;
>> }
>>
>> DefUse::defs_iterator::defs_iterator& DefUse::defs_iterator::inc(bool
>> first){
>> if(iter_ptr.block_id == -1) return *this;
>> // look inside reaches vector
>> DefUseBlock::const_iterator it =
>> (*du->analysis_data)[iter_ptr.block_id].find(node->getDecl());
>> assert(it != (*du->analysis_data)[iter_ptr.block_id].end());
>> if(first){
>> // we have to find a def in the block, if exist
>> for(; iter_ptr.block_it >= it->second.begin();
>> iter_ptr.block_it--)
>> if(iter_ptr.block_it->getUse() == DefUseNode::Def){
>> blockDef = true;
>> break;
>> }
>> // the def is in the block, so it means we don't have to look
>> any further
>> // next time the iterator is incremented we return an empty one
>> if(blockDef)
>> return *this;
>>
>> }else if(blockDef){
>> iter_ptr.block_it = DefUseVect::const_iterator();
>> iter_ptr.reaches_it = VarRefsVect::const_iterator();
>> iter_ptr.block_id = -1;
>> return *this;
>> }
>> // we didn't find a def in the block, so we have to look at the
>> reaches set
>> VarRefsVect const& reaches_v =
>> (*du->analysis_data)[iter_ptr.block_id].reaches;
>> if(!first && iter_ptr.reaches_it != reaches_v.end())
>> iter_ptr.reaches_it++;
>> for(; iter_ptr.reaches_it != reaches_v.end(); iter_ptr.reaches_it++)
>> if((*iter_ptr.reaches_it)->getDecl() == node->getDecl())
>> return *this;
>>
>> iter_ptr.block_it = DefUseVect::const_iterator();
>> iter_ptr.reaches_it = VarRefsVect::const_iterator();
>> iter_ptr.block_id = -1;
>> return *this;
>> }
>>
>> DefUse::uses_iterator defuse::DefUse::uses_begin(clang::DeclRefExpr
>> const* var) const{
>> return DefUse::uses_iterator(this, DefUseNode(var,
>> DefUseNode::Def));
>> }
>> DefUse::uses_iterator defuse::DefUse::uses_begin(clang::VarDecl
>> const* var) const{
>> return DefUse::uses_iterator(this, DefUseNode(var));
>> }
>> DefUse::uses_iterator defuse::DefUse::uses_end() const{ return
>> DefUse::uses_iterator(); }
>>
>> DefUse::defs_iterator defuse::DefUse::defs_begin(clang::DeclRefExpr
>> const* var) const{
>> return DefUse::defs_iterator(this, *var);
>> }
>> DefUse::defs_iterator defuse::DefUse::defs_end() const{ return
>> DefUse::defs_iterator(); }
>>
>> bool defuse::DefUse::isUse(clang::DeclRefExpr const* var) const {
>> return ::isUse(ctx, var, NULL, NULL, analysis_data, block_map); }
>>
>> bool defuse::DefUse::isDef(DefUseNode const& n) const { return
>> ::isDef(ctx, n, NULL, NULL, analysis_data, decl_map, block_map); }
>>
>> //===------------------------- BuildDefUseChains
>> -------------------------===//
>>
>> DefUse* defuse::DefUse::BuildDefUseChains(Stmt* body, ASTContext
>> *ctx, DefUseBasicHelper* helper,
>> CFG* cfg, ParentMap* pm, bool verbose, llvm::raw_ostream& out)
>> {
>> bool cleanCFG = false, cleanParentMap = false, cleanHelper = false;
>> if(!cfg){
>> cfg = CFG::buildCFG(body, ctx);
>> cleanCFG = true;
>> }
>> if(!pm){
>> pm = new ParentMap(body);
>> cleanParentMap = true;
>> }
>> DefUseData* data = new DefUseData(cfg->getNumBlockIDs());
>>
>> VarDeclMap* dm = new VarDeclMap(body); // map VarDecl to AST
>> nodes - this is needed as when the CFG is created by CLANG
>> // DeclStmt with several
>> declarations are rewritten into separate DeclStmts
>>
>> VarRefBlockMap* bm = new VarRefBlockMap; // map each VarRef
>> to the CFG block where it has been used
>>
>> if(verbose)
>> cfg->dump(ctx->getLangOptions());
>>
>> if(!helper){
>> helper = new DefUseBasicHelper;
>> cleanHelper = true;
>> }
>> helper->InitializeValues(data, bm, dm);
>>
>> for(CFG::const_iterator it = cfg->begin(); it != cfg->end(); it++)
>> helper->VisitCFGBlock(*it, cfg->getNumBlockIDs()-1);
>
> You should not rely on the root being number 'getNumBlockIDs() - 1'.
> This actually is not an invariant you should depend on. If you want
> to identify the entry block, use 'CFG::getEntry()'.
>
>>
>> // remove from defsout local variables
>> RemoveLocalDefs(*cfg, *pm, *dm, *data);
>> // Calculates the reaches set for each block of the CFG
>> ComputeReaches(*cfg, *data);
>>
>> if(verbose)
>> printDefUseData(out, *ctx, *data);
>>
>> unsigned block_id = cfg->getNumBlockIDs();
>> if(cfg && cleanCFG)
>> delete cfg;
>> if(pm && cleanParentMap)
>> delete pm;
>> if(helper && cleanHelper)
>> delete helper;
>> return new DefUse(*ctx, data, dm, bm, block_id);
>> }
>
> Instead of BuildDefUseChains building its own CFG and ParentMap, have
> these be passed in from the client. This allows reuse of existing
> data that has likely been constructed for other uses. You can also
> make an alternate version of BuildDefUseChains that is passed an
> AnalysisManager and uses those to create the CFGs and ParentMap.
>
> Also, consider using llvm::OwningPtr to auto-reclaim 'helper' and
> friends. For example:
>
> llvm::OwningPtr<DefUseBasicHelper> helperReclaim;
> if (!helper) {
> helper = new DefUseBasicHelper();
> helperReclaim.reset(helper);
> }
>
> You can then forget about the later cleanup because it will
> automatically happen when the function returns. This idiom is really
> powerful when you have multiple return sites, as you can easily forget
> to do the proper cleanup. It also cleans up the code, as all the
> cleanup is just boilerplate that is easily handled by the OwningPtr.
>
>
>>
>> //===------------------------- DefUseChainTest
>> -------------------------===//
>>
>> void DefUseChainTest::HandleTopLevelDecl(DeclGroupRef declRef) {
>> for(DeclGroupRef::iterator it = declRef.begin(); it <
>> declRef.end(); it++){
>> if((*it)->getKind() != Decl::Function)
>> continue;
>
> Use 'dyn_cast' or 'isa' instead of 'getKind()'
>
>
>> // Top-level Function declaration
>> FunctionDecl* func_decl = cast<FunctionDecl> (*it);
>> Stmt* func_body = func_decl->getBodyIfAvailable();
>> if(!func_body)
>> continue;
>>
>> out << "--- Building Def-Use Chains for function '" <<
>> func_decl->getNameAsString() << ":" << func_decl->getNumParams() <<
>> "' ---\n";
>> du = DefUse::BuildDefUseChains(func_body, ctx, 0, 0, 0, true);
>> out << "------ Testing Def-Use iterators ------\n";
>> VisitStmt(func_body);
>> out << "**************************************************\n";
>> delete du;
>> }
>> }
>>
>> void defuse::PrintVarDefs(DefUse const* DU, DeclRefExpr const* DR,
>> ASTContext& ctx, llvm::raw_ostream& out){
>> if(DR->getDecl()->getKind() != Decl::Var &&
>> DR->getDecl()->getKind() != Decl::ParmVar)
>> return;
>
> Use 'dyn_cast' or 'isa' instead of 'getKind()'
>
>
>>
>> VarDecl const* VD = cast<VarDecl>(DR->getDecl());
>> SourceLocation loc = DR->getLocStart();
>> out << "Variable ref '" << VD->getNameAsCString() << "' in LOC: "
>> << "(" << Line(loc, ctx.getSourceManager()) << "," <<
>> Column(loc, ctx.getSourceManager()) << ") -> defs_iterator:\n";
>> try{
>> for(DefUse::defs_iterator it = DU->defs_begin(DR); it !=
>> DU->defs_end(); it++){
>> SourceLocation loc;
>> if((*it)->getKind() == DefUseNode::VarRef)
>> loc = (*it)->getVarRef()->getLocStart();
>> else
>> loc = (*it)->getVarDecl()->getTypeSpecStartLoc();
>> out << "\t* Found DEF in loc: " << "(" << Line(loc,
>> ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager())
>> << ")\n";
>> }
>> }catch(NotUseExcpetion e){
>> SourceLocation loc = e.expr->getLocStart();
>> out << "ERROR: Variable " << PrintClangStmt(e.expr, ctx) << "
>> in location : "
>> << "(" << Line(loc, ctx.getSourceManager()) << "," <<
>> Column(loc, ctx.getSourceManager()) << ") is not a use for variable:
>> " << VD->getNameAsCString() << "\n";
>
> Again, LLVM doesn't use C++ exceptions. You'll need to come up with
> an alternate mechanism.
>
>
>> }
>> }
>> void PrintVarUses(DefUse const* DU, DefUseNode const& n, ASTContext&
>> ctx, llvm::raw_ostream& out){
>> VarDecl const* VD = n.getDecl();
>> SourceLocation loc = VD ->getTypeSpecStartLoc();
>> if(n.getKind() == DefUseNode::VarRef)
>> loc = n.getVarRef()->getLocStart();
>>
>> out << "Variable ref '" << VD->getNameAsCString() << "' in LOC: "
>> << "(" << Line(loc, ctx.getSourceManager()) << "," <<
>> Column(loc, ctx.getSourceManager()) << ") -> uses_iterator:\n";
>> try{
>> DefUse::uses_iterator it = DU->uses_end();
>> if(n.getKind() == DefUseNode::VarRef)
>> it = DU->uses_begin(n.getVarRef());
>> else
>> it = DU->uses_begin(VD);
>> for(; it != DU->uses_end(); it++){
>> SourceLocation loc = (*it)->getLocStart();
>> out << "\t* Found USE in loc: " << "(" << Line(loc,
>> ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager())
>> << ")\n";
>> }
>> }catch(NotDefExcpetion e){
>> SourceLocation loc = e.stmt->getLocStart();
>> out << "ERROR: Variable " << PrintClangStmt(e.stmt, ctx) << "
>> in location : "
>> << "(" << Line(loc, ctx.getSourceManager()) << "," <<
>> Column(loc, ctx.getSourceManager()) << ") is not a definition for
>> variable: " << VD->getNameAsCString() << "\n";
>> }
>
> Again, LLVM doesn't use C++ exceptions. You'll need to come up with
> an alternate mechanism.
>
>> }
>>
>> inline void defuse::PrintVarUses(DefUse const* DU, DeclRefExpr const*
>> DR, ASTContext& ctx, llvm::raw_ostream& out){ ::PrintVarUses(DU,
>> DefUseNode(DR), ctx, out); }
>> inline void defuse::PrintVarUses(DefUse const* DU, VarDecl const* VD,
>> ASTContext& ctx, llvm::raw_ostream& out){ ::PrintVarUses(DU,
>> DefUseNode(VD), ctx, out); }
>>
>> void defuse::DefUseChainTest::VisitDeclRefExpr(DeclRefExpr *DR){
>> if(DR->getDecl()->getKind() == Decl::Var ||
>> DR->getDecl()->getKind() == Decl::ParmVar){
>
> Use 'dyn_cast' or 'isa' instead of 'getKind()'
>
>> PrintVarUses(du, DR, *ctx, out);
>> PrintVarDefs(du, DR, *ctx, out);
>> }
>> }
>>
>> void defuse::DefUseChainTest::VisitDeclStmt(DeclStmt *DS){
>> for(DeclStmt::decl_iterator it = DS->decl_begin(); it !=
>> DS->decl_end(); it++){
>> if((*it)->getKind() != Decl::Var)
>> continue;
>>
>> VarDecl* VD = cast<VarDecl>(*it);
>> PrintVarUses(du, VD, *ctx, out);
>>
>> if(VD->getInit())
>> Visit(VD->getInit());
>
> How about:
>
> if (Stmt *S = VD->getInit())
> Visit(S);
>
>
>> }
>> }
>>
>> void defuse::DefUseChainTest::VisitStmt(Stmt* S){
>> for(Stmt::child_iterator I = S->child_begin(), E =
>> S->child_end(); I != E; ++I)
>> if(*I) Visit(*I);
>> }
>
More information about the cfe-dev
mailing list