<div dir="ltr"><div dir="ltr"><div>Hello,</div><div><br></div><div>I need to instrument C++ programs to keep track of some variables of interest and of the branches taken. I tried to write a Clang plugin that modifies the AST to insert calls, but the AST I generate seems to break code generation later on. I want to work at the AST level because some of the relationships to source-code constructs are lost at later stages of the compilation pipeline, e.g. in LLVM bitcode.<br></div><div><br></div><div>What is the most reliable way to inspect the AST and modify the program (e.g. to remove, add, or change some statements)? The approaches I found are below, but I couldn't figure out which would be the best option to implement:</div><div><br></div><div>- Using LibTooling and Rewriter to rewrite the source code. This approach injects raw text into the program, which seems error-prone to me.<br></div><div>- Writing a TreeTransform in Sema to recursively build a new AST with the intended changes. This seems like the best option right now, but it requires modifying the compiler rather than "just writing a plugin". Is that correct?<br></div><div>- Modifying CodeGen to emit extra/different code at program points of interest. This seems similar to the Sema approach, except that it creates LLVM instructions rather than AST nodes.<br></div><div><br></div><div>A code snippet showing how I am trying to manipulate the AST is below, in case it helps and in case there is a way to do this with the plugin infrastructure:</div><div><br></div><div>------<br></div><div><br></div><div>&nbsp;&nbsp;&nbsp;// Instrument the entry point of the given function.<br>&nbsp;&nbsp;&nbsp;// If we are processing ReturnType f(T1 arg1, T2 arg2, ...) { ... }<br>&nbsp;&nbsp;&nbsp;// we inject the call "record_entry_of_f(arg1, arg2, ...)"<br>&nbsp;&nbsp;&nbsp;// at the beginning of the function body.
The type of record_entry_of_f<br>&nbsp;&nbsp;&nbsp;// is void(T1 &amp;, T2 &amp;, ...).<br>&nbsp;&nbsp;&nbsp;void instrument_fn(FunctionDecl &amp;fd) {<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;assert(fd.getBody() != nullptr &amp;&amp; "The function to instrument should have a body");<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Inject instrumentation code at the entry<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;auto &amp;body = *fd.getBody();<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;auto &amp;ctx = fd.getASTContext();<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;auto instrumenter_fn_name = "record_entry_of_" + get_mangled_name(fd);<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// support_decls is a map from support function names to the declarations of those functions;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// its type is std::unordered_map&lt;std::string, FunctionDecl *&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if (support_decls.find(instrumenter_fn_name) == support_decls.end()) {<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;llvm::errs() &lt;&lt; "Could not find the instrumenter for ";<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fd.printQualifiedName(llvm::errs());<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;llvm::errs() &lt;&lt; "\n";<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;llvm::errs() &lt;&lt; "looked for " &lt;&lt; instrumenter_fn_name &lt;&lt; "\n";<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;auto entry_fn_decl = support_decls[instrumenter_fn_name];<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;auto entry_fn = new (ctx) DeclRefExpr(entry_fn_decl, false, entry_fn_decl-&gt;getType(), VK_RValue, body.getBeginLoc());<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Local vector of argument expressions (no heap allocation needed here)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;std::vector&lt;Expr *&gt; args;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;auto typ = dyn_cast&lt;FunctionProtoType&gt;(fd.getFunctionType());<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if (typ == nullptr) {<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Do not instrument if we don't have type information<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;size_t i = 0;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for (auto param : fd.parameters()) {<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;args.push_back(new (ctx) DeclRefExpr(param, false, typ-&gt;getParamType(i), VK_RValue, body.getBeginLoc()));<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;i++;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// We pass a fake source location to create the call expression, specifically the beginning location of the body<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Stmt *entry_call = new (ctx) CallExpr(ctx, entry_fn, ArrayRef&lt;Expr *&gt;(args), ctx.VoidTy, VK_RValue,
body.getBeginLoc());<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Create a new compound statement containing the call and the original body, and make it fd's new body<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;std::vector&lt;Stmt *&gt; newBody = {entry_call, fd.getBody()};<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fd.setBody(CompoundStmt::Create(ctx, ArrayRef&lt;Stmt *&gt;(newBody), body.getBeginLoc(), body.getEndLoc()));<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Dump the AST for debugging<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fd.getBody()-&gt;dumpColor();<br>&nbsp;&nbsp;&nbsp;}</div><div><br></div><div>------<br></div><div><br></div><div>Thanks,</div><div>Mehmet<br></div><div><br></div><div><br></div></div></div>