[cfe-dev] Manipulating/transforming ASTs for instrumentation
Mehmet Emre via cfe-dev
cfe-dev at lists.llvm.org
Wed May 8 16:13:42 PDT 2019
Hello,
I need to instrument C++ programs to keep track of some variables of
interest and of the branches taken. I tried to write a clang plugin that
modifies the AST to insert calls, but the AST I generate seems to break code
generation later on. I want to work at the AST level because some of the
relationships to source-code constructs are lost when working at a later
stage of the compilation pipeline, e.g. LLVM bitcode.
What is the most reliable way to inspect the AST and modify the program
(e.g. to remove, add, or change some statements)? The approaches I found
are below but I couldn't figure out which would be the best option for
implementation:
- Using libtooling and Rewriter to rewrite the source code. This approach
injects raw text into the program, which seemed error-prone to me.
- Writing a TreeTransform in Sema to recursively build a new AST with the
intended changes. This seems like the best option right now, but it requires
modifying the compiler rather than "just writing a plugin", is that correct?
- Modifying CodeGen to emit extra/different code at program points of
interest. This resembles the Sema approach, with the difference of
creating LLVM instructions rather than AST nodes.
A code snippet showing how I am trying to manipulate the AST is below, in
case it is of any help and in case there is a way to do it with the plugin
infrastructure:
------
// Instrument the entry point of the given function.
// If we are processing ReturnType f(T1 arg1, T2 arg2, ...) { ... }
// we inject the call "record_entry_of_f(arg1, arg2, ...)"
// at the beginning of the function body. The type of record_entry_of_f
// is (void)(T1 &, T2 &, ...).
void instrument_fn(FunctionDecl &fd) {
  assert(fd.getBody() != nullptr &&
         "The function to instrument should have a body");
  // Inject instrumentation code to entry
  auto &body = *fd.getBody();
  auto &ctx = fd.getASTContext();
  auto instrumenter_fn_name = "record_entry_of_" + get_mangled_name(fd);
  // support_decls is a map from support function names to the declarations
  // of those functions;
  // its type is std::unordered_map<std::string, FunctionDecl *>
  if (support_decls.find(instrumenter_fn_name) == support_decls.end()) {
    llvm::errs() << "Could not find the instrumenter for ";
    fd.printQualifiedName(llvm::errs());
    llvm::errs() << "\n";
    llvm::errs() << "looked for " << instrumenter_fn_name << "\n";
    return;
  }
  auto entry_fn_decl = support_decls[instrumenter_fn_name];
  auto entry_fn = new (ctx) DeclRefExpr(entry_fn_decl, false,
      entry_fn_decl->getType(), VK_RValue, body.getBeginLoc());
  auto args = new std::vector<Expr *>();
  auto typ = dyn_cast<FunctionProtoType>(fd.getFunctionType());
  if (typ == nullptr) {
    // Do not instrument if we don't have type information
    return;
  }
  size_t i = 0;
  for (auto param : fd.parameters()) {
    args->push_back(new (ctx) DeclRefExpr(param, false,
        typ->getParamType(i), VK_RValue, body.getBeginLoc()));
    i++;
  }
  // We are passing a fake source location to create the call expression,
  // specifically the beginning location of the body
  Stmt *entry_call = new (ctx) CallExpr(fd.getASTContext(), entry_fn,
      ArrayRef(*args), ctx.VoidTy, VK_RValue, body.getBeginLoc());
  // Create a new compound statement with the call and the original body
  // and make it fd's new body
  std::vector<Stmt *> newBody = {entry_call, fd.getBody()};
  fd.setBody(CompoundStmt::Create(ctx, ArrayRef<Stmt *>(newBody),
      body.getBeginLoc(), body.getEndLoc()));
  // Dump the AST for debugging
  fd.getBody()->dumpColor();
}
------
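To make the intended result concrete, here is a minimal hand-written sketch of what the instrumented program would look like at the source level. It is only an illustration under assumptions: the function f, its int parameters, and the global trace log are hypothetical and not part of the plugin; only the record_entry_of_f naming and its (void)(T1 &, T2 &) shape come from the comments in the snippet above. The original body is nested inside a new block, mirroring the compound statement built from the call and the old body.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log that the support function appends to (an assumption
// made for this illustration only).
std::vector<std::string> trace;

// Hypothetical support function for f; its type is void(int &, int &),
// i.e. (void)(T1 &, T2 &) with T1 = T2 = int.
void record_entry_of_f(int &a, int &b) {
  trace.push_back("f(" + std::to_string(a) + ", " + std::to_string(b) + ")");
}

// Original function:
//   int f(int a, int b) { return a + b; }
//
// Desired result after instrumentation: a new compound statement that
// holds the injected call followed by the original body as a nested block.
int f(int a, int b) {
  record_entry_of_f(a, b);
  {
    return a + b;
  }
}

// Small self-check: the instrumented f still computes the same result
// and records exactly one entry per call.
void check_instrumented_f() {
  trace.clear();
  assert(f(2, 3) == 5);
  assert(trace.size() == 1 && trace[0] == "f(2, 3)");
}
```

Calling check_instrumented_f() exercises the instrumented function once. The parameters are passed as lvalues to the support function, which is what a reference-taking signature requires.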
Thanks,
Mehmet