[LLVMdev] How to duplicate a function?
Dmitry N. Mikushin
maemarcus at gmail.com
Mon Sep 26 15:04:52 PDT 2011
Hi Shawn,
Probably I can answer specifically the question "how to replace old
function call with new one, adding extra char* argument".
Method for gathering information: grep with context for one keyword in
LLVM source and then look for another in results:
find . -name *.cpp -exec grep "CallInst" {} -C 100 \; | less
Results:
1) tools/bugpoint/Miscompilation.cpp gives an example of creating
string array and adding it into call args list starting at line 826:
Constant *InitArray = ConstantArray::get(F->getContext(), F->getName());
GlobalVariable *funcName =
2) Using (1) I implemented my own version here, with more comments:
https://hpcforge.org/scm/viewvc.php/branches/accurate/src/frontend/compile.cpp?revision=476&root=kernelgen&view=markup
I hope it's helpful,
- D.
2011/9/16 Shawn Kim <shawn.subscribe at gmail.com>:
> Hi all,
>
> Sorry for the inconvenient about the previous post. The files were not
> attached. So I put them here again.
>
> I am a newbie in LLVM and I am trying to replace the function like:
>
> old function || new function
> ==============================
> =========
> int haha(int a) { int haha(int a, char* ID) {
>
> ===>
>
>
> } }
>
> Of course in the newly replaced function "int haha(int, char*ID)", I want to
> insert some instrumentation code.
>
> Here is my code that I am working on till now and it generates segmentation
> fault in the place I comment with "//////////////////////"
> Can you help me? Any advice will be helpful because I am a beginner in llvm.
>
>
> Thank you in advance.
> Shawn.
>
>
> duplicateFunction.cpp
> =============================================================================
>
> //===- duplicateFunction.cpp - Writing an LLVM Pass
> -----------------------===//
> //
> // The LLVM Compiler Infrastructure
> //
> //===----------------------------------------------------------------------===//
> //
> // This file implements the LLVM duplicating function pass.
> // It starts by computing a new prototype for the function,
> // which is the same as the old function, but has an extra argument.
> //
> //===----------------------------------------------------------------------===//
> #include "llvm/Transforms/Utils/Cloning.h"
> #include "llvm/Pass.h"
> #include "llvm/Function.h"
> #include "llvm/Module.h"
> #include "llvm/CallingConv.h"
> #include "llvm/DerivedTypes.h"
> #include "llvm/InstrTypes.h"
> #include "llvm/Constants.h"
> #include "llvm/Instructions.h"
> #include "llvm/Support/raw_ostream.h"
> #include "llvm/Transforms/Utils/BasicBlockUtils.h"
> #include "llvm/BasicBlock.h"
> #include "llvm/Support/Debug.h"
> #include "llvm/Support/CallSite.h"
> using namespace llvm;
>
> namespace {
> Constant *f;
> Function *Fn;
> FunctionType *FTy;
> Type *RetTy;
> std::vector<Type*> Params;
> class DP : public FunctionPass {
>
> public:
> static char ID;
>
> DP() : FunctionPass(ID) {}
>
> virtual bool doInitialization(Module &M);
> virtual bool runOnFunction(Function &F);
> virtual bool doFinalization(Module &mdl) {
> mdl.dump();
> return true;
> }
> }; /* class */
> }
>
> char DP::ID = 0;
> static RegisterPass<DP> IC("duplicateFunction", "Duplicate Function Pass");
>
>
> bool DP::doInitialization(Module &M) {
>
> // find the function that we want to change.
> Fn = M.getFunction("haha");
>
> // Start by computing a new prototype for the function, which is the
> same as
> // the old function, but has an extra argument.
> FTy = Fn->getFunctionType();
>
> // Find out the return value.
> RetTy = FTy->getReturnType();
>
> // set the calling convention to C.
> // so, we interoperate with C Code properly.
> Function *tmp = cast<Function>(Fn);
> tmp->setCallingConv(CallingConv::C);
>
> return true;
> }
>
> bool DP::runOnFunction(Function &F) {
> #if 0
> Value *param;
>
> // Find the instruction before which you want to insert the function
> call
> Instruction *nextInstr = F.back().getTerminator();
>
> // Create the actual parameter for the function call
> param = ConstantInt::get(Type::getInt32Ty(F.getContext()), 333);
>
> // create and insert the function call
> //CallInst::Create(f, param, "", nextInstr);
> CallInst::Create(Fn, param, "", nextInstr);
>
> // indicates that we changed the code
> //return true;
> #endif
> Type *NRetTy;
>
> std::vector<Type*> Params(FTy->param_begin(), FTy->param_end());
> FunctionType *NFTy = FunctionType::get(FTy->getReturnType(), Params,
> false);
>
> // Create the new function body and insert it into the module...
> Function *NF = Function::Create(NFTy, Fn->getLinkage());
> NF->copyAttributesFrom(Fn);
> Fn->getParent()->getFunctionList().insert(Fn, NF);
> NF->takeName(Fn);
>
> for (Function::arg_iterator AI=F.arg_begin(), AE=F.arg_end(),
> NAI=NF->arg_begin();
> AI != AE; ++AI, ++NAI) {
> NAI->takeName(AI);
> }
>
> // Since we have now create the new function, splice the body of the old
> // function right into the new function, leaving the old rotting hulk of
> the
> // function empty.
> NF->getBasicBlockList().splice(NF->begin(), F.getBasicBlockList());
>
> llvm::Value *Globals = --NF->arg_end();
> Globals->setName("IOCallIDs");
>
> // Now, exploit all return instructions.
> for (Function::iterator BI = NF->begin(), BE = NF->end(); BI != BE;
> ++BI) {
> if (ReturnInst *RI =
> llvm::dyn_cast<llvm::ReturnInst>(BI->getTerminator())) {
> // Don't support functions that have multiple return values.
> assert(RI->getNumOperands() < 2);
>
> // Insert a new load instruction to return.
>
> /////////////////////////////////////////////////////////////////////////////////////
> HERE, GENERATE ERROR
>
> /////////////////////////////////////////////////////////////////////////////////////
> Value *Load = new llvm::LoadInst(Globals, "globalsret", RI);
>
> /////////////////////////////////////////////////////////////////////////////////////
>
> // Return type is void
> if ( RetTy->isVoidTy() ) {
> // ReturnInst::Create(Load, 0, RI); // Return void
> ReturnInst::Create(F.getContext(), 0, RI); // Return void
> RI->getParent()->getInstList().erase(RI);
> } else {
> // Start with an empty struct.
> Value *Return = ConstantAggregateZero::get(NRetTy);
> DEBUG(errs() << "Return: " << *Return->getType() << '\n');
>
> // Insert the original return value in field 0
> Return = InsertValueInst::Create(Return, RI->getOperand(0),
> 0, "ret", RI);
> DEBUG(errs() << "Return: " << *Return->getType() << '\n');
>
> // Insert the globals return value in field 1
> Return = InsertValueInst::Create(Return, Load, 1, "ret",
> RI); // <- maybe useless
> DEBUG(errs() << "Return: " << *Return->getType() << '\n');
>
> // Update the return instruction
> RI->setOperand(0, Return);
> }
> } // if
> } // for
>
> // Replace all uses of the old arguments with the new arguments.
> for (Function::arg_iterator I = F.arg_begin(), E = F.arg_end(), NI =
> NF->arg_begin();
> I != E; ++I, ++NI) {
> I->replaceAllUsesWith(NI);
> }
>
> #if 1
> // Replace all callers
> while ( !F.use_empty() ) {
> CallSite CS(F.use_back());
> Instruction *Call = CS.getInstruction();
> // Function *CallingF = Call->getParent()->getParent();
>
> // Get the global struct in our caller.
> //Value* CallerGlobals = ModifyFunctionRecursive(CallingF).first;
> Value* CallerGlobals = NULL; // <- This should be modified later.
>
> // Copy the existing arguments
> std::vector<Value*> Args;
> Args.reserve(CS.arg_size());
> CallSite::arg_iterator AI = CS.arg_begin(), AE = CS.arg_end();
>
> // First, copy regular arguments
> for (unsigned i = 0, e = FTy->getNumParams(); i != e; ++i, ++AI) {
> Args.push_back(*AI);
> }
> // Then, insert the new argument
> Args.push_back(CallerGlobals);
> // Lastly, copy any remaining varargs
> for (; AI != AE; ++AI) {
> Args.push_back(*AI);
> }
>
> Instruction *New;
> Instruction *Before = Call;
> if ( InvokeInst *II = dyn_cast<InvokeInst>(Call) ) {
> New = InvokeInst::Create(NF, II->getNormalDest(),
> II->getUnwindDest(), Args, "", Before);
> cast<InvokeInst>(New)->setCallingConv(CS.getCallingConv());
> // cast<InvokeInst>(New)->setParamAttrs(CS.getParamAttrs());
> cast<InvokeInst>(New)->setAttributes(CS.getAttributes());
> } else {
> New = CallInst::Create(NF, Args, "", Before);
> cast<CallInst>(New)->setCallingConv(CS.getCallingConv());
> // cast<CallInst>(New)->setParamAttrs(CS.getParamAttrs());
> cast<CallInst>(New)->setAttributes(CS.getAttributes());
> if ( cast<CallInst>(Call)->isTailCall() ) {
> cast<CallInst>(New)->setTailCall();
> }
> }
>
> if (Call->hasName()) {
> New->takeName(Call);
> } else {
> New->setName(NF->getName() + ".ret");
> }
>
> Value *GlobalsRet;
> if ( Call->getType()->isVoidTy() ) {
> // The original function returned nothing, so the new function
> returns
> // only the globals
> GlobalsRet = New;
> } else {
> // Split the values
> Value *OrigRet = ExtractValueInst::Create(New, 0, "origret",
> Before);
> GlobalsRet = ExtractValueInst::Create(New, 1, "globalsret",
> Before);
> // Replace all the uses of the original result
> Call->replaceAllUsesWith(OrigRet);
> }
>
> // Now, store the globals back
> new StoreInst(GlobalsRet, CallerGlobals, Before);
>
> DEBUG(errs() << "Call " << *Call << " replaced, function is now " <<
> *Call->getParent()->getParent() << "\n");
>
> // Finally, remove the old call from the program, reducing the
> use-count of F.
> Call->eraseFromParent();
>
> } // while
> #endif
> return true;
> }
>
>
> test.c
> ====================================================================================
> #include <stdio.h>
> #include <stdlib.h>
>
> int v[200];
> int haha(int);
>
> int main()
> {
> int i;
> int n=100;
>
> if ( !haha(n) )
> exit(-1);
>
> return 1;
> }
>
> int haha(int n)
> //int haha(int n, char* IOCallIDs)
> {
> int i;
> for (i=1; i<n; i++)
> v[i] = v[i-1] + v[i];
>
> printf ("hahaha\n");
> return 1;
> }
>
> Makefile
> ====================================================================================
> LLVM_CONFIG?=llvm-config
>
> # location of the source
> # useful if you want separate source and object directories.
> SRC_DIR?=$(PWD)
>
> #ifndef VERBOSE
> # QUIET:=@
> #endif
>
> COMMON_FLAGS=-Wall -Wextra #-fvisibility=hidden
> CFLAGS+=$(COMMON_FLAGS) $(shell $(LLVM_CONFIG) --cflags)
> CXXFLAGS+=$(COMMON_FLAGS) $(shell $(LLVM_CONFIG) --cxxflags)
>
> #ifeq ($(shell uname),Darwin)
> #LOADABLE_MODULE_OPTIONS=-bundle -undefined dynamic_lookup
> #else
> LOADABLE_MODULE_OPTIONS=-shared -Wl,-O1
> #endif
>
> TEST_C=test.c
> TEST_FILE=$(subst .c,.s, $(TEST_C))
> PLUGIN=duplicateFunction.so
> PLUGIN_OBJECTS=duplicateFunction.o
>
> ALL_OBJECTS=$(PLUGIN_OBJECTS)
> ALL_TARGETS=$(PLUGIN) $(TEST_FILE)
>
> CPP_OPTIONS+=$(CPPFLAGS) $(shell $(LLVM_CONFIG) --cppflags) -MD -MP
> -I$(SRC_DIR)
>
> LD_OPTIONS+=$(LDFLAGS) $(shell $(LLVM_CONFIG) --ldflags)
>
> all: $(ALL_TARGETS)
>
> %.o : $(SRC_DIR)/%.cpp
> @echo Compiling $*.cpp
> $(QUIET)$(CXX) -c $(CPP_OPTIONS) $(CXXFLAGS) $<
>
> $(PLUGIN): $(PLUGIN_OBJECTS)
> @echo Linking $@
> $(QUIET)$(CXX) -o $@ $(LOADABLE_MODULE_OPTIONS) $(CXXFLAGS)
> $(LD_OPTIONS) $(PLUGIN_OBJECTS)
>
> RUN_FLAGS=-duplicateFunction
>
> $(TEST_FILE): $(TEST_C)
> clang -g -O3 -S -emit-llvm $^
> # clang -g -O0 -S -emit-llvm $^
> # clang -O0 -S -emit-llvm $^
> run:
> opt -load ./$(PLUGIN) $(RUN_FLAGS) < $(TEST_FILE) > /dev/null 2> after.s
> # opt -load ./$(PLUGIN) $(RUN_FLAGS) < $(TEST_FILE) > /dev/null
>
> clean:
> $(QUIET)rm -f $(ALL_OBJECTS) *.d $(PLUGIN) $(TEST_FILE)
>
>
> -include $(ALL_OBJECTS:.o=.d)
>
>
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
More information about the llvm-dev
mailing list