[LLVMdev] Walking thru CallGraph bottom up

Mon Mar 2 06:29:49 PST 2015

On 3/2/15 2:12 AM, Simone Atzeni wrote:
> Hi Herbie,
>
> thanks for your answer and explanation.
>
>> Also, if any of the functions are external, you are completely stuck (unless you put everything together with lld).
>
> I am indeed having a problem regarding external functions.
> If my program is just one file, everything works and I can access all the functions.
> However, if it has multiple files, I have a lot of unresolved pointers to external functions.
> I create a separate bitcode file for each source file of my program.
> Then I link all of them together with llvm-link. I thought that was enough, but I still cannot access the functions that are in other files even though I link all of them together.
>
> Do you have any idea/suggestion how to solve this problem?

Obviously, if you want to analyze the call graph of an entire program, 
you need to collect the information from across compilation units.  In 
LLVM, the best way to do that is to link the bitcode files together into 
a single bitcode file.  This can either be done using llvm-link (which 
requires changing the Makefiles of most programs) or by adding your 
passes into libLTO (which is more work but allows you to do 
whole-program analysis with little/no modification to program Makefiles).

Of course, if you have external library code (e.g., libc) that is not in 
LLVM bitcode format, you can't analyze it using an LLVM pass.  One 
option is to know what these functions do (i.e., treat them as a special 
case).  Another option would be to try to convert them into LLVM bitcode 
format using Revgen or BAP, but I'm not sure how well that works.

Most analyses understand the semantics of the C library functions so that 
they can treat them specially, and then assume that any function that has 
its address taken or has externally visible linkage can be called by 
external library code.  The "external node" in the CallGraph analysis 
should state that it can call any function with externally visible linkage.
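
To make the conservative assumption above concrete, here is a minimal 
standalone C++ sketch.  It does not use the LLVM API: FuncInfo, Module, 
kExternal and mayReach are hypothetical names, and the module is just a 
map from function names to a few flags.  As a simplification, it folds 
"calls into unknown code" and "is called from unknown code" into a single 
external node, which is an assumption of this sketch rather than a 
description of how LLVM's CallGraph is laid out.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a function in a linked module (not an LLVM type).
struct FuncInfo {
    std::vector<std::string> callees; // functions called directly by name
    bool isDefined;                   // false: declaration only (e.g. libc)
    bool externallyVisible;           // linkage visible outside the module
    bool addressTaken;                // address escapes into a pointer
};

using Module = std::map<std::string, FuncInfo>;

// Synthetic node standing in for all code we cannot see.
static const std::string kExternal = "<external>";

// Returns every node that may (transitively) call `target`, walking the
// call graph bottom-up.  The result can contain kExternal.
std::set<std::string> mayReach(const Module &m, const std::string &target) {
    assert(m.count(target) && "target must be a function in the module");

    // Build reverse edges: callersOf[f] = nodes that may call f.
    std::map<std::string, std::set<std::string>> callersOf;
    for (const auto &entry : m) {
        const std::string &name = entry.first;
        const FuncInfo &info = entry.second;
        for (const std::string &callee : info.callees) {
            auto it = m.find(callee);
            bool known = it != m.end() && it->second.isDefined;
            // A call to a function without a body goes to the external node.
            callersOf[known ? callee : kExternal].insert(name);
        }
        // Unknown code may call anything it can name or hold a pointer to.
        if (info.isDefined && (info.externallyVisible || info.addressTaken))
            callersOf[name].insert(kExternal);
    }

    // Worklist traversal over the reverse edges.
    std::set<std::string> seen;
    std::vector<std::string> work{target};
    while (!work.empty()) {
        std::string cur = work.back();
        work.pop_back();
        for (const std::string &caller : callersOf[cur])
            if (seen.insert(caller).second)
                work.push_back(caller);
    }
    return seen;
}
```

For example, if main calls qsort (which has no body in the module) and a 
comparator cmp has its address taken, then mayReach on a function called 
only from cmp reports cmp, the external node, and main.  A function that 
is neither called, externally visible, nor address-taken contributes no 
callers of its own.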

Regards,

John Criswell

>
> Thanks!
> Best,
> Simone
>
>
>> On Feb 28, 2015, at 14:30, Herbie Robinson <HerbieRobinson at verizon.net> wrote:
>>
>> On 2/24/15 2:27 PM, Simone Atzeni wrote:
>>> Hi all,
>>>
>>> I would like to create a Pass that, given an IR instruction, walks from that instruction up to the main function
>>> to identify all the function calls that were made to reach that instruction.
>>>
>>> Is it possible? What kind of Pass should I create?
>>>
>> Technically, that's not a pass, because a pass is an algorithm for going through all the Instructions.
>>
>> What you want to do is only possible to do for internal functions if there are no function variables in play.
>>
>> The Instruction class has a method to get to the basic block, which has a method to get to the Function.  In both cases, the method is called getParent().  Once you get the Function, you can find all the internal uses using the use chain from the Function.
>>
>> Start looking at the user class here:
>>
>> http://llvm.org/doxygen/classllvm_1_1User.html
>>
>> When you go through the user chain, you will find all sorts of instructions and possibly other things.  If you find anything other than a call Instruction, you have hit upon a case that's more difficult to handle (related to using function pointers).  If you run whatever analysis you are trying to do after all the optimization passes, value propagation may have covered some of the function pointer cases; so, you want to recurse through PHI nodes, too.
>>
>> Also, if any of the functions are external, you are completely stuck (unless you put everything together with lld).
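
The walk described in the quoted text above can be modelled roughly as 
follows.  This is a standalone C++ sketch, not LLVM code: UseKind, Use, 
UseMap, WalkResult, walkUp and chainsFromMain are hypothetical names, and 
the use lists are assumed to have been collected already (in LLVM that 
would be by iterating the Function's users and calling getParent() on 
each call instruction and its basic block).  It only follows direct 
calls, flags any other kind of use as the hard function-pointer case, 
and does not model recursing through PHI nodes.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins, not LLVM types.
enum class UseKind { DirectCall, Other };

struct Use {
    UseKind kind;
    std::string inFunction; // function containing the using instruction
};

// uses[f] lists every use of function f inside the linked module.
using UseMap = std::map<std::string, std::vector<Use>>;

struct WalkResult {
    std::vector<std::vector<std::string>> chains; // each runs main ... target
    bool sawNonCallUse; // true if a function-pointer style use was skipped
};

// Depth-first walk from `fn` up through its direct callers until `root`.
static void walkUp(const UseMap &uses, const std::string &fn,
                   const std::string &root, std::vector<std::string> &path,
                   WalkResult &out) {
    path.push_back(fn);
    if (fn == root) {
        // path is target ... root; store it reversed as root ... target.
        out.chains.emplace_back(path.rbegin(), path.rend());
    } else {
        auto it = uses.find(fn);
        if (it != uses.end()) {
            for (const Use &u : it->second) {
                if (u.kind != UseKind::DirectCall) {
                    out.sawNonCallUse = true; // address taken: not followed
                    continue;
                }
                // Skip callers already on the path (recursive cycles).
                if (std::find(path.begin(), path.end(), u.inFunction) !=
                    path.end())
                    continue;
                walkUp(uses, u.inFunction, root, path, out);
            }
        }
    }
    path.pop_back();
}

// Enumerates direct-call chains from "main" down to `target`.
WalkResult chainsFromMain(const UseMap &uses, const std::string &target) {
    WalkResult out{{}, false};
    std::vector<std::string> path;
    walkUp(uses, target, "main", path, out);
    return out;
}
```

If sawNonCallUse comes back true, the listed chains are incomplete: some 
caller may reach the target through a pointer, and a conservative 
analysis would have to account for that separately.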

-- 
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell