[cfe-dev] [analyzer] Whole Program Analysis - Full Call Graph

Aleksei Sidorin via cfe-dev cfe-dev at lists.llvm.org
Tue Dec 1 04:23:27 PST 2015

Hello Phil,

I met this problem while implementing inter-unit analysis 
proof-of-concept for CSA. I'll describe my solution here. As I 
understand, this problem cannot be solved inside CSA, so I used 
multi-step approach.

First, for all translation units we should collect a list of 'required' 
and 'exported' functions. To build a list of required declarations, we 
iterate over AST to find CallExprs that use callee decl without body. To 
build a list of exported declarations we just dump all the functions 
that are visible externally.

I used a simple text signature of function as 
mangled_name at target_triple_arch for dumping. I used additional arch mark 
to distinguish between functions in multi-arch builds.

Note that some function may be defined in a headers. One should not 
merge them because they may have different bodies due to macro 
expansion. These calls are local and the approach below should solve 
this problem.

After this, we have two lists. List of required functions is just a list 
of signatures; list of external functions is a list of entries looking 
like a map: function_signature::file_name at target_triple_arch. Final 
export map contains items from exported functions whose key was listed 
in required functions. Resulting mapping is an inter-unit call graph.

You may also need to use C++ mangling-style for C if you build multiple 
projects that may contain functions with the same name. However, if you 
need to do this, you may need more complicated approach.

You can take a look at the code. This code also dumps local calls but 
marks as local explicitly so it may be used to build a whole program 
call graph. You can find our github repo at 
https://github.com/haoNoQ/clang/tree/summary-ipa-draft. See 
and tools/scan-build/xtu-analyze.py 
for some code.

Hope it will help.

> Hello cfe-dev,
> I am interested in researching possible static analyzer schemes on whole
> program states. Initially I would like to build a whole program call
> graph. I plan to develop program wide checkers for calls to a
> proprietary SDK. To begin with these checkers would rely on simple call
> graph/AST based analysis rather than ‘full’ static analysis. These
> checkers would include items like simple flow analysis of SDK calls that
> should occur in a particular order (e.g. open()/read()/write()/close()).
> I would be very grateful if anyone can offer any guidance as to how to
> achieve this.
> My initial thoughts are to use the current AST call graph feature within
> clang. I would create a checker that builds and serialises the call
> graph to a file for each source file. I plan to add this serialisation
> to the clang CallGraph class along the lines of the current AST
> serialisation. The available serialised call graph files would
> subsequently be read and a merged, creating a full program graph  (or
> partial program graph if the whole program has not been serialised).
> I have a couple of questions regarding development style:
> a) Would the clang CallGraph class be the best place for the
> serialisation, or would it better to place it in the checker(s)?
> b) For the read and merge process, should this be an external utility or
> use the regular clang executable?
> Any feedback most welcome. Thank you for your time.

Best regards,
Aleksei Sidorin
Software Engineer,
IMSWL-IMCG, SRR, Samsung Electronics

More information about the cfe-dev mailing list