[llvm-dev] Finding all pointers to functions
John Criswell via llvm-dev
llvm-dev at lists.llvm.org
Wed Dec 23 09:35:49 PST 2015
On 12/23/15 2:09 AM, Russell Wallace wrote:
> On Tue, Dec 22, 2015 at 10:55 AM, John Criswell <jtcriswel at gmail.com> wrote:
> > You could conservatively assume that any function that has its
> > address taken has a pointer to it that escapes into memory or
> > external code.
>
> Right, that's what I'm doing to start with.
>
> > To make things a little more accurate, you could scan the uses of
> > any function for which hasAddressTaken() returns true and see if
> > any of its uses escapes its function or escapes into memory or
> > external code. I believe hasAddressTaken() returns true if the
> > function is subjected to a cast instruction, and functions are
> > often cast if they are used in a call that uses a different
> > signature than the function's declared signature.
>
> I'll look into that. It seems reasonable to guess that the major
> confounding factor in many C++ programs will be references from
> virtual function tables; there should be some way to optimize those.
>
> > To get anything more accurate, you'll need to use alias analysis
> > or points-to analysis. DSA tracks function pointers in the heap
> > and can tell you whether the function is called from external
> > code. However, DSA's accuracy currently suffers if it is run
> > after LLVM's optimizations, and the code needs some serious TLC.
>
> DSA presumably stands for data structure analysis. TLC = tender loving
> care? Why does DSA become less accurate if run after optimization?
DSA was built when LLVM's optimizations preserved the type information
on GEP and other instructions (DSA existed before LLVM was
open-source). As such, it uses LLVM's type information to aid its
type inference, which gives it field sensitivity, which in turn
improves its accuracy. Over time, LLVM optimizations have come to
rewrite typed indexing into simple byte-level indexing (as opposed to
array-of-structure indexing). DSA hasn't been updated to handle that
well. That is why its precision is better pre-optimization than
post-optimization.
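
To make the byte-level indexing point concrete, here is a minimal C++
sketch. It is not DSA or LLVM code; the Node type, its handler field, and
the helper functions are hypothetical names invented for illustration. It
only shows that a typed "element i, field handler" access and an untyped
"base plus byte offset" access reach the same address, while the second
form no longer names the field that a field-sensitive analysis would key on.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function-pointer type and record, for illustration only.
using Handler = void (*)(int);

struct Node {
    int id;
    Handler handler;  // the field a field-sensitive analysis wants to track
    double weight;
};

// Typed view: array-of-structure indexing. The element index and the
// field name are both explicit in the expression.
Handler *typedFieldAddress(Node *base, std::size_t i) {
    return &base[i].handler;
}

// Byte-level view: the same address expressed as a raw byte offset from
// the base pointer. Nothing in the arithmetic says "field handler of Node";
// an analysis would have to recover that from the constant offset.
Handler *byteFieldAddress(Node *base, std::size_t i) {
    unsigned char *raw = reinterpret_cast<unsigned char *>(base);
    std::size_t offset = i * sizeof(Node) + offsetof(Node, handler);
    return reinterpret_cast<Handler *>(raw + offset);
}

// Sanity check: both views agree for every element of a small array.
void checkViewsAgree() {
    Node nodes[4] = {};
    for (std::size_t i = 0; i < 4; ++i) {
        assert(typedFieldAddress(nodes, i) == byteFieldAddress(nodes, i));
    }
}
```

Calling checkViewsAgree() should pass silently. The two helpers are
equivalent at run time, so the difference matters only to an analysis that
relies on the typed form to tell fields apart.
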
Just out of curiosity, what are you trying to do? I need call graph
analysis for C/C++ code with function pointers, and so I'm writing an
NSF proposal to seek funding to do that (among other enhancements to my
SVA infrastructure). If it's something that would be useful to you (or
other LLVM community members), it would help me to know that.
Department of Computer Science, University of Rochester