[LLVMdev] incorrect DSCallGraph for simple indirect call with vtable nearby

Wed Aug 10 10:59:00 PDT 2011

Dear Ben,

Just a few comments on DSA:

1) I'll try out your example C++ code below and see if I can get the 
same results that you do.  However, I'm at a conference right now 
(Usenix Security), so I don't know exactly when I'll get to it.

2) DSA can get pessimistic results when dealing with external code (as 
Andrew described).  It is designed for whole program analysis, meaning 
that the entire program should be available (e.g., no variables defined 
in other compilation units).  Can you:

a) Modify your example so that all variables are internally defined.  
You may need to use volatile keywords or printf() to ensure that dead 
code isn't eliminated.

b) Make sure that you run the -internalize pass before running your 
analysis to make sure that all variables are marked with internal linkage.

3) As an FYI, we just ported DSA from LLVM 2.7 to LLVM mainline last 
week and haven't had time to shake it down and see how well it is 
working.  We'll be shaking down mainline DSA as we integrate it into an 
optional libLTO replacement for use with the SAFECode memory safety 
compiler.  DSA for LLVM 2.7 is still available in the release_27 branch 
of the poolalloc project.

-- John T.

On 8/10/11 8:37 AM, Ben Liblit wrote:
> In an earlier message 
> (http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-August/042298.html), 
> Andrew Lenharth suggested that EQTDDataStructures (from the poolalloc 
> project) may only try to resolve indirect function calls.  However, I 
> am now finding that the generated DSCallGraph over-approximates the 
> callees in a very simple indirect call.  Some over-approximation is 
> unavoidable, but this case seems simple enough that any reasonable 
> analysis ought to have been able to give the exact, correct answer.
>
> My example C++ source file, compiled to bitcode using clang from the 
> current llvm trunk, is as follows:
>
>     struct Base
>     {
>       virtual void virt();
>     };
>
>     void Base::virt()
>     {
>     }
>
>     void red();
>     void blue();
>
>     extern volatile int unknown;
>     extern Base *base;
>
>     void test()
>     {
>       base->virt();
>       (unknown ? red : blue)();
>     }
>
> Observe that we have a class with a virtual method, and two indirect 
> calls: one through a vtable to that virtual method, and another which 
> is a simple non-deterministic choice between two possible function 
> pointers.  I can understand an inexact (over-approximating) set of 
> callees for the vtable call, as that is rather complex at the bitcode 
> level.  However, I am very surprised to see an over-approximation at 
> the second indirect call, which is merely picking between two possible 
> values:
>
>     %11 = phi void ()* [ @red(), %8 ], [ @blue(), %9 ]   # demangled
>     call void %11()
>
> Yet my DSCallGraph tester code reports that this instruction might 
> call either red(), blue(), or Base::virt().  I am at a loss to explain 
> why.
>
> Equally strange: if I comment out the "base->virt();" call, then 
> DSCallGraph *does* give the expected answer that "(unknown ? red : 
> blue)()" could call either red() or blue() but not Base::virt(). 
> Somehow having that vtable-based call present forces the other call to 
> be unexpectedly conservative in its over-approximation.
>
> Source code for my DSCallGraph tester is attached below.  I'd be most 
> grateful for any clues you good people can provide.
>
> Thanks,
> Ben
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110810/3e360713/attachment.html>