[cfe-dev] Checker - taint analysis with virtual functions (runtime polymorphism handling)

Jordan Rose jordan_rose at apple.com
Wed Sep 19 14:27:44 PDT 2012

Hi, Byoungyoung. I'm not sure I quite understand your objection. If you have this:

> A *a = new B();

where 'a' is a global variable, we have no guarantee that it is still the same 'B' object when we actually analyze any specific function. After all, some other function could have an 'a = nullptr;' pretty much anywhere. Without cross-translation-unit, whole-program analysis, we can't make a definite guarantee that 'a' points to the same object for the entire lifetime of the program.
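
To make that concrete, here is a minimal sketch. The classes and the 'retarget' and 'use' functions are made-up names for illustration, not taken from your test case, and 'retarget' is assumed to stand in for code that could live in another translation unit:

```cpp
struct A {
  virtual ~A() = default;
  virtual int id() const { return 1; }
};

struct B : A {
  int id() const override { return 2; }
};

struct C : A {
  int id() const override { return 3; }
};

// Non-const global pointer, initialized before main() runs.
A *a = new B();

// Hypothetical function; in a real program it could be defined in a
// different translation unit that the analyzer never sees.
void retarget() {
  delete a;
  a = new C();
}

// Looking at this function alone, nothing proves which override of
// id() is reached: it depends on whether retarget() ran earlier.
int use() { return a->id(); }
```

Calling 'use()' before 'retarget()' reaches B::id, and calling it afterwards reaches C::id, so a per-function analysis of 'use' cannot assume either one.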

We currently don't model this "correctly" even if you use a constant pointer:

> A * const a = new B();

because we do per-function analysis, and by the time we've entered any function body, 'a' has already been initialized. We ought to treat it the same as this:

> B a;

which is probably what you should be doing anyway if you really have a global object that lasts for the lifetime of the program. Using 'new' may not be safe during static initialization (i.e. before 'main'), and if the constructor takes (tainted?) arguments, then you're potentially subject to the static initialization order fiasco.
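
As a rough illustration of the 'B a;' form (again, 'A', 'B', 'tag', 'which', and 'through_base' are hypothetical names chosen for this sketch), the dynamic type of the global is fixed by its declaration, so there is nothing left to guess:

```cpp
struct A {
  virtual ~A() = default;
  virtual char tag() const { return 'A'; }
};

struct B : A {
  char tag() const override { return 'B'; }
};

// A global object rather than a global pointer: its dynamic type is
// B for as long as the object exists, and it cannot be re-pointed.
B a;

// The call target is known from the declared type of 'a' alone.
char which() { return a.tag(); }

// Even through a base-class reference, passing 'a' dispatches to B::tag.
char through_base(const A &obj) { return obj.tag(); }
```

This only sketches the language-level difference; it says nothing about what the analyzer models today.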

Currently, the only sort of global values we handle are constant integers. The general issue is tracked internally at Apple by <rdar://problem/11720796> and a similar specific case is on our public bug tracker at http://llvm.org/bugs/show_bug.cgi?id=13673.

Our "todo list" of sorts would be our bug-tracking system at http://llvm.org/bugs. If you have an entire self-contained concrete example that you believe should work, please file a bug.

Sorry for the negative answers,

P.S. Our handling of 'new' in particular is fairly weak right now; there's some infrastructure work discussed on the bug tracker at http://llvm.org/bugs/show_bug.cgi?id=12014 that ought to result in us correctly tracking the types of objects that come from 'new'.

On Sep 19, 2012, at 12:09 , Byoungyoung Lee <lifeasageek at gmail.com> wrote:

> I really appreciate your help! With the IPA.txt documentation, I now
> understand how Clang IPA works. The reason nothing was tainted in my
> test program before was that I hadn't specified the control flow
> between the class allocation routines and the tainting routines. After
> adding explicit control flow between these routines, it works great!
> But it still seems that the current IPA does not handle global
> constructors, i.e. a constructor invoked to initialize a global
> variable (A *a = new B(); // "a" is a global variable). Since there is
> no explicit control flow into these constructors (I think they are
> compiled into the .ctors section for ELF binaries and invoked by the
> loader), I guess the current IPA analyzer cannot handle this case. Is
> this on a TODO list, or could you point me to documentation that
> mentions it?
> Thanks,
> Byoungyoung
> On Tue, Sep 18, 2012 at 9:36 PM, Anna Zaks <ganna at apple.com> wrote:
>> On Sep 18, 2012, at 5:15 PM, Byoungyoung Lee <lifeasageek at gmail.com> wrote:
>>> Hello Anna,
>>> Thanks for your answer, Anna. I just wanted to make sure that there is
>>> currently no support for handling virtual function calls in the taint
>>> analysis.
>>> I'm thinking of implementing a taint analysis that supports virtual
>>> function calls. When the Checker sees a statement like
>>> A *a = new B();  // A is a parent class of B
>>> I would re-assign a's clang::Decl to class B so that subsequent
>>> virtual function calls would be resolved using class B's declarations.
>>> Would this approach work? Or could you suggest a better way to handle
>>> this issue?
>> The logic for reasoning about polymorphism should be done inside the analyzer core engine as it is not specific to the taint checker. Essentially, when 'a->foo()' is called, we will check if 'a' points to the object 'B' at runtime. If yes, we would inline the call for B's implementation of 'foo'. Jordan has started on adding the reasoning about polymorphism to the analyzer. I am not sure why this particular case is not handled yet, but there can be a lot of edge cases one might need to handle.
>> You can read clang/docs/analyzer/IPA.txt for more information on how we deal with interprocedural analysis (including polymorphism and dynamic dispatch). This is one of the more complex areas of the analyzer.
>> Cheers,
>> Anna.
>>> Thanks,
>>> Byoungyoung
>>> On Tue, Sep 18, 2012 at 4:35 PM, Anna Zaks <ganna at apple.com> wrote:
>>>> Hi Byoungyoung,
>>>> Taint analysis relies on the general clang infrastructure for propagating the taint through/into virtual (and regular) calls. Currently, the static analyzer core is not smart enough to de-virtualize the call in this example. However, we are actively working on better IPA support for C++.
>>>> That said, we would only resolve the function if the analyzer has enough information to de-virtualize. By default (when not enough info is available), the analyzer core would treat the call as opaque. This is the desired behavior. Even for the taint checker, you might only want to propagate the taint into the specific function if you are sure that there is a path on which that would occur.
>>>> Cheers,
>>>> Anna.
>>>> On Sep 18, 2012, at 12:47 PM, lifeasageek <lifeasageek at gmail.com> wrote:
>>>>> Hello,
>>>>> I'm playing with Checker to implement taint analysis for C++
>>>>> applications. Referring to GenericTaintChecker.cpp, I've implemented a
>>>>> simple taint analysis, but it seems that tainted symbols are not
>>>>> propagated through virtual function calls and that Checker cannot
>>>>> handle C++ runtime polymorphism.
>>>>> From my understanding, when Checker sees a virtual function call
>>>>> expression, it only knows the declared class type, not the actually
>>>>> allocated class type. In the example code below, when Checker sees
>>>>> g_table->append(), it only knows that append() is a member function
>>>>> of ShapeTable, not of ShapeTableArray.
>>>>> Could you tell me how to handle this C++ runtime polymorphism issue?
>>>>> Can I force it to visit all the possible (or concrete) virtual
>>>>> functions when Checker sees a virtual function call?
>>>>> ------------------------------------------------------
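>>>>> #include <cstdlib>  // malloc
>>>>> // MAX_ENTRIES and NOT_AVAILABLE were not defined in the original post;
>>>>> // the values below are placeholders.
>>>>> #define MAX_ENTRIES 16
>>>>> #define NOT_AVAILABLE (-1)
>>>>> class ShapeTable {
>>>>> public:
>>>>>  virtual void append(int value) = 0;
>>>>>  virtual int search(int value) = 0;
>>>>>  ShapeTable() {}
>>>>> };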
>>>>> class ShapeTableArray :public ShapeTable {
>>>>> public:
>>>>>  ShapeTableArray() : curPosition(0) {
>>>>>      entries = (int*)malloc(sizeof(int) * MAX_ENTRIES);
>>>>>  }
>>>>>  void append(int value) {
>>>>>      entries[curPosition++] = value;
>>>>>      return;
>>>>>  }
>>>>>  int search(int value) {
>>>>>      for (int i=0; i<MAX_ENTRIES; i++) {
>>>>>          if (entries[i] == value)
>>>>>              return i;
>>>>>      }
>>>>>      return NOT_AVAILABLE;
>>>>>  }
>>>>> private:
>>>>>  int *entries;
>>>>>  int curPosition;
>>>>> };
>>>>> int main(void){
>>>>>  ShapeTable *g_table = new ShapeTableArray();
>>>>>  g_table->append(0x1234);
>>>>>  g_table->search(0x1234);
>>>>> }
>>>>> Thanks,
>>>>> Byoungyoung
>>>>> --
>>>>> View this message in context: http://clang-developers.42468.n3.nabble.com/Checker-taint-analysis-with-virtual-functions-runtime-polymorphism-handling-tp4026757.html
>>>>> Sent from the Clang Developers mailing list archive at Nabble.com.
>>>>> _______________________________________________
>>>>> cfe-dev mailing list
>>>>> cfe-dev at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
