[cfe-dev] Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Mon Mar 5 18:13:37 PST 2018



On 22/02/2018 5:19 PM, Artem Dergachev wrote:
> You're essentially in an area that's largely under construction at the 
> moment, and i'm likely to address your issue in like nearest future, 
> you can probably follow my progress on phabricator or in cfe-commits, 
> or i can try to ping you when this case is supported.

It's https://reviews.llvm.org/D44124 and https://reviews.llvm.org/D44131 
- after that return values should work. The former is about functions 
that can be "inlined" during analysis (i.e. we see what object is being 
returned), the latter is about functions that aren't inlined (but it 
should make it easier to track them).

>
> Being source-based, the analyzer needs to model every possible 
> language construct explicitly, and in C++ it means *a ton of stuff*. 
> We could analyze, like, llvm IR instead, and it'd be much easier to 
> implement correctly for us, but it'd be much harder to produce 
> user-friendly warnings, because the user does think in terms of source 
> code. And it'd be harder to make checkers for the same reason. Which 
> is why "symbolic execution" we're doing here is to compilation like 
> algebra to arithmetic: we need to do the same thing - support all 
> language constructs - but with a lot of "concrete" values replaced 
> with algebraic unknown values. Luckily, it is also easier because we 
> don't have to support everything perfectly: we can always say that we 
> don't know something, behave "conservatively", and the analysis would 
> be slightly less precise but this is fine if there's not too much of 
> these.
>
> Modeling temporaries is one of such cases - this is one of the 
> difficult parts of the language that we never did properly until now. 
> One does not simply construct into a temporary region because in this 
> case we're not even sure that it is indeed a temporary region in this 
> part of the analyzer, it cannot be understood by looking at the 
> construct-expression in the AST, which is something i'm working around 
> now by introducing "construction contexts". And even if we were sure 
> it's a temporary, we need to not only construct the object, but also 
> know how to properly extend its lifetime and destroy it when lifetime 
> ends, otherwise we'd encounter massive false positives (because it'd 
> not be "we don't know something", it'd be "we know something but it's 
> wrong"), which should also be part of the "context" in which 
> construction takes place. So for now this part simply behaves 
> conservatively.
>
> Your particular case will be relatively tricky when f() is inlined 
> during analysis. In this case the temporary will be constructed within 
> the function, but in order to properly work with this temporary we'd 
> need what the caller function is going to do with it - at least 
> according to the AST that we currently have this is what we have to 
> do. More info in 
> http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - 
> "== Return by value ==" part. CodeGen does it differently, treating a 
> lot of AST rvalues as lvalues under the hood, but i don't think we can 
> afford breaking this invariant for now, because, well, we're more 
> complicated and fragile.
>
> On 22/02/2018 4:46 PM, likan_999.student at sina.com wrote:
>> Thanks Artem, Aleksei for those precious information.
>>
>> Yes, I observed exactly the same thing about MaterializeTemporaryExpr 
>> and I actually spent quite sometime trace into the clang's codebase 
>> and reached to the exactly same place in the link: 
>> createTemporaryRegionIfNeeded. My workaround was inspired by one of 
>> the example checkers, that uses a checkPostStmt on 
>> MaterializeTemporaryExpr. Thanks for pointing out this.
>>
>> Aleksei, thanks for pointing out the getQualifiedNameAsString, I 
>> didn't know I could use that  . I personally prefers it because 
>> getParent()->getName() doesn't contain the namespace, but only the 
>> class name. So to make it safe, I need to go all the way up to 
>> traverse all parent decls, and getQualifiedNameAsString already does 
>> the thing for me.
>>
>> Thanks for taking your time to answer my question. I appreciate it.
>>
>> Now I observed some other weird thing, and I wonder if anyone has 
>> seen this before and knows any solution.
>>
>> As a reminder, the checker I am writing is for a particular class T 
>> in my company's code base and the checker is to ensure T.ok() == true 
>> before calling T.value().
>>
>> The above is actually a over-simplification to my task. The class T 
>> actually has a member, Let's say s_ (in type S) to keep track of the 
>> status, so T.ok() actually returns s_.ok(). And type S is used 
>> directly by a few other classes and macros which are used to creates 
>> T. So to make the check fully working, I also need to model the 
>> behaviors of S and a few other classes.
>>
>> Fortunately, T are template class and therefore all method 
>> implementations are visible to current TU. I was thinking, it is a 
>> whole lot easier if I can just model the behavior of S, and rely on 
>> inline evaluation in the static analyzer. In most cases, it works, 
>> except I found the following corner case (suppose T has a conversion 
>> constructor from X: T(X&&) and default move constructor T(T&&)):
>>
>> X f();
>> T t = f();
>>
>> This is essentially constructs a temporary T t2{f()} and then doing a 
>> move T t{std::move(t2)}. It looks fine if everything is inlined, 
>> however, the inline doesn't work because t2 is a temporary. I dived 
>> into the code base and what happened is
>>
>> 1. At the end of ExprEngine::getRegionForConstructedObject 
>> <https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#%2100183>, 
>> it is marked as improperly modeled target region:
>> CallOpts.IsCtorOrDtorWithImproperlyModeledTargetRegion 
>> <https://clang.llvm.org/doxygen/structclang_1_1ento_1_1ExprEngine_1_1EvalCallOptions.html#a339aa5a5b9aecfa8a844647b8d211eb0>= 
>> true;
>> 2. In ExprEngine::mayInlineCallKind 
>> <https://clang.llvm.org/doxygen/ExprEngineCallAndReturn_8cpp_source.html#%2100677>, 
>> it returns false if the flag is set:
>> // If we did not find the correct this-region, it would be pointless 
>> // to inline the constructor. Instead we will simply invalidate // 
>> the fake temporary target. if (CallOpts 
>> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp?l=618&ct=xref_jump_to_def&gsn=CallOpts&rcl=186553626>.IsCtorOrDtorWithImproperlyModeledTargetRegion 
>> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=62&ct=xref_jump_to_def&gsn=IsCtorOrDtorWithImproperlyModeledTargetRegion&rcl=186553626>) 
>> return CIP_DisallowedOnce 
>> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=623&ct=xref_jump_to_def&gsn=CIP_DisallowedOnce&rcl=186553626>;
>>
>> I am not an expert in C++ compiler. I just wonder why it can't 
>> allocates a temporary region for temporary variables? Having this 
>> special case means I need to model all T's constructors, which is 
>> really annoying. Does anyone knows some workarounds? Thanks.
>>
>>
>> ----- Original Message -----
>> From: Artem Dergachev <noqnoqneo at gmail.com>
>> To: Aleksei Sidorin <a.sidorin at samsung.com>, Li Kan 
>> <likan_999.student at sina.com>, cfe-dev at lists.llvm.org
>> Subject: Re: [cfe-dev] Static analyzer question on how to tell if a 
>> Decl is a particular c++ function, or a particular class
>> Date: 2018-02-16 02:26
>>
>>
>> Yeah, the traditional way to see if this is a certain method of a
>> certain class is to compare the name of the method
>> (CXXMethodDecl->getName()) and the class
>> (CXXMethodDecl->getParent()->getName()), as strings, to the desired
>> strings. getQualifiedNameAsString() works fine, but it's considered slow
>> for whatever reason (i've never observed it personally). ASTMatchers
>> would also work, even if a bit of an overkill. Don't forget to check the
>> method's argument number and types as well. We have an effort to make
>> this whole common idiom more simple via CallDescription, but it doesn't
>> work for C++ methods yet.
>> With C++, you'll need to be aware of the bug we're currently having with
>> object lifetime extension. If you're identifying the object by its
>> memory region (which is correct), you'd occasionally notice that the
>> region for a C++ temporary object may change along the analysis even if
>> no copy-construction occurs. It has been making our new C++ checkers
>> (most noticeably, the iterator checker) more difficult to write than
>> they needed to be. More details in
>> http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - i'm
>> hoping to do something about it soon.
>> On 15/02/2018 1:35 AM, Aleksei Sidorin via cfe-dev wrote:
>> > Hello Li,
>> >
>> > What you're doing is mostly fine. I can only add some potentially
>> > useful info inline.
>> >
>> > 14.02.2018 23:04, Li Kan via cfe-dev пишет:
>> >> Hi folks,
>> >>
>> >> If this is not the correct mailing list for this question, please let
>> >> me know.
>> > You're in the right place. Welcome :)
>> >
>> >> I am trying to write a static analyzer for code base of my current
>> >> job. The particular static analyzer I am writing involves a
>> >> particular class of my company's code base. Let's say it is T. I want
>> >> to write a checker to ensure T.ok() is called before T.value() is 
>> called.
>> >>
>> >> Static analyzer is perfect for this type of check as it requires path
>> >> sensitive checks. When I trying to write the checker, I basically
>> >> checks for pre-call and post-call. I want to tell if a CallEvent is
>> >> T.ok() and T.value(). Currently what I am doing is:
>> >> 1. From CallEvent, I cast to CXXMemberCall, and getOriginExpr(), then
>> >> call getMethodDecl().
>> > You can also try to do the following chain:
>> > dyn_cast_or_null<CXXMethodDecl>(CallEvent.getDecl()) to obtain
>> > CXXMethodDecl.
>> >
>> >> 2. From CXXMethodDecl, I first call getThisType(), then call
>> >> getAsCXXRecordDecl(), then getQualifiedNameAsString(), and compare
>> >> with qualified name of T, to make sure it is member of T.
>> > To obtain parent class declaration from CXXMethodDecl, you can use
>> > getParent() method. So, the pseudocode will look like
>> > "MethodDecl->getParent()->getQualifiedNameAsString()".
>> >
>> >> 3. From CXXMethodDecl, I call getNameAsString() to get the method
>> >> name, and compare them with "ok" and "value".
>> > This looks OK. But it can be useful to know that
>> > getQualifiedNameAsString() for CXXMethodDecl will also include parent
>> > name. So, you can just check if MethodDecl->getQualifiedNameAsString()
>> > is equal to "T::value" or "T::ok".
>> >
>> >> It works, but it looks complicated, and involves string comparison,
>> >> which I assume is slow. Is there a easier way, blessed by clang
>> >> static analyzer official teams, that tell if a MethodDecl, or
>> >> CXXRecordDecl, is the function or class I am interested in?
>> >>
>> >> The ideal way is, there is one-time call to lookup the MethodDecl for
>> >> T::ok() and T::value(), and CXXRecordDecl for T. Then ever since
>> >> then, I just need to compare the pointers.
>> > This is possible. You can just launch a simple matcher against AST:
>> > cxxMethodDecl(hasName("::T::value"))
>> > and get CXXMethodDecl you need and compare with it later.
>> >
>> >> Another way is, the first time I found it is T, T::ok() or
>> >> T::value(), I save the pointer of CXXRecordDecl or MethodDecl, and
>> >> use the pointers later. Is this the reliable (assume I use the
>> >> pointer canonical decl will not change) and blessed way to do it? If
>> >> it is, is my steps above the correct and simplest way to determine it
>> >> is T::ok or T::value? Is there some better and more reliable way?
>> > I think it is reliable enough: pointers to canonical declarations
>> > don't change in AST.
>> >>
>> >> Thanks.
>> >>
>> >>
>> >> _______________________________________________
>> >> cfe-dev mailing list
>> >> cfe-dev at lists.llvm.org
>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> >
>> >
>> > --
>> > Best regards,
>> > Aleksei Sidorin,
>> > SRR, Samsung Electronics
>> >
>> >
>> > _______________________________________________
>> > cfe-dev mailing list
>> > cfe-dev at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>




More information about the cfe-dev mailing list