[cfe-dev] Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class
via cfe-dev
cfe-dev at lists.llvm.org
Wed Feb 28 13:28:38 PST 2018
Thanks Artem for clarification. It is very clear and now I understand the situation. Sorry to bother again, but now I have yet another question.
I am now trying to make the checker work for pointer escape. Since I track the status of objects using MemRegion, so I am using checkRegionChanges. I've observed something weird.
Suppose I have free functions whose definitions are not seen by current TU: void f(const S*); void g(S*);
1. Suppose I have
S s;
f(&s);
Then in checkRegionChanges, s's region will appear in explicit_regions but not regions.
2. Suppose I have
S s;
g(&s);
Then in checkRegionChanges, t's region will appear in both explicit_regions and regions.
Based on the above observations, if I want to remove s from GDM in case 2 but not 1, then I need to iterate over all entries in GDM and remove the ones that are sub-regions of any one in `region` variable, not `explicit_regions` variable. However, this doesn't work for the following case
3. If I have struct T { S s; D d; }; and U has some non-const member void D::h(); for the following call:
T t;
t.d.h();
Then in checkRegionChanges, t.d is in explicit_regions and t is in regions. So t.s will be removed from GDM if I check 'regions' variable, which is incorrect.
My question is, is case 1 a bug or a feature that s appears in explicit_regions? If it is by design, how what is the blessed way to distinguish the above cases? Thanks.
BTW: the code that causes the behavior is
https://clang.llvm.org/doxygen/RegionStore_8cpp_source.html#l01256
if (const MemRegion *R = V.getAsRegion()) {
if (TopLevelRegions)
TopLevelRegions->push_back(R);
W.AddToWorkList(R);
continue;
}
All regions are added to TopLevelRegions, without checking RegionAndSymbolInvalidationTraits::TK_PreserveContents.
----- Original Message -----
From: Artem Dergachev <noqnoqneo at gmail.com>
To: likan_999.student at sina.com, Aleksei Sidorin <a.sidorin at samsung.com>, cfe-dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class
Date: 2018-02-23 09:20
You're essentially in an area that's largely under construction at the
moment, and i'm likely to address your issue in like nearest future, you
can probably follow my progress on phabricator or in cfe-commits, or i
can try to ping you when this case is supported.
Being source-based, the analyzer needs to model every possible language
construct explicitly, and in C++ it means *a ton of stuff*. We could
analyze, like, llvm IR instead, and it'd be much easier to implement
correctly for us, but it'd be much harder to produce user-friendly
warnings, because the user does think in terms of source code. And it'd
be harder to make checkers for the same reason. Which is why "symbolic
execution" we're doing here is to compilation like algebra to
arithmetic: we need to do the same thing - support all language
constructs - but with a lot of "concrete" values replaced with algebraic
unknown values. Luckily, it is also easier because we don't have to
support everything perfectly: we can always say that we don't know
something, behave "conservatively", and the analysis would be slightly
less precise but this is fine if there's not too much of these.
Modeling temporaries is one of such cases - this is one of the difficult
parts of the language that we never did properly until now. One does not
simply construct into a temporary region because in this case we're not
even sure that it is indeed a temporary region in this part of the
analyzer, it cannot be understood by looking at the construct-expression
in the AST, which is something i'm working around now by introducing
"construction contexts". And even if we were sure it's a temporary, we
need to not only construct the object, but also know how to properly
extend its lifetime and destroy it when lifetime ends, otherwise we'd
encounter massive false positives (because it'd not be "we don't know
something", it'd be "we know something but it's wrong"), which should
also be part of the "context" in which construction takes place. So for
now this part simply behaves conservatively.
Your particular case will be relatively tricky when f() is inlined
during analysis. In this case the temporary will be constructed within
the function, but in order to properly work with this temporary we'd
need what the caller function is going to do with it - at least
according to the AST that we currently have this is what we have to do.
More info in
http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - "==
Return by value ==" part. CodeGen does it differently, treating a lot of
AST rvalues as lvalues under the hood, but i don't think we can afford
breaking this invariant for now, because, well, we're more complicated
and fragile.
On 22/02/2018 4:46 PM, likan_999.student at sina.com wrote:
> Thanks Artem, Aleksei for those precious information.
>
> Yes, I observed exactly the same thing about MaterializeTemporaryExpr
> and I actually spent quite sometime trace into the clang's codebase
> and reached to the exactly same place in the link:
> createTemporaryRegionIfNeeded. My workaround was inspired by one of
> the example checkers, that uses a checkPostStmt on
> MaterializeTemporaryExpr. Thanks for pointing out this.
>
> Aleksei, thanks for pointing out the getQualifiedNameAsString, I
> didn't know I could use that . I personally prefers it because
> getParent()->getName() doesn't contain the namespace, but only the
> class name. So to make it safe, I need to go all the way up to
> traverse all parent decls, and getQualifiedNameAsString already does
> the thing for me.
>
> Thanks for taking your time to answer my question. I appreciate it.
>
> Now I observed some other weird thing, and I wonder if anyone has seen
> this before and knows any solution.
>
> As a reminder, the checker I am writing is for a particular class T in
> my company's code base and the checker is to ensure T.ok() == true
> before calling T.value().
>
> The above is actually a over-simplification to my task. The class T
> actually has a member, Let's say s_ (in type S) to keep track of the
> status, so T.ok() actually returns s_.ok(). And type S is used
> directly by a few other classes and macros which are used to creates
> T. So to make the check fully working, I also need to model the
> behaviors of S and a few other classes.
>
> Fortunately, T are template class and therefore all method
> implementations are visible to current TU. I was thinking, it is a
> whole lot easier if I can just model the behavior of S, and rely on
> inline evaluation in the static analyzer. In most cases, it works,
> except I found the following corner case (suppose T has a conversion
> constructor from X: T(X&&) and default move constructor T(T&&)):
>
> X f();
> T t = f();
>
> This is essentially constructs a temporary T t2{f()} and then doing a
> move T t{std::move(t2)}. It looks fine if everything is inlined,
> however, the inline doesn't work because t2 is a temporary. I dived
> into the code base and what happened is
>
> 1. At the end of ExprEngine::getRegionForConstructedObject
> <https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#%2100183>,
> it is marked as improperly modeled target region:
> CallOpts.IsCtorOrDtorWithImproperlyModeledTargetRegion
> <https://clang.llvm.org/doxygen/structclang_1_1ento_1_1ExprEngine_1_1EvalCallOptions.html#a339aa5a5b9aecfa8a844647b8d211eb0>=
> true;
> 2. In ExprEngine::mayInlineCallKind
> <https://clang.llvm.org/doxygen/ExprEngineCallAndReturn_8cpp_source.html#%2100677>,
> it returns false if the flag is set:
> // If we did not find the correct this-region, it would be pointless
> // to inline the constructor. Instead we will simply invalidate // the
> fake temporary target. if (CallOpts
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp?l=618&ct=xref_jump_to_def&gsn=CallOpts&rcl=186553626>.IsCtorOrDtorWithImproperlyModeledTargetRegion
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=62&ct=xref_jump_to_def&gsn=IsCtorOrDtorWithImproperlyModeledTargetRegion&rcl=186553626>)
> return CIP_DisallowedOnce
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=623&ct=xref_jump_to_def&gsn=CIP_DisallowedOnce&rcl=186553626>;
>
> I am not an expert in C++ compiler. I just wonder why it can't
> allocates a temporary region for temporary variables? Having this
> special case means I need to model all T's constructors, which is
> really annoying. Does anyone knows some workarounds? Thanks.
>
>
> ----- Original Message -----
> From: Artem Dergachev <noqnoqneo at gmail.com>
> To: Aleksei Sidorin <a.sidorin at samsung.com>, Li Kan
> <likan_999.student at sina.com>, cfe-dev at lists.llvm.org
> Subject: Re: [cfe-dev] Static analyzer question on how to tell if a
> Decl is a particular c++ function, or a particular class
> Date: 2018-02-16 02:26
>
>
> Yeah, the traditional way to see if this is a certain method of a
> certain class is to compare the name of the method
> (CXXMethodDecl->getName()) and the class
> (CXXMethodDecl->getParent()->getName()), as strings, to the desired
> strings. getQualifiedNameAsString() works fine, but it's considered slow
> for whatever reason (i've never observed it personally). ASTMatchers
> would also work, even if a bit of an overkill. Don't forget to check the
> method's argument number and types as well. We have an effort to make
> this whole common idiom more simple via CallDescription, but it doesn't
> work for C++ methods yet.
> With C++, you'll need to be aware of the bug we're currently having with
> object lifetime extension. If you're identifying the object by its
> memory region (which is correct), you'd occasionally notice that the
> region for a C++ temporary object may change along the analysis even if
> no copy-construction occurs. It has been making our new C++ checkers
> (most noticeably, the iterator checker) more difficult to write than
> they needed to be. More details in
> http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - i'm
> hoping to do something about it soon.
> On 15/02/2018 1:35 AM, Aleksei Sidorin via cfe-dev wrote:
> > Hello Li,
> >
> > What you're doing is mostly fine. I can only add some potentially
> > useful info inline.
> >
> > 14.02.2018 23:04, Li Kan via cfe-dev пишет:
> >> Hi folks,
> >>
> >> If this is not the correct mailing list for this question, please let
> >> me know.
> > You're in the right place. Welcome :)
> >
> >> I am trying to write a static analyzer for code base of my current
> >> job. The particular static analyzer I am writing involves a
> >> particular class of my company's code base. Let's say it is T. I want
> >> to write a checker to ensure T.ok() is called before T.value() is
> called.
> >>
> >> Static analyzer is perfect for this type of check as it requires path
> >> sensitive checks. When I trying to write the checker, I basically
> >> checks for pre-call and post-call. I want to tell if a CallEvent is
> >> T.ok() and T.value(). Currently what I am doing is:
> >> 1. From CallEvent, I cast to CXXMemberCall, and getOriginExpr(), then
> >> call getMethodDecl().
> > You can also try to do the following chain:
> > dyn_cast_or_null<CXXMethodDecl>(CallEvent.getDecl()) to obtain
> > CXXMethodDecl.
> >
> >> 2. From CXXMethodDecl, I first call getThisType(), then call
> >> getAsCXXRecordDecl(), then getQualifiedNameAsString(), and compare
> >> with qualified name of T, to make sure it is member of T.
> > To obtain parent class declaration from CXXMethodDecl, you can use
> > getParent() method. So, the pseudocode will look like
> > "MethodDecl->getParent()->getQualifiedNameAsString()".
> >
> >> 3. From CXXMethodDecl, I call getNameAsString() to get the method
> >> name, and compare them with "ok" and "value".
> > This looks OK. But it can be useful to know that
> > getQualifiedNameAsString() for CXXMethodDecl will also include parent
> > name. So, you can just check if MethodDecl->getQualifiedNameAsString()
> > is equal to "T::value" or "T::ok".
> >
> >> It works, but it looks complicated, and involves string comparison,
> >> which I assume is slow. Is there a easier way, blessed by clang
> >> static analyzer official teams, that tell if a MethodDecl, or
> >> CXXRecordDecl, is the function or class I am interested in?
> >>
> >> The ideal way is, there is one-time call to lookup the MethodDecl for
> >> T::ok() and T::value(), and CXXRecordDecl for T. Then ever since
> >> then, I just need to compare the pointers.
> > This is possible. You can just launch a simple matcher against AST:
> > cxxMethodDecl(hasName("::T::value"))
> > and get CXXMethodDecl you need and compare with it later.
> >
> >> Another way is, the first time I found it is T, T::ok() or
> >> T::value(), I save the pointer of CXXRecordDecl or MethodDecl, and
> >> use the pointers later. Is this the reliable (assume I use the
> >> pointer canonical decl will not change) and blessed way to do it? If
> >> it is, is my steps above the correct and simplest way to determine it
> >> is T::ok or T::value? Is there some better and more reliable way?
> > I think it is reliable enough: pointers to canonical declarations
> > don't change in AST.
> >>
> >> Thanks.
> >>
> >>
> >> _______________________________________________
> >> cfe-dev mailing list
> >> cfe-dev at lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> >
> >
> > --
> > Best regards,
> > Aleksei Sidorin,
> > SRR, Samsung Electronics
> >
> >
> > _______________________________________________
> > cfe-dev mailing list
> > cfe-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
More information about the cfe-dev
mailing list