[PATCH] D79704: [Analyzer] [NFC] Parameter Regions

Balogh, Ádám via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 20 07:02:20 PDT 2020


baloghadamsoftware marked 7 inline comments as done.
baloghadamsoftware added a comment.

In D79704#2042130 <https://reviews.llvm.org/D79704#2042130>, @martong wrote:

> Can we find the FunctionDecl if the call happens through a function pointer? Or in that case do we actually find the VarDecl of the function pointer?


Yes, if it has a declaration, then we can retrieve it from the stack frame.

> So, in this patch, we are trying to solve only that problem when the callee `FunctionDecl` does not have a definition?

Not even that. This  patch is just a non-functional change that prepares the ground for such improvements. The next patch intends to solve them problem where the function has declaration but no definition.

> This happens in case of top-level params and in case of function pointers, plus the special argument construction, am I right?

Top-level is a special case: there we do not have `Expr` either so we cannot handle it. For top-level functions the regions remain `VarRegions`. There is nothing special in parameters when analyzed top-level.

> Based on the answers to the above questions, possibly we are trying to solve two problems here. Could we split then this patch into two?
> 
> 1. callee FunctionDecl does not have a definition. This could assert on having an accessable Decl in ParamVarRegion. Here we could have the foundations of the next patch, i.e. we could have ParamVarRegion and NonParamVarRegion.
> 2. function pointers, plus the special argument construction: ParamVarRegion may not have a Decl and we'd remove the related lines from getCalleeAnalysisDeclContext. This obviously requires more tests. And more brainstorming to get the AnalysisDeclContext when there is no Decl available for the function in getCalleeAnalysisDeclContext.

That is exactly my suggestion, but not this, current patch, but the next one. The next patch is for functions without definition but declarations, and there could be another one in the future for functions without declaration. Of course this latter requires some extra code into `ParamRegion`s to handle cases where we do not have a `Decl`. I cannot write it now because I cannot make a test for it, not even a unit test.

In D79704#2046495 <https://reviews.llvm.org/D79704#2046495>, @martong wrote:

> Seems like many changes could be simplified or absolutely not needed in this patch if ParamRegion (or with a better name ParamVarRegion) was derived from VarRegion. Is there any difficulties in that derivation?


It is one of my suggestions, I am still waiting for @NoQ's opinion: let us remove the `Decl` from `DeclRegion` and make `getDecl()` pure virtual there. Also not implement it in `VarRegion`, but as @NoQ suggested, create something like `PlainVarRegion` or `NonParamVarRegion` where we store it, and `ParamVarRegion` where we retrieve it instead of storing it.



================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h:309
+  /// Get the lvalue for a parameter.
+  Loc getLValue(const Expr *Call, unsigned Index,
+                const LocationContext *LC) const;
----------------
martong wrote:
> Is this used anywhere in this patch?
> 
> And why isn't it a `CallExpr` (instead of `Expr`)?
The origin expression of a call may be a set of different kinds of expressions: `CallExpr`, `CXXConstructExpr`, `BlockExpr` or `ObjCMessageExpr`.


================
Comment at: clang/lib/StaticAnalyzer/Core/MemRegion.cpp:195
+    if (Index >= FD->param_size())
+      return nullptr;
+
----------------
martong wrote:
> In all other cases we `assert`, here we return with a nullptr. Why?
It was accidentally left here from an earlier experiment.


================
Comment at: clang/lib/StaticAnalyzer/Core/MemRegion.cpp:915
+            return cast<VarRegion>(I.getCapturedRegion());
+        } else if (const auto *PR = dyn_cast<ParamRegion>(OrigR)) {
+          if (PR->getDecl() == VD)
----------------
martong wrote:
> Both the `VarRegion` and `ParamRegion` cases here do the same.
> This suggest that maybe it would be better to have `VarRegion` as a base class for `ParamVarRegion`.
> This is aligned to what Artem suggested:
> > We can even call it VarRegion and have sub-classes be ParamVarRegion and NonParamVarRegion or something like that.
> But maybe `NonParamVarRegion` is not needed since that is actually the `VarRegion`.
Usually we do not inherit from concrete classes by changing a method implementation that already exist. The other reason is even stronger: `NonParamVarRegion` //must// store the `Decl` of the variable, `ParamVarRegion` //must not//.


================
Comment at: clang/lib/StaticAnalyzer/Core/MemRegion.cpp:1230
+
+const TypedValueRegion*
+MemRegionManager::getRegionForParam(const ParmVarDecl *PVD,
----------------
martong wrote:
> Why the return type is not `ParamVarRegion*`?
Because for top-level functions and captured parameters it still returns `VarRegion`.


================
Comment at: clang/lib/StaticAnalyzer/Core/MemRegion.cpp:191
+const ParmVarDecl *ParamRegion::getDecl() const {
+  const Decl *D = getStackFrame()->getDecl();
+
----------------
martong wrote:
> baloghadamsoftware wrote:
> > NoQ wrote:
> > > baloghadamsoftware wrote:
> > > > NoQ wrote:
> > > > > baloghadamsoftware wrote:
> > > > > > NoQ wrote:
> > > > > > > baloghadamsoftware wrote:
> > > > > > > > NoQ wrote:
> > > > > > > > > This doesn't work when the callee is unknown.
> > > > > > > > Please give me an example where the callee is unknown. As I wrote, originally I put a `getExpr()` method here as well and `getType()` fell back to it if it could not find the `Decl()`. However, it was never invoked on the whole test suite. (I put an `assert(false)` into it, and did not get a crash.
> > > > > > > > Please give me an example where the callee is unknown.
> > > > > > > 
> > > > > > > >>! In D79704#2034571, @NoQ wrote:
> > > > > > > >>>! In D79704#2032947, @Szelethus wrote:
> > > > > > > >> Could you give a specific code example? 
> > > > > > > > 
> > > > > > > > ```lang=c++
> > > > > > > > struct S {
> > > > > > > >   S() {
> > > > > > > >     this; // What region does 'this' point to...
> > > > > > > >   }
> > > > > > > > };
> > > > > > > > 
> > > > > > > > void foo(void (*bar)(S)) {
> > > > > > > >   bar(S()); // ...in this invocation?
> > > > > > > > }
> > > > > > > > ```
> > > > > > OK, but it still does not crash the analyzer, even if I enable creation of stack frames for all the callees, even for those without definition. What else should I do to enforce the crash (null pointer dereference)? Try to use `getParameterLocation()` in a unit test?
> > > > > Yes. I demonstrated how you can get the region without a decl but you should turn this into a test that actually calls one of the problematic functions. Like, `clang_analyzer_dump()` it or something.
> > > > Of  course I tried `clang_analyzer_explain()`, but neither `clang_analyzer_dump()` helps. I also tried a unit test now where I call `getParameterLocation()` explicitly. It turns out that parameter regions are never created for functions without `Decl` because of the first lines in `CallEvent::getCalleeAnalysisDeclContext()`. This function //needs// the `Decl` to retrieve the `AnalysisDeclContext` of the callee, which is needed to retrieve its stack  frame by `getCalleeStackFrame()`. Without stack frame we do not create `ParamRegion`. The other two functions creating `ParamRegion` (`CallEvent::addParameterValuesToBindings` and the newly created `MemRegionManager::getRegionForParam`) start from `ParmVarDecl` which always belongs to a `Decl`. So we can safely assume that currently all parameter regions have a `Decl`. Of course, this can be changed in the future, but I must not include dead code in a patch that cannot even be tested in the current phase. Even creation of callee stack frame for functions without //definition// is not part of this patch, but of the subsequent one.
> > > > neither `clang_analyzer_dump()` helps
> > > 
> > > So, does it dump a parameter region? Because it obviously should.
> > It dumps a temporary region as until now. So this patch does not change the behavior of the analyzer at this point.
> I think we should run this patch against a test suite of open source projects. That could be tmux, curl, redis, xerces, bitcoin, protobuf (those are that we use in CTU at least). We should not have any crashes on theses projects because they exercise widely used language constructs. Besides, they provide a relatively broad set of use of the C and C++ languages, so that's quite a good coverage. In case of any assertion we can use creduce to nail the actual problem.
As I mentioned, I already did it: D79704#2036067


================
Comment at: clang/unittests/StaticAnalyzer/ParamRegionTest.cpp:72
+TEST(ParamRegion, ParamRegionTest) {
+  EXPECT_TRUE(tooling::runToolOnCode(
+                  std::make_unique<ParamRegionTestAction>(),
----------------
martong wrote:
> I think the test code would be more valuable if we could use the `ASTMatcher` to match for a given Decl and then for that Decl we could find the corresponding Region. Then we should have different test cases for the different regions (with expectations formulated on those regions). Perhaps a Decl* -> Region mapping is needed in the test Fixture.
That was the original test, but the current one is more generic: checks for all parameters and checks whether we create the correct type of region for them.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79704/new/

https://reviews.llvm.org/D79704





More information about the cfe-commits mailing list