[PATCH] D69155: [analyzer] Fix off-by-one in operator call parameter binding.

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Oct 17 19:35:52 PDT 2019


NoQ created this revision.
NoQ added reviewers: dcoughlin, xazax.hun, a_sidorin, rnkovacs, Szelethus, baloghadamsoftware, Charusso.
Herald added subscribers: cfe-commits, dkrupp, donat.nagy, mikhail.ramalho, a.sidorin, szepet.
Herald added a project: clang.

This is the bug that I'll be using in the LLVM Developer's Meeting presentation as an example of how to debug things, so //spoiler alert!//

And let me explain it a bit slower than usual.

Here's the original false positive that we're fixing:

F10304929: bad.html <https://reviews.llvm.org/F10304929>

It has appeared by applying the Static Analyzer to LLVM itself. Namely, you should be able to reproduce it locally by applying the static analyzer to LLVM rL373385 <https://reviews.llvm.org/rL373385>; the static analyzer itself should be one commit behind //this// commit.

The warning tells us that variable `Reg` on line 824 is uninitialized:

F10304965: Screen Shot 2019-10-17 at 6.18.06 PM.png <https://reviews.llvm.org/F10304965>

This, however, is clearly a false positive, because the lambda invocation `isKilledReg(MO, Reg)` receives a non-const reference to `Reg` and initializes it on the current path:

F10304987: Screen Shot 2019-10-17 at 6.19.32 PM.png <https://reviews.llvm.org/F10304987>

The static analyzer somehow fails to realize that `Reg` is initialized, and even displays a clearly incorrect note (41) "Returning without writing to 'Reg'".

The variable `Reg` itself is declared in the caller function, `transferSpillOrRestoreInst`, and passed into the current function by reference, which is also named `Reg`:

F10305034: Screen Shot 2019-10-17 at 6.22.45 PM.png <https://reviews.llvm.org/F10305034>

F10305046: Screen Shot 2019-10-17 at 6.24.04 PM.png <https://reviews.llvm.org/F10305046>

In order to debug the issue, I dumped the trimmed exploded graph for the current analysis and used exploded-graph-rewriter to remove the unnecessary details:

  $ clang ... -analyze-function="(anonymous namespace)::LiveDebugValue::runOnMachineFunction(class llvm::MachineFunction &)" -analyzer-dump-egraph=graph.dot -trim-egraph
  
  $ exploded-graph-rewriter.py --topology graph.dot
  
  # I searched the topology dump and found out that our bug node is 52419.
  $ exploded-graph-rewriter.py --to 52419 --single-path --diff graph.dot

Here's the final graph dump:

F10305539: tmpruvE5U.html <https://reviews.llvm.org/F10305539>

I tried to find out which transition in the graph was incorrect.

First of all I confirmed that there is no binding in the Store for the `Reg` in the bug state. The //referece// `Reg` was there and was known to refer to the //variable// `Reg`, but the variable itself is indeed nowhere to be found, as if it was never written to:

F10305142: Screen Shot 2019-10-17 at 6.35.33 PM.png <https://reviews.llvm.org/F10305142>

This gave me confidence that the checker is not to be blamed: the variable was "known" to be uninitialized, therefore the checker was right to report it, and we need to find out why was the variable incorrectly known to be uninitialized.

Therefore, logically, the next question was why did the assignment on line 812, before step 41, was not recorded in the Store. Here's the ExplodedNode in which the assignment operator was evaluated:

F10305210: Screen Shot 2019-10-17 at 6.42.28 PM.png <https://reviews.llvm.org/F10305210>

Here we can see that the assignment was in fact recorded in the Store, but in a wrong memory region. In our notation, `SymRegion{reg_$3519<unsigned int & Reg>}` is an //unknown// region of memory that is the pointee of the reference `Reg` points to.

However, given that `Reg` is a parameter in a function call into which we dived during analysis, this symbolic value should not have existed to begin with, because we have concrete knowledge about what `Reg` points to. Instead, we should have had a binding in the Store from `Reg` (the memory region of the reference parameter) to `&Reg` (the pointer to the variable `Reg`), and loading from `Reg` (as part of evaluating which location should be written into) should have yielded `&Reg` rather than `&SymRegion{reg_$3519<unsigned int & Reg>}`.

In order to confirm this educated guess, I compared the evaluation that the Static Analyzer did for call sites of `isLocationSpill()`:

F10305268: Screen Shot 2019-10-17 at 6.56.29 PM.png <https://reviews.llvm.org/F10305268>

and `isKilledReg()`:

F10305274: Screen Shot 2019-10-17 at 6.57.33 PM.png <https://reviews.llvm.org/F10305274>

And, indeed, the binding `Reg : &Reg` is supposed to be there in both cases, but it is missing in case of `isKilledReg()`. Interestingly, the binding for the other parameter, `MO`, is also missing in case of `isKilledReg`.

By scrolling up a few nodes we can see that the bindings for //expressions// `Reg` and `MO` are present in the Environment when the call is being entered:

F10305300: Screen Shot 2019-10-17 at 7.01.05 PM.png <https://reviews.llvm.org/F10305300>

In other words, the translation of parameter values from the Environment to the Store was not performed correctly during call enter, even though the necessary information seems to have been present in the program state. Therefore this transition was the primary suspect.

Then I attached debugger to the transition. Even though there are many call enters that were evaluated throughout this analysis, I can attach the debugger precisely with the help of the stable IDs:

  (lldb) br s -n inlineCall -c 'Pred->Id = 52357'

Here `inlineCall` is a method of `ExprEngine` that is responsible for "diving" into calls during analysis. Because the `ExprEngine` class is responsible for evaluating the effects of specific statements (and other CFG elements) over the program state, any transition can be debugged by setting a similar breakpoint on a method of this class that looks relevant.

Step-by-step debugging led me into function `addParameterValuesToBindings()` in `CallEvent.cpp`, which is indeed the function responsible for the translation of parameter values from the Environment to the Store:

  528     SVal ArgVal = Call.getArgSVal(Idx);
  529     if (!ArgVal.isUnknown()) {
  530       Loc ParamLoc = SVB.makeLoc(MRMgr.getVarRegion(ParamDecl, CalleeCtx));
  531       Bindings.push_back(std::make_pair(ParamLoc, ArgVal));
  532     }

This transition seems straightforward, however there is a caveat - in C++ the transition is skipped for parameters that were constructed by invoking a constructor:

  522       if (Call.isArgumentConstructedDirectly(Idx))
  523         continue;

Indeed, if a constructor was invoked, then the Store bindings should already be there by the time the call is entered, so there is no need to manually bind them.

We do in fact have an argument that is constructed directly in this example, which is `MO` - a C++ object of class `MachineOperand` that gets passed into `isKilledReg` by value:

F10305470: Screen Shot 2019-10-17 at 7.24.42 PM.png <https://reviews.llvm.org/F10305470>

By the way, you can learn more about how constructors are modeled in a talk by @george.karpenkov and me in LLVM DevMtg 2018: https://www.youtube.com/watch?v=4n3l-ZcDJNY

The problem, however, was that `isArgumentConstructedDirectly(Idx)` returned `false` for `MO` but `true` for `Reg`! It should clearly be the other way round.

This was happening due to an off-by-one error caused by how in Clang AST for historical numbering of parameters in member operator declarations is different by 1 from the numbering of arguments in member operator call expressions: one of them counts "this", the other doesn't. I was already aware about this problem, which is why in D49443 <https://reviews.llvm.org/D49443> I added convenient accessor methods for dealing with these mismatched offsets. However, I forgot to use them in this tiny code snippet. Hence the patch.

Thanks for reading!


Repository:
  rC Clang

https://reviews.llvm.org/D69155

Files:
  clang/lib/StaticAnalyzer/Core/CallEvent.cpp
  clang/test/Analysis/temporaries.cpp


Index: clang/test/Analysis/temporaries.cpp
===================================================================
--- clang/test/Analysis/temporaries.cpp
+++ clang/test/Analysis/temporaries.cpp
@@ -1231,3 +1231,19 @@
   return coin ? S() : foo(); // no-warning
 }
 } // namespace return_from_top_frame
+
+#if __cplusplus >= 201103L
+namespace arguments_of_operators {
+struct S {
+  S() {}
+  S(const S &) {}
+};
+
+void test() {
+  int x = 0;
+  auto foo = [](S s, int &y) { y = 1; };
+  foo(S(), x);
+  clang_analyzer_eval(x == 1); // expected-warning{{TRUE}}
+}
+} // namespace arguments_of_operators
+#endif // __cplusplus >= 201103L
Index: clang/lib/StaticAnalyzer/Core/CallEvent.cpp
===================================================================
--- clang/lib/StaticAnalyzer/Core/CallEvent.cpp
+++ clang/lib/StaticAnalyzer/Core/CallEvent.cpp
@@ -519,7 +519,7 @@
 
     // TODO: Support allocator calls.
     if (Call.getKind() != CE_CXXAllocator)
-      if (Call.isArgumentConstructedDirectly(Idx))
+      if (Call.isArgumentConstructedDirectly(Call.getASTArgumentIndex(Idx)))
         continue;
 
     // TODO: Allocators should receive the correct size and possibly alignment,


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D69155.225555.patch
Type: text/x-patch
Size: 1187 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20191018/330fc7e7/attachment-0001.bin>


More information about the cfe-commits mailing list