<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 5/1/18 2:09 PM, Timothy J. Wood via
cfe-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:DF5A5683-416A-4B4B-BFE8-A5025FF4D958@omnigroup.com">
<pre class="moz-quote-pre" wrap="">While working on my outError checker, I'm running into a bunch of false positives that seem to boil down to the same issue.
I *think* my core question is "how do I store state about symbols going out of scope such that I can reliably look it up later", but I'm not 100% sure. Longer version:
Attached below is a diff with a minified version of my checker that just shows this issue, which can be applied to clang on the `release_60` branch. The test case is:
I've annotated a test run of the input and output to make it easier to match up locations, when running like so, :
% ./bin/clang -cc1 -internal-isystem ./lib/clang/6.0.1/include -nostdsysteminc -analyze -analyzer-checker=debug.ViewExplodedGraph,osx.cocoa.NSErrorWrite -fblocks -analyzer-store=region -Wno-objc-root-class ../llvm/tools/clang/test/Analysis/missing-error.m
Test input:
[B] static BOOL _fillErrorWithReturn(NSError **_outError)
{
if (_outError) {
*_outError = [NSError make];
}
[C] return NO;
}
[A] - (BOOL)fillErrorHelperFunctionWithReturn:(NSError **)outError;
{
[D] return _fillErrorWithReturn(outError);
}
Log file:
[A] checkBeginFunction [0x7ff6038a13f0]
Decl 0x7ff60386e130
at /Users/bungi/Source/LLVM/llvm/tools/clang/test/Analysis/missing-error.m:26:1
OutError ParmVarDecl = 0x7ff60386e1b8
StackSlotVal &outError
OutErrorVal &SymRegion{reg_$1<NSError ** outError>}
[B] checkBeginFunction [0x7ff6038a1ea8]
Decl 0x7ff60386e020
at /Users/bungi/Source/LLVM/llvm/tools/clang/test/Analysis/missing-error.m:18:13
OutError ParmVarDecl = 0x7ff60386df20
StackSlotVal &_outError
OutErrorVal &SymRegion{reg_$1<NSError ** outError>}
[C] checkPreStmt<ReturnStmt> [0x7ff604800b48]
Stmt 0x7ff603898078
at /Users/bungi/Source/LLVM/llvm/tools/clang/test/Analysis/missing-error.m:23:2
OutError ParmVarDecl = 0x7ff60386df20
StackSlotVal &_outError
OutErrorVal &SymRegion{reg_$2<NSError ** _outError>}
OutErrorVal is not constrained
[D] checkPreStmt<ReturnStmt> [0x7ff604801188]
Stmt 0x7ff60386e438
at /Users/bungi/Source/LLVM/llvm/tools/clang/test/Analysis/missing-error.m:28:2
OutError ParmVarDecl = 0x7ff60386e1b8
StackSlotVal &outError
OutErrorVal &SymRegion{reg_$1<NSError ** outError>}
OutErrorVal is not constrained
[C] checkPreStmt<ReturnStmt> [0x7ff604802c18]
Stmt 0x7ff603898078
at /Users/bungi/Source/LLVM/llvm/tools/clang/test/Analysis/missing-error.m:23:2
OutError ParmVarDecl = 0x7ff60386df20
StackSlotVal &_outError
OutErrorVal &SymRegion{reg_$2<NSError ** _outError>}
OutErrorVal is not constrained
[D] checkPreStmt<ReturnStmt> [0x7ff604802fd0]
Stmt 0x7ff60386e438
at /Users/bungi/Source/LLVM/llvm/tools/clang/test/Analysis/missing-error.m:28:2
OutError ParmVarDecl = 0x7ff60386e1b8
StackSlotVal &outError
OutErrorVal &SymRegion{reg_$1<NSError ** outError>}
OutErrorVal is not constrained
At [B], the beginning of the static function that gets inlined, looking up the _outError parameter has a stack slot value of "StackSlotVal &_outError", and when I get the contents of that location, I get "&SymRegion{reg_$1<NSError ** outError>}". Note the change in name -- this is the variable from the caller, inlined in to the static function. That seems OK.</pre>
</blockquote>
<br>
Yep, that's indeed OK. The underscored variable _outError contains a
value that is equal to the original value of non-underscored
outError. There never was a separate symbol for representing the
value of _outError - the analyzer just knows that its value is
represented by the old symbol whenever it matters.<br>
<br>
<blockquote type="cite"
cite="mid:DF5A5683-416A-4B4B-BFE8-A5025FF4D958@omnigroup.com">
<pre class="moz-quote-pre" wrap="">Later, in the two [C] branches, when about to return from the inlined static function, trying to get the SVal for _outError, I get a new “$2” name, "&SymRegion{reg_$2<NSError ** _outError>}”, and even though there should be two branches where _outError should be constrained alternately to null and not-null, the OutErrorVal returned is marked as non-constrained on both paths.</pre>
</blockquote>
<br>
Yep, that's because the underscored _outError is dead as a variable,
i.e. no longer used in the program, so we can forget about it. The
checker is not prevented from reincarnating the variable (because
currently checkers are omnipotent), but that's not the right thing
to do. This isn't "renaming" - the underscored reg_$2 should have
never been created and is essentially created artificially by the
checker due to API misuse that we can't easily catch (though we're
working on it).<br>
<br>
Now note that even though _outError is dead, *symbol*
&SymRegion{reg_$1<NSError ** outError>} is alive, because
it is still stored in another variable, namely the original
non-underscored outError. Therefore the constraint on the symbol is
also preserved.<br>
<br>
Additionally, argument values are removed from the Store (memory
model) because parameter variables become dead, but they are still
present in the Environment (syntactic expression values) because the
call-expression is not yet fully evaluated, and therefore its
sub-expressions that represent argument values are still considered
to be important to keep.<br>
<br>
With that reasoning in mind, it is generally advised to use the
Environment to obtain argument values, rather than the Store. The
convenient way of doing that is by using the CallEvent API. A
CallEvent object is readily available in checkPreCall and
checkPostCall callbacks. If it's not available, you can easily
construct it via CallEventManager::getCaller(). For obtaining the
argument values you don't necessarily need to pass an old
ProgramStateRef (that was actual when we just entered the call) to
getCaller(). The current ProgramStateRef will be just fine, because
all we need from it is expression values, and expression values
don't mutate.<br>
<br>
What you won't be able to do with the current program state is
obtain the old value of **_outError. *_outError is in the
Environment, but **_outError is in the old Store only.<br>
<br>
We've been thinking about adding a way of retrieving an old
ProgramState that corresponds to the beginning of previous stack
frames. This would have helped and we wanted it for similar reasons.<br>
<br>
---<br>
<br>
I guess that was too much lecturing, i should focus on the task at
hand.<br>
<br>
Am I understanding correctly that you are trying to compare
**_outError at the beginning of the function and at the end of the
function and then emit the warning based on the return value?<br>
<br>
I think you should do a:<br>
<br>
REGISTER_MAP_WITH_PROGRAMSTATE(const StackFrameContext *, SVal)<br>
<br>
And put the original SVal for **_outError into it. Then at the end
of the function see if the value has changed, by retrieving
*_outError via<br>
<br>
CheckerContext.getStateManager().getCallEventManager()<br>
.getCaller(CheckerContext.getCurrentStackFrame,
CheckerContext.getState())<br>
.getArgSVal(error argument number)<br>
<br>
and dereferencing it.<br>
<br>
<br>
<blockquote type="cite"
cite="mid:DF5A5683-416A-4B4B-BFE8-A5025FF4D958@omnigroup.com">
<pre class="moz-quote-pre" wrap="">
I'm guessing that both the change in the SVal name, and the lack of constraint are due to _outError going out of scope. Making things more odd, though, is that in the exploded graph dot file (attached below too) if you search for the state "0x7ff604800b48" (from the first [C] / checkPreStmt<ReturnStmt> above), you can see that it still has a constraint "reg_$1<NSError ** outError> : { [0, 0] }". In my real checker, I implement the checkDeadSymbols callback and make notes about whether an outError ParmVarDecl value was definitely null when it went out of scope, or whether it was definitely non-null, but had a valid value written to *outError at the time.
The problem lies how to store this information in my custom state such that checkPreStmt<ReturnStmt> can reliably look it up again. If I use a MemRegion as a key in my REGISTER_MAP_WITH_PROGRAMSTATE, then the lookup at "[C] checkPreStmt<ReturnStmt>" seems to fail due to the name change. What should I be using as the lookup key in order to deal with sort lookup of out-of-scope symbols coming back with a different name?
I’ve only noticed this sort of renaming happening with inlining, but maybe that’s is just coincidental.
Hopefully this is the right question, but if you see other places I've gone off the rails, let me know. Otherwise, I'll keep reading and see if I can figure out a solution.
Thanks!
-tim
</pre>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">
</pre>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
cfe-dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a>
</pre>
</blockquote>
<br>
</body>
</html>