[cfe-dev] [analyzer] Looking up previously stored custom state for ParmVarDecls that have gone out of scope

Timothy J. Wood via cfe-dev cfe-dev at lists.llvm.org
Wed May 2 18:22:01 PDT 2018



> On May 2, 2018, at 4:35 PM, Artem Dergachev <noqnoqneo at gmail.com> wrote:
> 
[…]
> Yep, that's because the underscored _outError is dead as a variable, i.e. no longer used in the program, so we can forget about it. The checker is not prevented from reincarnating the variable (because currently checkers are omnipotent), but that's not the right thing to do. This isn't "renaming" - the underscored reg_$2 should have never been created and is essentially created artificially by the checker due to API misuse that we can't easily catch (though we're working on it).

That makes sense, but when _outError goes out of scope, checkDeadSymbols wasn’t getting called (though I can see there is a node in the exploded graph for dead symbol processing on that path).

> 
> Now note that even though _outError is dead, *symbol* &SymRegion{reg_$1<NSError ** outError>} is alive, because it is still stored in another variable, namely the original non-underscored outError. Therefore the constraint on the symbol is also preserved.
> 
> Additionally, argument values are removed from the Store (memory model) because parameter variables become dead, but they are still present in the Environment (syntactic expression values) because the call-expression is not yet fully evaluated, and therefore its sub-expressions that represent argument values are still considered to be important to keep.
> 
> With that reasoning in mind, it is generally advised to use the Environment to obtain argument values, rather than the Store. The convenient way of doing that is by using the CallEvent API. A CallEvent object is readily available in checkPreCall and checkPostCall callbacks. If it's not available, you can easily construct it via CallEventManager::getCaller(). For obtaining the argument values you don't necessarily need to pass an old ProgramStateRef (that was actual when we just entered the call) to getCaller(). The current ProgramStateRef will be just fine, because all we need from it is expression values, and expression values     don't mutate.
> 
> What you won't be able to do with the current program state is obtain the old value of **_outError. *_outError is in the Environment, but **_outError is in the old Store only.
> 
> We've been thinking about adding a way of retrieving an old ProgramState that corresponds to the beginning of previous stack frames. This would have helped and we wanted it for similar reasons.
> 
> ---
> 
> I guess that was too much lecturing, i should focus on the task at hand.

Heh — I’ll spend some time trying to understand the above. As you can tell, I’m still stumbling around bumping into things in the dark.


> 
> Am I understanding correctly that you are trying to compare **_outError at the beginning of the function and at the end of the function and then emit the warning based on the return value?


The goal is to check adherence to the NSError contract more strongly. Given such a function or method (has a non-void return and a outError parameter), then:

- If the return value is known to be non-zero, outError does not need to be filled.
- If a return value is known to be zero, then:
	- If outError is known to be null, no *outError need be written
	- If outError is known to be non-null, *outError must be written to with a non-null value

- A write to *outError can occur:
	- Directly via *outError = /* some expression guaranteed to be non-null */;
	- Indirectly by calling another error returning API, but only on the branch where the callee returned a zero result

- In all cases, the original outError passed should be the one filled. That is, this is not valid:

	- (BOOL)bad:(NSError **)outError; {
		NSError *e;
		outError = &e;
		*outError = [NSError errorWithDomain…];
		return NO;
	}

Some subtle bugs can be present in NSError returning code that I’d like to also warn about:

- After a call to a error returning function or method that returns non-zero (success), the outError passed to the callee should be considered clobbered even if it was previously filled
- Reading *outError before it is known to contain a valid value should be an error
- A call to a nullable receiver that happens to be nil might accidentally return zero status w/o really writing to *outError

To do all this I need to track whether outError arguments have been checked to be null, whether *outError has been written to, and be able to check those things at the point of return. In some case I can’t know the exact return value:

	- (BOOL)foo:(NSError **)outError; {
		return [self bar:outError];
	}

but the state splits make this work out nicely; on the side where -bar: returns zero, I can assume *outError was written if outError isn’t null, and on the side were -bar: returns non-zero, we are returning non-zero and don’t need a write to *outError.

Along with all this, not only do I need to track the nullness of outErrors and *outErrors, but of local NSError variables too. A common pattern is:

	- (BOOL)foo:(NSError **)outError; {

		NSError *localError;
		if (![self bar:&localError]) {
			// maybe look at localError’s domain and code and try something else

			// … but then eventually fail
			if (outError) {
				*outError = localError;
			}
			return NO;
		}
	}

These are the “simple” cases, but there are some cases that will probably be infeasible w/o making assumptions about specific API in Foundation (asynchronous operations writing to __block variables, NSFileCoordinator taking a outError w/o a BOOL result, and probably others). It will probably be more reasonable to skip those.

> 
> I think you should do a:
> 
>   REGISTER_MAP_WITH_PROGRAMSTATE(const StackFrameContext *, SVal)
> 
> And put the original SVal for **_outError into it. Then at the end of the function see if the value has changed, by retrieving *_outError via
> 
>   CheckerContext.getStateManager().getCallEventManager()
>   .getCaller(CheckerContext.getCurrentStackFrame, CheckerContext.getState())
>   .getArgSVal(error argument number)
> 
> and dereferencing it.

I was somehow under the assumption that SVals were transient and not to be stored, but if that’s a faulty assumption it could be a useful tool.

Thanks again!

-tim




More information about the cfe-dev mailing list