[cfe-dev] [analyzer] Evaluating a call to operator bool()
via cfe-dev
cfe-dev at lists.llvm.org
Mon Dec 30 19:55:15 PST 2019
Hi,
Thanks! It has been a while since my original mailing list post, but I
have now returned to this project and made some progress.
On Thu, Aug 1, 2019 at 5:40 PM Artem Dergachev <noqnoqneo at gmail.com> wrote:
> On 7/31/19 2:01 PM, via cfe-dev wrote:
>
> Hi list,
>
> I have the following code to analyze:
>
> struct BoolConvertibleStruct {
> int n;
> BoolConvertibleStruct(int m) : n(m) {}
> operator bool() const { return n != 0; }
> };
>
> BoolConvertibleStruct StructFunc() {
> return 1;
> }
>
> I have reduced my problem to wanting to analyze StructFunc() to figure out
> the truth value of its return value. (The actual problem is more
> complicated and you might recognize that BoolConvertibleStruct is a
> stand-in for std::unique_ptr<T>, among other things :-P)
>
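For reference, the concrete answer the analysis should recover can be checked by compiling the sample directly. This is only a sketch: the helper structFuncIsTruthy() is a hypothetical name added for illustration and is not part of the original post.

```cpp
#include <cassert>

struct BoolConvertibleStruct {
  int n;
  BoolConvertibleStruct(int m) : n(m) {}
  operator bool() const { return n != 0; }
};

BoolConvertibleStruct StructFunc() {
  return 1;  // implicitly constructs BoolConvertibleStruct(1)
}

// Hypothetical helper (not in the original post): the truth value of
// StructFunc()'s return value, computed at run time instead of by analysis.
bool structFuncIsTruthy() {
  return static_cast<bool>(StructFunc());
}
```

Since n is 1, the conversion yields true, so a checker comparing the return value against false should ideally conclude that the comparison does not hold.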
>
> That's a very important detail. If it's a struct that you've implemented
> yourself, then all you need to do is extract the value from a field of the
> structure (it's not a problem when you know the name of the field).
>
Indeed, I've gotten my contrived example to work (see below), but you're
right that it is quite different when dealing with std::unique_ptr.
> However, if it's something in the C++ standard library and it's implemented
> differently depending on the particular implementation of the standard
> library that you may be using, then you can't access the field because
> you've no idea what meaning does every field carry, so you'll have to treat
> the structure as an opaque object and reason about its contents by modeling
> every method of the structure. I.e.:
>
> 1. Subscribe to the constructor of the structure and map (as in
> REGISTER_MAP_WITH_PROGRAMSTATE) the region of the structure to the value
> with which it was constructed.
> 2. Subscribe to the copy/move constructor of the structure and map the
> region into which it's copied/moved to the same value.
> 3. Subscribe to any method that mutates the value and update your maps.
> 4. Once you do all of this, you would be able to simply retrieve the value
> from your map when you need to model operator bool.
>
> This approach is costly and annoying and easy to get wrong and i wish we
> had better tools for implementing it but for now it assumes a lot of
> boilerplate. If you want examples, see how the experimental IteratorChecker
> tries to model iterators (which is a harder problem).
>
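For future readers, the four steps above can be sketched independently of the analyzer API. The following is a toy model only, not checker code: RegionId stands in for "const MemRegion*", PtrState for the tracked SVal, and State for a map registered with REGISTER_MAP_WITH_PROGRAMSTATE. All of these names, and the handler functions, are hypothetical. The move handler additionally assumes unique_ptr-like semantics, where the source is left null.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for analyzer concepts (not real Clang types).
using RegionId = int;                       // plays the role of const MemRegion*
enum class PtrState { Null, NonNull };      // plays the role of the tracked SVal
enum class Tri { False, True, Unknown };    // result of modeling operator bool
using State = std::map<RegionId, PtrState>; // plays the role of the state map

// Handlers return a new state instead of mutating in place, mirroring the
// immutability of analyzer program states.

// Step 1: constructor seen; bind the object's region to its initial value.
State onConstruct(State s, RegionId obj, PtrState init) {
  s[obj] = init;
  return s;
}

// Step 2: move constructor seen; the destination takes the source's value.
// Assumption: unique_ptr-like semantics, so the source becomes null.
State onMove(State s, RegionId dst, RegionId src) {
  auto it = s.find(src);
  if (it == s.end()) {
    s.erase(dst);  // source untracked: the destination becomes unknown too
    return s;
  }
  const PtrState moved = it->second;
  s[dst] = moved;
  s[src] = PtrState::Null;
  return s;
}

// Step 3: a mutating method (e.g. reset) seen; update the binding.
State onReset(State s, RegionId obj, PtrState newValue) {
  s[obj] = newValue;
  return s;
}

// Step 4: model operator bool by looking the region up in the map.
Tri evalOperatorBool(const State &s, RegionId obj) {
  auto it = s.find(obj);
  if (it == s.end())
    return Tri::Unknown;  // region was never tracked
  return it->second == PtrState::NonNull ? Tri::True : Tri::False;
}
```

The lookup in step 4 only succeeds if every handler uses the same key for the same object, which is why the choice of region used as the map key matters so much in a real checker.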
I think I have a skeleton of this approach almost working, but I am stuck
on some of the details. My program state map is from "const MemRegion*" to
"SVal". (I don't think I need an additional map for symbols as in
IteratorChecker?) However, it seems that the MemRegion I get when
subscribing to the constructor in PreCall or PostCall is different from the
MemRegion I get when modeling std::unique_ptr::get() in EvalCall. I'm
getting the MemRegion from the CallEvent like so (in the constructor I use
CXXConstructorCall instead):
    const auto* instCall = cast<CXXInstanceCall>(&call);
    const MemRegion* thisRegion = instCall->getCXXThisVal().getAsRegion();
    if (thisRegion)
        thisRegion = thisRegion->getMostDerivedObjectRegion();
However, the returned region seems to be different in the constructor and
in the get() method. For example I'm testing my code with a "struct
MyStruct : std::unique_ptr<char>" and I'll get debug output such as:
    constructor: Storing 0 (Loc) into map with key SymRegion{conj_$5{struct
        MyStruct *, LC1, S3038538, #1}}
    get(): Retrieving key SymRegion{reg_$0<const struct MyStruct * this>}:
        not present
I did find https://reviews.llvm.org/D26762 linked from another mailing list
post (http://lists.llvm.org/pipermail/cfe-dev/2017-June/054100.html), which
seems like it might be related to where I'm stuck. However, the code in that
patch always seems to return None; maybe I'm not using it correctly.
> Generally, i'll be pretty excited to accept patches that improve modeling
> of smart pointers in this manner. If that aligns with your interests,
> please extend our fairly minimal SmartPtrChecker and put your work to
> Phabricator (on an as early of a stage as possible) so that we could merge
> it!
>
I am trying to make the modeling work well enough for my own purpose first
and then I will take a look at posting it to Phabricator. I will try to
make it general enough to be able to submit it.
> I have the following sample checker:
>
> class Analyzer : public Checker<check::EndFunction> {
> public:
> void checkEndFunction(const ReturnStmt* ret, CheckerContext& cx) const
> {
>
> const auto* func =
> cast<FunctionDecl>(cx.getStackFrame()->getDecl());
> if (func->getQualifiedNameAsString() != "StructFunc")
> return;
>
> ProgramStateRef state = cx.getState();
> SValBuilder& builder = cx.getSValBuilder();
> ASTContext& ast = cx.getASTContext();
>
> SVal returnValue = cx.getSVal(ret->getRetValue());
> SVal falseValue = builder.makeZeroVal(ast.BoolTy);
> SVal returnedFalse = builder.evalEQ(state, returnValue,
> falseValue);
>
> errs() << "Evaluating (" << returnValue << " == " << falseValue
> << ") -> " << returnedFalse << "\n";
> }
> };
>
> However when I run it on my sample code I get this output:
>
> Evaluating
> (lazyCompoundVal{0x7f98f1871c70,Element{SymRegion{conj_$0{struct
> BoolConvertibleStruct *, LC1, S973, #1}},0 S64b,struct
> BoolConvertibleStruct}} == 0 U1b) -> Unknown
>
>
> lazyCompoundVal is a snapshot of the structure as a whole. You can extract
> values of particular fields from it with the following procedure:
>
> - Take the lazyCompoundVal's parent region
> (`LazyCompoundVal::getRegion()`, in your case it's
> `Element{SymRegion{conj_$0{struct BoolConvertibleStruct *, LC1, S973,
> #1}},0 S64b,struct BoolConvertibleStruct}`).
> - Construct a FieldRegion as a sub-region of the parent region with the
> FieldDecl of the field (i.e., State->getLValue(fieldDecl, parentRegion)).
> - Ask StoreManager to do a getBinding() for that region from the
> lazyCompoundVal's Store (`LazyCompoundVal::getStore()`, in your case it's
> `0x7f98f1871c70`).
>
OK, I did get the contrived example to work that way. What I had missed is
that the LazyCompoundVal does not directly expose the known bytes of the
struct, as I had expected it to. In case any future readers want to know,
the code I used is below. It turned out to be quite a lot more complicated
than I expected and I am still not sure I understand why it works this way,
but this seems to work.
    // `ret`, `cx` and `state` are the same as in checkEndFunction() above.

    // Find the FieldDecl of the member named "n".
    const Expr* returnExpr = ret->getRetValue();
    QualType returnType = returnExpr->getType();
    auto* klass = returnType.getTypePtr()->getAsCXXRecordDecl();
    const FieldDecl* nField = nullptr;
    for (const auto* field : klass->fields()) {
        if (field->getNameAsString() == "n")
            nField = field;
    }
    assert(nField);

    // The returned struct is a LazyCompoundVal (a snapshot of the object).
    SVal returnValue = cx.getSVal(ret->getRetValue());
    Optional<nonloc::LazyCompoundVal> compoundValue =
        returnValue.getAs<nonloc::LazyCompoundVal>();
    assert(compoundValue);

    // Build the location of field "n" within the snapshot's parent region.
    const TypedValueRegion* parentRegion = compoundValue->getRegion();
    SVal parentVal =
        state->getLValue(klass, parentRegion, /* isVirtual = */ false);
    SVal fieldVal = state->getLValue(nField, parentVal);
    Optional<Loc> fieldLoc = fieldVal.getAs<Loc>();
    assert(fieldLoc);

    // Read the field's value from the store captured by the snapshot.
    StoreManager& storeManager = cx.getStoreManager();
    returnValue =
        storeManager.getBinding(compoundValue->getStore(), *fieldLoc);
Cheers,
--
Philip