[PATCH] D28445: [Analyzer] Extend taint propagation and checking

Vlad Tsyrklevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 15 00:00:07 PST 2017


vlad.tsyrklevich added inline comments.


================
Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:442
+
+    const RecordDecl *RD = RT->getDecl()->getDefinition();
+    for (const auto *I : RD->fields()) {
----------------
a.sidorin wrote:
> vlad.tsyrklevich wrote:
> > a.sidorin wrote:
> > > NoQ wrote:
> > > > We need to be careful in the case when we don't have the definition in the current translation unit. In this case we may still have derived symbols by casting the pointer into some blindly guessed type, which may be primitive or having well-defined primitive fields.
> > > > 
> > > > By the way, in D26837 i'm suspecting that there are other errors of this kind in the checker, eg. when a function returns a void pointer, we put taint on symbols of type "void", which is weird.
> > > > 
> > > > Adding Alexey who may recall something on this topic.
> > > I will check the correctness of this code sample because I have some doubts about it.
> > > The problem of casts is hard because of our approach to put taints on 0th elements. We lose casts and may get some strange void symbols. However, I guess this patch isn't related to this problem directly.
> > Not sure which form of correctness you're interested in here but I'll bring up one issue I'm aware of: currently this will create a new SymbolDerived for an LCV sub-region, but it won't be correctly matched against in `isTainted()` because subsequent references to the same region will have a different SymbolDerived. This is the FIXME I mentioned below in `taint-generic.c` I have some idea on how to fix this but I think it will probably require more back and forth, hence why I didn't include it in this change. As it stands now, the sub-region tainting could be removed without changing the functionality of the current patch.
> Checking a default binding symbol here works because we're in PostStmt of the call that creates this default binding. I think this machinery deserves a comment, it was not evident for me initially.
> However, I don't like to create a SymbolDerived manually. The first idea is to just use getSVal(MemRegion*) which will return a SymbolDerived if it is. The second is to create... SymbolMetadata that will carry a taint (if the value is not a symbol, for example). Not sure if second idea is sane enough, I need some comments on it. Artem, Anna?
> Also, instead of referring to the base region, we may need to walk parents one-by-one.
I'd rather just get rid of the tainted SymbolDerived and re-introduce it for discussion in the follow-up change as it's currently not functional anyway and there is some room for debate on the correct way to make tainting of sub-regions of LCVs work correctly. 

As far as walking parents one-by-one, my current understanding of the RegionStoreManager led me to believe that that would be unnecessary. Currently, all bindings are made as offsets from a base region, reference the implementation of `RegionBindingsRef::addBinding`. Is there another reason to walk the parents that I'm missing?


================
Comment at: lib/StaticAnalyzer/Core/RegionStore.cpp:502
+    RegionBindingsRef B = getRegionBindings(S);
+    const MemRegion *MR  = L.getRegion()->getBaseRegion();
+    if (Optional<SVal> V = B.getDefaultBinding(MR))
----------------
vlad.tsyrklevich wrote:
> a.sidorin wrote:
> > We get the LazyCompoundVal for some region but return the symbol for its base. It means that at least method name is very confusing.
> I believe that default bindings are only on base regions, so if you pass a reference to `outer_struct.inner_struct` the default binding for that LCV will be over `outer_struct`. I'm basing this on other references to LCVs in Core/RegionStore.cpp but I could be wrong. Either way, I'd be happy to change the interface to have the caller pass the correct MemRegion here.
One change we could introduce to make getDefaultBinding() more explicit is to have the caller pass LCV.getRegion()->getBaseRegion() instead of just passing the LCV. However, this has the disadvantage of hardcoding an implementation detail of RegionStoreManager into users of the StoreManager interface. I think the correct way forward here might just be to add a comment that mentions that currently RegionStoreManagers bindings are only over base regions. What do you think?


https://reviews.llvm.org/D28445





More information about the cfe-commits mailing list