[PATCH] D86445: [analyzer][RFC] Simplify MetadataSymbol representation and resolve a CStringChecker FIXME

Balázs Benics via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Aug 26 01:28:17 PDT 2020


steakhal marked an inline comment as done.
steakhal added inline comments.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h:441
+  const SymbolMetadata *getMetadataSymbol(const MemRegion *R, QualType T,
                                           const void *SymbolTag = nullptr);
 
----------------
NoQ wrote:
> steakhal wrote:
> > Why do we even need the tag?
> > Why is it defaulted to `nullptr` if it asserts later that the `tag` must not be `null`.
> > I'm confused, somebody help me out :D
> The tag is necessary so that different checkers could attach different metadata to the same region (possibly even of the same type). A null value indeed makes no sense here.
Thanks.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h:516-522
   /// Unconditionally marks a symbol as live.
   ///
   /// This should never be
   /// used by checkers, only by the state infrastructure such as the store and
   /// environment. Checkers should instead use metadata symbols and markInUse.
+  /// TODO: update this comment!!!
   void markLive(SymbolRef sym);
----------------
NoQ wrote:
> steakhal wrote:
> > Is it true for modeling checkers too?
> This comment is much older than the concept of a "modeling checker" and it has never been true. Checkers need to mark things live. It's part of life! (no pun intended) Like, i agree that non-modeling checkers probably don't need to use this method, but that wasn't what the author was trying to say :) And also there's nothing special about `SymbolMetadata`; any data that the checker helps keep track of may need to be marked live by the checker, so `markInUse()` is insufficient.
Ok, I will rephrase the comment expressing that this should be used only in modeling checkers.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2417
 
 void CStringChecker::checkDeadSymbols(SymbolReaper &SR,
     CheckerContext &C) const {
----------------
NoQ wrote:
> martong wrote:
> > Seems like `checkDeadSymbols` could have a generic implementation. Perhaps in the following form:
> > ```
> > class CStringChecker : public Checker<
> >                                          check::DeadSymbols<CStringLength>
> > ```
> > But this should be done in a separate patch.
> > 
> > Maybe it would make sense to have a generic default check::LiveSymbols<GDM0, GDM1, ...> implementation as well. In that implementation we could iterate over the GDM entries and could mark the dependent (sub)symbols live.
> > 
> > (Note, this is a reminder from our verbal discussion with @steakhal.)
> Nope, there can't be a generic implementation. It depends a lot on the data structure tracked by the checker (the checker's technically allowed to track "a map from regions to maps from lists of expressions to sets of SVals", good luck coming up with a generic `checkLiveSymbols()` for that) and it most likely isn't uniquely determined by the data structure either.
> 
> But what we can do is provide half-baked implementations for the common situations such as "a map from regions to symbols" (eg. this checker, smart pointer checker) or "a map from symbols to an enum of states the object associated with the symbol is in" (malloc checker, stream checker). And even better, we could provide half-baked implementations of entire state traits, not just `checkDeadSymbols`
> /`checkLiveSymbols()`.
I think we can postprone the experiments with the generic `checkDeadSymbols`/`checkLiveSymbols` implementations.
This patch does not aim to solve that.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2429
     if (SymbolRef Sym = Len.getAsSymbol()) {
       if (SR.isDead(Sym))
+        Entries = F.remove(Entries, MR);
----------------
NoQ wrote:
> Ok, this doesn't look correct (looks like it never was). It's liveness of the region that's important to us, not liveness of the symbol. Because it's the liveness of the region that causes the program to be able to access the map entry.
Let's say we have this:
```lang=C++
void foo() {
  char *p = malloc(12);
  // strlen(p); // no-leak if strlen called, leak warning otherwise...
} // expected-warning {{leak}}
```
If we were marking the region live here, we would miss the `leak` warning. That warning is only triggered if the region of the `p` becomes dead. Which will never become dead if we have a cstring length symbol associated to that region.
I came to this conclusion after implementing your suggested edit above (checking regions instead of symbols).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86445/new/

https://reviews.llvm.org/D86445



More information about the cfe-commits mailing list