[PATCH] D86445: [analyzer][RFC] Simplify MetadataSymbol representation and resolve a CStringChecker FIXME

Balázs Benics via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Aug 24 04:15:17 PDT 2020


steakhal created this revision.
steakhal added reviewers: NoQ, vsavchenko, xazax.hun, martong, Szelethus.
Herald added subscribers: cfe-commits, ASDenysPetrov, Charusso, dkrupp, donat.nagy, mikhail.ramalho, a.sidorin, rnkovacs, szepet, baloghadamsoftware, whisperity.
Herald added a project: clang.
steakhal requested review of this revision.

The `SymbolMetadata` handing was slightly flawed.
Given the following example:

  void strcat_models_dst_length_even_if_src_dies(char *src) {
    char dst[8] = "1234";
    strcat(dst, src);
    clang_analyzer_eval(strlen(dst) >= 4); // expected-warning{{TRUE}}
  }

A metadata symbol was assigned to the cstring length map, representing the new length of the `dst`:
'`dst` -> `meta{src} + 4`' where `meta{src}` represents the cstring length of the region of `src`.
Unfortunately, this information was immediately removed from the model, since the `src` become dead after the evaluation of calling `strcat`.
This results in the removal of all symbolic expressions (eg metadata symbols) that refers to the dead `src` memory region.

As of now, `MetadataSymbol`s are used only by the `CStringChecker`, so this fix is tightly coupled to fixing the resulting issues of that checker.
There was already a FIXME in the test suite for this - which is now resolved.

---

There was a lengthy discussion on the mailing list <http://lists.llvm.org/pipermail/cfe-dev/2020-August/066559.html> how to resolve this issue.
In that mailing, @NoQ proposed these steps, which I have implemented in this patch:

> 1. Remove statement, location context, block count from SymbolMetadata's identity. Let it solely depend on the region.
> 2. Get rid of the metadata-in-use set. From now on SymbolMetadata, like SymbolRegionValue, is live whenever its region is live.
> 3. When strlen(R) is used for the first time on a region R, produce `$meta<size_t R>` as the answer, but *do not store* it in the string length map. We don't need to change the state because the state didn't  in fact change. On subsequent `strlen(R)` calls we'll simply return the same symbol (**because** the map will remain empty), until the string is changed.
> 4. If the string is mutated, record its length as usual. But if the string is invalidated (or the new length is completely unknown), **do not remove** the length from the state; instead, set it to SymbolConjured. Only remove it from the map when the region dies.
> 5. Keep string length symbols alive as long as they're in the map.




Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D86445

Files:
  clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
  clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
  clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
  clang/test/Analysis/bsd-string.c
  clang/test/Analysis/string.c

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D86445.287337.patch
Type: text/x-patch
Size: 14745 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20200824/e83eac94/attachment.bin>


More information about the cfe-commits mailing list