[cfe-dev] [analyzer] Calling SymbolReaper::markInUse on BinarySymExprs are ignored

Balázs Benics via cfe-dev cfe-dev at lists.llvm.org
Wed Aug 5 07:17:10 PDT 2020


If you are really curious, here is some context.

Imagine the following code (test.c):
```
typedef typeof(sizeof(int)) size_t;

void clang_analyzer_eval(int);
char  *strcat(char *restrict s1, const char *restrict s2);
size_t strlen(const char *s);

void strcat_symbolic_src_length(char *src) {
  char dst[8] = "1234";
  strcat(dst, src);
  clang_analyzer_eval(strlen(dst) >= 4); // expected-warning{{TRUE}}
  (void)*dst;
}
```

One would expect that the 'strlen(dst) >= 4' is TRUE, but it returns UNKOWN.

After doing a little bit of investigation - and debug prints - I came up
with the following trace.

---

# In the CStringChecker::evalStrcat creates a metadata symbol representing
the cstring length of the region pointed by 'src'.
created metadata symbol 'meta_$2{SymRegion{reg_$0<char * src>},unsigned
long}'

# After the evalStrcat evaluated the call, the state contains the necessary
mapping that the 'dst' points to a cstring which is 4 + meta$4 long.
# Note that meta$4 represents the cstring length of the region pointed by
'src'.
# So far so good. We know that resulting cstring length precisely.
strcpy common END State.dump: "program_state": {
  [...]
  "checker_messages": [
    { "checker": "unix.cstring.CStringModeling", "messages": [
      "CString lengths:",
      "dst: (meta_$2{SymRegion{reg_$0<char * src>},unsigned long}) + 4U",
      "SymRegion{reg_$0<char * src>}: meta_$2{SymRegion{reg_$0<char *
src>},unsigned long}",
      ""
    ]}
  ]
}

# We mark all symbols as live in the cstring length map.
# At least we think so...
CStringChecker::checkLiveSymbols marks '(meta_$2{SymRegion{reg_$0<char *
src>},unsigned long}) + 4U' in use
CStringChecker::checkLiveSymbols marks 'meta_$2{SymRegion{reg_$0<char *
src>},unsigned long}' in use
CStringChecker::checkLiveSymbols marks 'meta_$2{SymRegion{reg_$0<char *
src>},unsigned long}' in use

# Even we marked the given symbols in use, we still removes them for some
reason...
CStringChecker::checkDeadSymbols finds the '(meta_$2{SymRegion{reg_$0<char
* src>},unsigned long}) + 4U' as dead; so removes the mapping from 'dst'
CStringChecker::checkDeadSymbols finds the 'meta_$2{SymRegion{reg_$0<char *
src>},unsigned long}' as dead; so removes the mapping from
'SymRegion{reg_$0<char * src>}'

# Now that state does not contain the cstring length of the region pointed
by 'dst'.

---

Further investigation showed that even if I would visit all the sub
SymExprs looking for SymbolMetadata would not help.
SymbolReaper::isDead would still show that the
'(meta_$2{SymRegion{reg_$0<char * src>},unsigned long}) + 4U' SymExpr is
dead.

---

How can we preserve such metadata information?

---

You can also reproduce this following these steps:

Apply the add-debug-prints.patch on top of
7f1556f292ccfd80c4ffa986d5b849f915e5cd82
"Fix typo: s/epomymous/eponymous/ NFC".
Analyze the 'test.c' file using this command:
./bin/clang -cc1 -internal-isystem lib/clang/12.0.0/include -nostdsysteminc
-analyze -analyzer-constraints=range -setup-static-analyzer
-analyzer-checker=unix.cstring,alpha.unix.cstring,alpha.security.taint,debug.ExprInspection
-analyzer-config eagerly-assume=false test.c

Balázs Benics <benicsbalazs at gmail.com> ezt írta (időpont: 2020. aug. 5.,
Sze, 10:29):

> Why does the SymbolReaper::markInUse only work on SymbolMetadata symbols?
>
> void SymbolReaper::markInUse(SymbolRef sym) {
>   if (isa<SymbolMetadata>(sym))
>     MetadataInUse.insert(sym);
> }
>
> I think it is flawed if the Symbol is a SymIntExpr holding an expression
> tree referring to SymbolMetadata symbols. In such case, those symbols would
> not be kept alive - causing some confusion on the checker developers' side
> and potentially losing some information about the analysis.
>
> Should we walk the expression tree instead of the current implementation?
> What performance impact should we expect by doing so?
>
> Any ideas?
>
> Balazs.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200805/65d70591/attachment-0001.html>
-------------- next part --------------
$ ./bin/clang -cc1 -internal-isystem lib/clang/12.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=unix.cstring,alpha.unix.cstring,alpha.security.taint,debug.ExprInspection -analyzer-config eagerly-assume=false test.c

created metadata symbol 'meta_$2{SymRegion{reg_$0<char * src>},unsigned long}'
strcpy common END State.dump: "program_state": {
  "store": { "pointer": "0x55a7afb68260", "items": [
    { "cluster": "dst", "pointer": "0x55a7afb5f788", "items": [
      { "kind": "Default", "offset": 0, "value": "conj_$3{char, LC1, S967, #1}" }
    ]}
  ]},
  "environment": { "pointer": "0x55a7afb5ef00", "items": [
    { "lctx_id": 1, "location_context": "#0 Call", "calling": "strcat_symbolic_src_length", "location": null, "items": [
      { "stmt_id": 958, "pretty": "strcat", "value": "&code{strcat}" },
      { "stmt_id": 961, "pretty": "strcat(dst, src)", "value": "&Element{dst,0 S64b,char}" },
      { "stmt_id": 967, "pretty": "dst", "value": "&Element{dst,0 S64b,char}" },
      { "stmt_id": 973, "pretty": "src", "value": "&SymRegion{reg_$0<char * src>}" }
    ]}
  ]},
  "constraints": [
    { "symbol": "reg_$0<char * src>", "range": "{ [1, 18446744073709551615] }" },
    { "symbol": "meta_$2{SymRegion{reg_$0<char * src>},unsigned long}", "range": "{ [0, 3] }" }
  ],
  "dynamic_types": null,
  "dynamic_casts": null,
  "constructing_objects": null,
  "checker_messages": [
    { "checker": "unix.cstring.CStringModeling", "messages": [
      "CString lengths:", 
      "dst: (meta_$2{SymRegion{reg_$0<char * src>},unsigned long}) + 4U", 
      "SymRegion{reg_$0<char * src>}: meta_$2{SymRegion{reg_$0<char * src>},unsigned long}", 
      ""
    ]}
  ]
}
CStringChecker::checkLiveSymbols marks '(meta_$2{SymRegion{reg_$0<char * src>},unsigned long}) + 4U' in use
CStringChecker::checkLiveSymbols marks 'meta_$2{SymRegion{reg_$0<char * src>},unsigned long}' in use
CStringChecker::checkLiveSymbols marks 'meta_$2{SymRegion{reg_$0<char * src>},unsigned long}' in use
CStringChecker::checkDeadSymbols finds the '(meta_$2{SymRegion{reg_$0<char * src>},unsigned long}) + 4U' as dead; so removes the mapping from 'dst'
CStringChecker::checkDeadSymbols finds the 'meta_$2{SymRegion{reg_$0<char * src>},unsigned long}' as dead; so removes the mapping from 'SymRegion{reg_$0<char * src>}'
evalstrLength common BEGIN State.dump: "program_state": {
  "store": { "pointer": "0x55a7afb68260", "items": [
    { "cluster": "dst", "pointer": "0x55a7afb5f788", "items": [
      { "kind": "Default", "offset": 0, "value": "conj_$3{char, LC1, S967, #1}" }
    ]}
  ]},
  "environment": { "pointer": "0x55a7afb5ef00", "items": [
    { "lctx_id": 1, "location_context": "#0 Call", "calling": "strcat_symbolic_src_length", "location": null, "items": [
      { "stmt_id": 998, "pretty": "strlen", "value": "&code{strlen}" },
      { "stmt_id": 1009, "pretty": "dst", "value": "&Element{dst,0 S64b,char}" },
      { "stmt_id": 1030, "pretty": "clang_analyzer_eval", "value": "&code{clang_analyzer_eval}" }
    ]}
  ]},
  "constraints": null,
  "dynamic_types": null,
  "dynamic_casts": null,
  "constructing_objects": null,
  "checker_messages": null
}
created metadata symbol 'meta_$5{dst,unsigned long}'
evalstrLengthCommon returns: 'meta_$5{dst,unsigned long}'
CStringChecker::checkLiveSymbols marks 'meta_$5{dst,unsigned long}' in use
CStringChecker::checkLiveSymbols marks 'meta_$5{dst,unsigned long}' in use
CStringChecker::checkLiveSymbols marks 'meta_$5{dst,unsigned long}' in use
CStringChecker::checkLiveSymbols marks 'meta_$5{dst,unsigned long}' in use
CStringChecker::checkDeadSymbols finds the 'meta_$5{dst,unsigned long}' as dead; so removes the mapping from 'dst'
test.c:11:3: warning: UNKNOWN [debug.ExprInspection]
  clang_analyzer_eval(strlen(dst) >= 4); // expected-warning{{TRUE}}
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.c
Type: text/x-csrc
Size: 332 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200805/65d70591/attachment-0001.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: add-debug-prints.patch
Type: text/x-patch
Size: 4350 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200805/65d70591/attachment-0001.bin>


More information about the cfe-dev mailing list