[PATCH] D107339: [analyzer] Retrieve a character from StringLiteral as an initializer for constant arrays.

Denys Petrov via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Oct 28 04:04:17 PDT 2021


ASDenysPetrov added inline comments.


================
Comment at: clang/lib/StaticAnalyzer/Core/RegionStore.cpp:1757-1760
+  // FIXME: Nevertheless, we can't do the same for cases, like:
+  //   const char *str = "123"; // literal length is 4
+  //   char c = str[41];        // offset is 41
+  // It should be properly handled before reaching this point.
----------------
steakhal wrote:
> martong wrote:
> > martong wrote:
> > > steakhal wrote:
> > > > ASDenysPetrov wrote:
> > > > > ASDenysPetrov wrote:
> > > > > > steakhal wrote:
> > > > > > > martong wrote:
> > > > > > > > ASDenysPetrov wrote:
> > > > > > > > > ASDenysPetrov wrote:
> > > > > > > > > > martong wrote:
> > > > > > > > > > > Thanks for updating the patch! However, this `FIXME` makes me worried. Do you think you could pass the `Decl` itself to `getSValFromInitListExpr` in order to be able to check whether the type is a pointer or an array? 
> > > > > > > > > > This worried me as well. Currently I can't find a way to get the `Decl` for `SL`.
> > > > > > > > > > 
> > > > > > > > > We can load this as is for now with mentioning the known issue.
> > > > > > > > This might cause some itchy false positives. Perhaps, we could address this in a follow-up patch and then commit them together?
> > > > > > > > Currently I can't find a way to get the Decl for SL.
> > > > > > > Why do you need a Decl? The SVal's gotta be an `Element{"123",41 S64b,char}` for the example in the comment (*).
> > > > > > > 
> > > > > > > (*) With a minor modification: https://godbolt.org/z/7zhGMnf7P
> > > > > > > ```lang=C++
> > > > > > > template <class T> void clang_analyzer_dump(T);
> > > > > > > const char * const str = "123";
> > > > > > > const char * str2 = "123";
> > > > > > > void top() {
> > > > > > >   clang_analyzer_dump(&str[41]);  // &Element{"123",41 S64b,char}
> > > > > > >   clang_analyzer_dump(&str2[41]); // &Element{SymRegion{reg_$0<const char * str2>},41 S64b,char}
> > > > > > > }
> > > > > > > ```
> > > > > > Because only `Decl` can tell us whether it is a `const char str[42]` or `const char * const str`. We can't say anything just looking at `SVal`.
> > > > > > This might cause some itchy false positives. Perhaps, we could address this in a follow-up patch and then commit them together?
> > > > > This will produce not more FP as before, even less. I didn't change the behavior specifically here. I just found the issue point to it.
> > > > I probably still miss the point. https://godbolt.org/z/EMhbq3745
> > > > Since the case you mention is actually represented by a `NonParamVarRegion` which holds a pointer to its decl.
> > > > ```lang=C++
> > > > const char str3[43] = "123";
> > > > void top() {
> > > >   clang_analyzer_dump(&str3[41]); // &Element{str3,41 S64b,char}
> > > >   //                    NonParamVarRegion  ---^^^^ (it can print the name of the *decl*)
> > > > }
> > > > ```
> > > > 
> > > > ---
> > > > 
> > > > What I wanted to highlight is, that it's a non-issue. In your example you had a **non-const** global variable, thus we could not infer any meaningful initial value for it, and that is actually the expected behavior. As soon as I marked the `str` pointer //const// (along with its //pointee//), suddenly the analyzer can infer its initial value.
> > > But @steakhal, we have this case in the test file with a `FIXME` that is not aligned with your observation.
> > > ```
> > > char const *const glob_ptr8 = "123";
> > > void glob_ptr_index4() {
> > >   clang_analyzer_eval(glob_ptr8[0] == '1');  // expected-warning{{TRUE}}
> > >   clang_analyzer_eval(glob_ptr8[1] == '2');  // expected-warning{{TRUE}}
> > >   clang_analyzer_eval(glob_ptr8[2] == '3');  // expected-warning{{TRUE}}
> > >   clang_analyzer_eval(glob_ptr8[3] == '\0'); // expected-warning{{TRUE}}
> > >   // FIXME: Should be UNDEFINED.
> > >   // We should take into account a declaration in which the literal is used.
> > >   clang_analyzer_eval(glob_ptr8[4] == '\0'); // expected-warning{{TRUE}}
> > > }
> > > ```
> > > > This might cause some itchy false positives. Perhaps, we could address this in a follow-up patch and then commit them together?
> > > This will produce not more FP as before, even less. I didn't change the behavior specifically here. I just found the issue point to it.
> > 
> > Would it be a FP introduced by this patch? Or we would report the "bug" in the baseline too?
> > ```
> > char const *const glob_ptr8 = "123";
> > int glob_ptr_index4(int x) {
> >   return x/glob_ptr8[4]; // Div by zero 
> > }
> > ```
> > Actually, we should report "accessing garbage or undefined" instead of div zero.
> Actually, separating the tests into a parent patch would highlight these differences.
> In your example you had a non-const global variable
Yes, because my example is about offsets and const-ness has no meaning in it. It was omitted for lower verbosity.
> What I wanted to highlight is, that it's a non-issue.
@martong is right. Please, pay attention to the NOTE. It says that there is no problem with the array declration, but it **is** with the pointer one. We **can't** get a Decl from StringRegion: https://godbolt.org/z/hhvasM8dh
```
const char * const str = "123";
void top() {
  clang_analyzer_dump(&str[41]);  // &Element{"123",41 S64b,char}
                                  // StringRegion^
}

```
> Would it be a FP introduced by this patch? Or we would report the "bug" in the baseline too?
> Actually, we should report "accessing garbage or undefined" instead of div zero.
Yes, It should warn "accessing garbage or undefined". The problem exists right now in the baseline: https://godbolt.org/z/6Exejd149
As I said I didn't change the behavior here. The root of the problem is the same as I described above. FP will still be presented after the patch. I just found and pointed to the particular bug.
>Actually, separating the tests into a parent patch would highlight these differences.
I don't see strong reasons for splitting the patch after you get the point. The solution is quite coherent, so I prefer not to do so.

Lastly, I update the comment in the NOTE to make it clearer.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107339/new/

https://reviews.llvm.org/D107339



More information about the cfe-commits mailing list