[PATCH] D86874: [analyzer] Fix ArrayBoundCheckerV2 false positive regarding size_t indexer

Mon Sep 7 06:02:52 PDT 2020

Szelethus added a comment.

I can only imagine how long it took for you to write all this, because reading it wasn't that short either. Thank you so much! It really gave me a perspective on how you see this problem, as well as what is actually happening (and should happen) in the checker.

In D86874#2254638 <https://reviews.llvm.org/D86874#2254638>, @steakhal wrote:

>   void foo(int x) {
>     int buf[10];
>   
>     char *r = (char*)(buf + x) + 3;
>     *r = 66; // Might be safe, if x has a value to make it so.
>   }

I suppose you mean that it might be unsafe on the grounds of out-of-bounds addressing, but alignment-wise its okay?

In D86874#2256249 <https://reviews.llvm.org/D86874#2256249>, @martong wrote:

>>   The resulting `RegionRawOffsetV2` for `p`, `q` will be:
>>   p: {BaseRegion: `buf`, ByteOffset: `20 S64b`}
>>   q: {BaseRegion: `buf`, ByteOffset: `(((reg_$0<int x>) + 1) * 12) + 24`}
>>                                         ^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^
>>                                                      |           |
>>   `@a` This is an //object-language// expression. --/            |
>>                                                                 /
>>   `@b` The rest should be //meta-language// expression. -------/
>>
>> SPOILER: This distinction is really important here.
>>
>> So, we made an expression, we should not evaluate in either of the worlds.
>> We should not evaluate it using the semantics of the //object-language// since only the `@a` is of that world.
>> We should model overflow in `@a` if that expression is of //unsigned// type (OR signed but `-fwrapv`...) etc. according to the //abstract machine//.
>> BUT modeling the possibility of an overflow of the rest of the expression (`@b`) would be a flaw.
>
> Why? I'd expect that the value of the whole expression `@a at b` could overflow.

That wasn't all too clear to me either. It seems like there is extra work to do if we do the calculation according to C++ rules, but not the other way around, so we could just regard everything as if it was under the object-language ruleset.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86874/new/

https://reviews.llvm.org/D86874