[PATCH] D86874: [analyzer] Fix ArrayBoundCheckerV2 false positive regarding size_t indexer
Balázs Benics via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Sep 7 07:28:36 PDT 2020
steakhal added a comment.
In D86874#2259105 <https://reviews.llvm.org/D86874#2259105>, @Szelethus wrote:
> I can only imagine how long it took for you to write all this, because reading it wasn't that short either. Thank you so much! It really gave me a perspective on how you see this problem, as well as what is actually happening (and should happen) in the checker.
Thank you very much. I tied to make it as short as possible while keeping all the necessary details to make my reasoning perfectly clean. Now I think I was on a bad track, but @martong helped me to pinpoint some issues.
> In D86874#2254638 <https://reviews.llvm.org/D86874#2254638>, @steakhal wrote:
>
>> void foo(int x) {
>> int buf[10];
>>
>> char *r = (char*)(buf + x) + 3;
>> *r = 66; // Might be safe, if x has a value to make it so.
>> }
>
> I suppose you mean that it might be unsafe on the grounds of out-of-bounds addressing, but alignment-wise its okay?
You can safely access any valid memory region via `char`, `unsigned char` and `std::byte` pointers. And as integers are using all bits in their representation, I suspect that this code snippet is safe - given that the pointer points to a valid region.
(aka. no out-of-bound access happens). Alignment does not play any role here.
> In D86874#2256249 <https://reviews.llvm.org/D86874#2256249>, @martong wrote:
>
>>> The resulting `RegionRawOffsetV2` for `p`, `q` will be:
>>> p: {BaseRegion: `buf`, ByteOffset: `20 S64b`}
>>> q: {BaseRegion: `buf`, ByteOffset: `(((reg_$0<int x>) + 1) * 12) + 24`}
>>> ^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^
>>> | |
>>> `@a` This is an //object-language// expression. --/ |
>>> /
>>> `@b` The rest should be //meta-language// expression. -------/
>>>
>>> SPOILER: This distinction is really important here.
>>>
>>> So, we made an expression, we should not evaluate in either of the worlds.
>>> We should not evaluate it using the semantics of the //object-language// since only the `@a` is of that world.
>>> We should model overflow in `@a` if that expression is of //unsigned// type (OR signed but `-fwrapv`...) etc. according to the //abstract machine//.
>>> BUT modeling the possibility of an overflow of the rest of the expression (`@b`) would be a flaw.
>>
>> Why? I'd expect that the value of the whole expression `@a at b` could overflow.
>
> That wasn't all too clear to me either. It seems like there is extra work to do if we do the calculation according to C++ rules, but not the other way around, so we could just regard everything as if it was under the object-language ruleset.
Let's forget it. My statements in this regard are going in a bad direction. Sorry for the dead-end, I thought I was on the right track.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D86874/new/
https://reviews.llvm.org/D86874
More information about the cfe-commits
mailing list