[clang] [clang][dataflow] Add synthetic fields to `RecordStorageLocation` (PR #73860)

via cfe-commits cfe-commits at lists.llvm.org
Mon Dec 4 00:15:48 PST 2023


martinboehme wrote:

> Overall, I love it! I have some comments on some features in the future, but those are probably not going to be addressed any time soon. First of all, I think in the future when we reason about the values of the pointers, synthetic fields might need to have offsets to be able to know things like `&modeled.getField1() != &modeled.getField2()`.

Agree.

We do already mostly support this particular case; the transfer function for (in)equality on pointer values would need to be strengthened to compare not the `PointerValue`s themselves but the `StorageLocation`s they point to. As it currently stands, we only [return true](https://github.com/llvm/llvm-project/blob/c9c1b3c37fa5d5c617068afe67afcafacd58a241/clang/lib/Analysis/FlowSensitive/Transfer.cpp#L56) if the `PointerValue`s themselves are equal, and otherwise we return an atomic. Making pointer comparison stronger would benefit our modeling of pointers in general, not just synthetic fields.

> Offsets might also be a way to support unions, multiple synthetic fields starting at the same offset might express just that.

True. And this isn't just restricted to synthetic fields -- regular fields of a union should receive the same treatment.

If I understand the standard right, the only thing that's really guaranteed about the fields of a union is that they all share the same address; writing to one field, then reading from another is undefined behavior (though many compilers implement this as a non-standard language extension).

Currently, when comparing the addresses of different fields of a union, our transfer function produces an atomic (see above). Interestingly, this isn't actually wrong, per se, just inaccurate.

Anyway, though, I think it's relatively rare that a program depends on the behavior of comparing the addresses of two different fields of a union, so I don't think it's a high priority to fix this.

> Another big item is properly supporting inheritance. Consider:
> 
> ```
> struct B { int a; };
> struct D : /*virtual*/ B {};
> struct E : /*virtual*/ B {};
> struct F: D, E {
>     void m() {
>         int* l = &(D::a);
>         int* m = &(E::a);
>     }
> };
> ```
> 
> Here, whether we have the same or different memory locations for `D::a` and `E::a` depends on whether we are doing virtual inheritance. This is something that should work both for regular fields and synthetic fields.

Agreed, we also don't do this correctly yet. Again, though, I think this is lower priority than many of the other gaps in functionality that we have, as many style guides discourage virtual inheritance anyway, and in the cases where we see it in the code, I would guess that the result of our analysis often doesn't depend on modelling this correctly.

Thanks for all of these observations -- I think at least modelling pointer comparison more accurately is something we should look at soon.

https://github.com/llvm/llvm-project/pull/73860


More information about the cfe-commits mailing list