[PATCH] D96976: [analyzer] Fix reinterpret_cast handling for pointer-to-member
Deep Majumder via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Mar 2 05:18:13 PST 2021
RedDocMD marked an inline comment as done.
RedDocMD added inline comments.
================
Comment at: clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp:20
int DoubleDerived::*ddf = &Base::field;
int Base::*bf = reinterpret_cast<int Base::*>(reinterpret_cast<int Derived::*>(reinterpret_cast<int Base::*>(ddf)));
Base base;
----------------
steakhal wrote:
> Please have a note describing why you are doing this roundtrip.
You mean why a simpler test wouldn't suffice? Well I think it would, but this was a common corner identified by @vsavchenko and I have put it in as a result.
================
Comment at: clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp:29-31
+ Some some;
+ some.*sf = 14;
+ clang_analyzer_eval(some.*sf == 14); // expected-warning{{UNKNOWN}}
----------------
steakhal wrote:
> The assignment is actually UB.
> TBH I don't know how to test such behavior xD
> Same for the next example.
>
> Shouldn't it return `undef` for reading via an invalid member pointer?
I don't quite know what is `undef`. Is it a special macro or something?
Using an invalid member pointer is definitely UB, but I also need to show that casting to an invalid pointer is properly handled because that is not UB. I guess I will remove the member access and claim that since there was no crash, it is okay and has been handled appropriately.
================
Comment at: clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp:52-62
+void testMultiple() {
+ int F::*f = &F::field;
+ int A::*a = reinterpret_cast<int A::*>(f);
+ int C::*c = reinterpret_cast<int C::*>(f);
+ A aobj;
+ C cobj;
+ aobj.*a = 13;
----------------
steakhal wrote:
> Wait a minute. It's not how it works.
> How I imagine member pointers, they are just offsets.
> `&F::field` is notionally equivalent with `offsetof(F, field)`. That being said, You can not apply this member pointer to any object besides `F`.
> Imagine if the classes of the inheritance tree would have other fields as well.
> Then the `offsetof(T, field)` would be different for `F`, and `C`.
>
> This example demonstrates that both of these member pointer dereferences are UB.
> https://godbolt.org/z/15sMEP
> It returns different values depending on the optimization level, which is a clear sign of UB.
> BTW this issue is closely related to strict aliasing.
The member access on A is definitely UB, I guess I will do what I proposed in the `Some` case.
I don't think the other one is. Consider the following:
```
struct A {};
struct B : public A {};
struct C {
int field;
};
struct D : public C {};
struct E : public B, public D {};
struct F : public E {};
int main() {
int F::* ff = &F::field;
int C::* cf1 = static_cast<int C::*>(ff);
int C::* cf2 = reinterpret_cast<int C::*>(ff);
C c;
c.*cf1 = 10;
c.*cf2 = 10;
return 0;
}
```
`cf1` and `cf2` are the same thing, except that they are declared differently (one via `static_cast`, other via `reinterpret_cast`). If we look at the AST (truncated to the first three lines of main):
```
CompoundStmt 0x1a4fe18 <col:12, line:18:1>
|-DeclStmt 0x1a4f3a8 <line:11:3, col:26>
| `-VarDecl 0x1a21078 <col:3, col:21> col:12 used ff 'int F::*' cinit
| `-ImplicitCastExpr 0x1a4f378 <col:17, col:21> 'int F::*' <BaseToDerivedMemberPointer (E -> D -> C)>
| `-UnaryOperator 0x1a4f360 <col:17, col:21> 'int C::*' prefix '&' cannot overflow
| `-DeclRefExpr 0x1a4f2f8 <col:18, col:21> 'int' lvalue Field 0x1a207d0 'field' 'int'
|-DeclStmt 0x1a4f560 <line:12:3, col:43>
| `-VarDecl 0x1a4f428 <col:3, col:42> col:12 used cf1 'int C::*' cinit
| `-CXXStaticCastExpr 0x1a4f518 <col:18, col:42> 'int C::*' static_cast<int struct C::*> <DerivedToBaseMemberPointer (E -> D -> C)>
| `-ImplicitCastExpr 0x1a4f500 <col:40> 'int F::*' <LValueToRValue> part_of_explicit_cast
| `-DeclRefExpr 0x1a4f498 <col:40> 'int F::*' lvalue Var 0x1a21078 'ff' 'int F::*'
|-DeclStmt 0x1a4f6e8 <line:13:3, col:48>
| `-VarDecl 0x1a4f5c8 <col:3, col:47> col:12 used cf2 'int C::*' cinit
| `-CXXReinterpretCastExpr 0x1a4f6b8 <col:18, col:47> 'int C::*' reinterpret_cast<int struct C::*> <ReinterpretMemberPointer>
| `-ImplicitCastExpr 0x1a4f6a0 <col:45> 'int F::*' <LValueToRValue> part_of_explicit_cast
| `-DeclRefExpr 0x1a4f638 <col:45> 'int F::*' lvalue Var 0x1a21078 'ff' 'int F::*'
```
Notice how the `static_cast` figures out the path to the correct subobject. This is how member-pointers are handled as far as I can tell.
Unfortunately, Stroustrup is surprisingly scant when it comes to this topic. I am trying to dig through the Standard.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D96976/new/
https://reviews.llvm.org/D96976
More information about the cfe-commits
mailing list