[PATCH] D47299: [CodeGenCXX] Emit strip.invariant.group with -fstrict-vtable-pointers

Richard Smith - zygoloid via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 28 11:20:01 PDT 2018


rsmith added inline comments.


================
Comment at: clang/lib/CodeGen/CGExprScalar.cpp:1647
+      }
+    }
+
----------------
Prazek wrote:
> rjmccall wrote:
> > Prazek wrote:
> > > rjmccall wrote:
> > > > Incidentally, how do you protect against code like this?
> > > > 
> > > >   A *ptr;
> > > >   reinterpret_cast<B *&>(ptr) = new B();
> > > >   ptr->foo();
> > > > 
> > > > Presumably there needs to be a launder/strip here, but I guess it would have to be introduced by the middle-end when forwarding the store?  The way I've written this is an aliasing violation, but (1) I assume your pass isn't disabled whenever strict-aliasing is disabled and (2) you can do this with a memcpy and still pretty reliably expect that LLVM will be able to eventually forward the store.
> > > Can you add more info on what is A and B so I can make sure I understand it correctly?
> > > Is the prefix of the layout the same for both, but they are not in the hierarchy?
> > > 
> > > I haven't thought about the strict aliasing. I think the only sane way would be to require strict aliasing for the strict vtable pointers.
> > It's whatever case you're worried about such that you have to launder member accesses and bitcasts.
> > 
> > And like I mentioned, relying on strict aliasing isn't enough because you can do it legally with memcpy.  Maybe it's okay to consider it UB?  I'm not sure about that.
> AFAIK reinterpreting one class as another is UB if they are not in hierarchy (especially calling virtual function on reinterpreted class), not sure if strict aliasing should allow it anyway (if it would be a hand written custom vptr then it should be ok with strict aliasing turned off, but with vptr I don't think it is legal).
> @rsmith Can you comment on that?
OK, here's how I think about what we're doing here:

We view the IR-level pointer value for a pointer to a dynamic class type as being a fat pointer, containing the actual pointer value and also a tag indicating the dynamic type of the object (only notionally, though -- the actual bit representation of the pointer doesn't include the extra information, but we don't ever emit IR that inspects the bit representation of the fat pointer to avoid exposing that fact). In that model, if you try to type pun between a pointer to a dynamic class type and a pointer to a non-dynamic-class type, that can't be expected to work because the (notional) value is different, much as type punning between a derived and base class pointer wouldn't work for a pointer to something other than a base class at offset zero.

I think @rjmccall's example is OK, because both `A` and `B` would need to be dynamic class types in a hierarchy to work, and that means we'd be using the same notional pointer representation. A slight variation of that example:

```
struct A {};
struct B : A { virtual void f(); };
struct C : B { void f(); } c;
A *p = &c;
B *q;
memcpy(&q, &p, sizeof(B*)); // or q = std::bit_cast<B*>(p);
q->f();
```

... would be UB, because the representation of an `A*` and a `B*` are different (a `B*` contains a tag and an `A*` does not).


Repository:
  rL LLVM

https://reviews.llvm.org/D47299





More information about the llvm-commits mailing list