<div dir="ltr">On 10 May 2013 01:00, John McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word"><div><div><div>On May 9, 2013, at 9:22 PM, Nick Lewycky <<a href="mailto:nlewycky@google.com" target="_blank">nlewycky@google.com</a>> wrote:</div></div><blockquote type="cite">
<div dir="ltr"><div>On 9 May 2013 19:13, John McCall <span dir="ltr"><<a href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>></span> wrote:<br></div><div><div class="gmail_extra">
<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>This is not how I understand the [basic.life] rules. The question is whether a pointer value, reference, or name is formally forwarded to point to the new object. Because the dynamic type is different, the pointer value held in 'p' is not updated. Copying that value into 'q' does not change the fact that the pointer value still refers to a non-existent object.</div>
</blockquote><div><br></div><div><div class="gmail_quote">I'm actually okay with the simple copy not forming a new object pointer. However, "Base *q = reinterpret_cast<Base*>(p);" really ought to.</div>
</div></div></div></div></div></blockquote><div><br></div>Yes, I agree that it ought to.</div><div><div><br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>
<div class="gmail_quote"></div></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
It is unclear what, exactly, under the rules constitutes forming a valid pointer to the newly-constructed object except using the result of the new-expression itself. I think an explicit cast might, ignoring all "object-ness" of the source pointer and simply treating it formally as a pointer to some storage that you are casting to the type of an object stored there?<br>
</blockquote><div><br></div><div>I want to make optimizations to the program that people can't object to through a cursory reading of the standard, which is made difficult by the standard being contradictory on many relevant points here. Ultimately I've chosen to be very liberal about what I'm allowing to be considered a newly formed valid pointer.</div>
</div></div></div></blockquote><div><br></div></div><div>Being conservative is fair. My point was just that it has absolutely nothing to do with being held in a named pointer variable. Also, this is C++, so you almost certainly need to be able to track the object-ness of values across inlining in order to have much hope of meaningful optimization. And potentially not just across inlining, but through memory — who actually allocates polymorphic values and doesn't use smart pointers these days?</div>
</div></div></blockquote><div><br></div><div style>You essentially want to have some optimizations fire to clean things up before we devirtualize? And you want to devirtualize more often? So demanding. It's not unreasonable, it's just hard. I don't have a way to salvage the existing proposal.</div>
<div><br></div><div style>Let me propose extending LLVM's pointers by defining them as a pair of object ID and memory address. As a matter of lowering to the target ABI, we pass only the memory addresses, not the object IDs; ptrtoint produces an integer from only the memory address; and inttoptr creates a pointer with some object ID, but we don't know which one. icmp compares only the memory address part of these pointers. Bitcast and GEP preserve the object ID.</div>
<div style><br></div><div style>To create a new object ID, a new intrinsic @llvm.newobject takes a pointer and returns a pointer of the same type with the same memory address but a newly allocated object ID. Given "%x = call i8* @llvm.newobject(i8* %y)", %x and %y are MayAlias in BasicAA (TBAA may decide they're NoAlias anyhow, and that's fine).</div>
<div style><br></div><div style>The only observers of these object IDs are load instructions annotated with !llvm.invariant.object metadata. Loads with this annotation that load from the same pointer (including both object ID and memory address) may be assumed to load the same value no matter what other operations happen in between.</div>
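<div style><br></div><div style>To make the pair model concrete, here is a rough toy simulation in C++. Nothing in it is LLVM API: ModelPtr, kUnknownObject, new_object, ptr_to_int, int_to_ptr, icmp_eq, gep and may_reuse_invariant_load are all hypothetical names invented for illustration, and an "unknown" object ID is modelled as 0. It only sketches the rules described above.</div>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not LLVM code: a pointer is (object ID, address).
constexpr std::uint64_t kUnknownObject = 0;  // assumption: 0 means "some ID, unknown"

struct ModelPtr {
    std::uint64_t object_id;
    std::uintptr_t address;
};

// Stand-in for @llvm.newobject: same address, freshly allocated object ID.
inline ModelPtr new_object(ModelPtr p) {
    static std::uint64_t next_id = 1;
    return ModelPtr{next_id++, p.address};
}

// ptrtoint produces an integer from the address only.
inline std::uintptr_t ptr_to_int(ModelPtr p) { return p.address; }

// inttoptr yields a pointer whose object ID is not known.
inline ModelPtr int_to_ptr(std::uintptr_t a) { return ModelPtr{kUnknownObject, a}; }

// icmp eq compares only the address part.
inline bool icmp_eq(ModelPtr a, ModelPtr b) { return a.address == b.address; }

// GEP (and bitcast) preserve the object ID.
inline ModelPtr gep(ModelPtr p, std::uintptr_t offset) {
    return ModelPtr{p.object_id, p.address + offset};
}

// Two annotated loads may be assumed to yield the same value only when
// both the object ID and the address are known to match.
inline bool may_reuse_invariant_load(ModelPtr a, ModelPtr b) {
    return a.object_id != kUnknownObject && a.object_id == b.object_id &&
           a.address == b.address;
}

// Small usage example exercising the rules above.
inline void model_self_check() {
    ModelPtr y = new_object(int_to_ptr(0x1000));
    ModelPtr x = new_object(y);                 // e.g. after placement new
    assert(icmp_eq(x, y));                      // same address
    assert(!may_reuse_invariant_load(x, y));    // different object IDs
    assert(may_reuse_invariant_load(x, x));     // same ID and address
}
```

<div style>In this sketch a pointer that has been round-tripped through an integer never qualifies for load reuse, which matches the "we don't know which one" rule for inttoptr.</div>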
<div style><br></div><div style>At the C++ level, we state that a pointer's associated vptr is fixed on first use. Things which break that association emit a call to @llvm.newobject. That includes pointer casts, and calls to placement new. Yes, if you call placement new you need to use the newly returned pointer, or else your program has undefined behaviour.</div>
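<div style><br></div><div style>As a sketch of what the source-level rule would ask of users, the following reuses one buffer for two polymorphic objects and only ever calls through the pointer returned by each placement new. The types A and B, the constant kSize and the function call_a_then_b are made up for this example; it is a variant of Richard's example written so that it does not depend on how the open questions in this thread are resolved.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Base {
    virtual ~Base() = default;
    virtual int vfn() const = 0;
};
struct A : Base { int vfn() const override { return 1; } };
struct B : Base { int vfn() const override { return 2; } };

// Storage large enough for either type.
constexpr std::size_t kSize = sizeof(A) > sizeof(B) ? sizeof(A) : sizeof(B);

int call_a_then_b() {
    alignas(A) alignas(B) unsigned char buffer[kSize];

    // Use the pointer returned by placement new, not a pointer
    // that was formed before the object existed.
    A *a = new (buffer) A;
    assert(static_cast<void *>(a) == static_cast<void *>(buffer));  // same address
    int result = a->vfn() * 10;
    a->~A();  // end the A object's lifetime before reusing the storage

    // A new object at the same address: again, take the returned pointer.
    B *b = new (buffer) B;
    result += b->vfn();
    b->~B();

    return result;
}
```

<div style>Here both pointers hold the same memory address, but each one comes from its own new-expression, so under the proposal each would carry its own object ID and the two virtual calls could be devirtualized independently.</div>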
<div><br></div><div style>This proposal is much rougher than the last one, and I'd appreciate any help identifying the problems. Please review!</div><div style><br></div><div style>Nick<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word"><div><div><div><br></div><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">
<div>BTW, Richard came up with a wonderful example. What do you make of this?:<br></div><div><div><br></div><div> char alignas(A, B) buffer[max(sizeof(A), sizeof(B))];</div><div> A *a = reinterpret_cast<A*>(buffer);</div>
<div> B *b = reinterpret_cast<B*>(buffer);<br></div><div> new(buffer) A;</div><div> a->vfn();</div><div> new(buffer) B;</div><div> b->vfn();</div><div><br></div><div>
Valid?</div></div></div></div></div></blockquote><div><br></div></div></div><div>Let's answer a simpler question. Why is this valid?</div><div><div><br></div><div> A *a = reinterpret_cast<A*>(buffer);</div>
</div><div><div> new(buffer) A;</div><div> a->vfn();</div><div><br></div><div>My initial interpretation is that the initial value of 'a' is a type-punned pointer that refers to an object of type char[n]. The lifetime of that object ends when we reuse its storage for a new object of type A. (This is okay to do to any type; additionally, since the char[n] object does not have a non-trivial destructor, we are not required to put a char[n] object back before it goes out of scope.) This leaves the name 'buffer' referring only to allocated storage. The lifetime of the A object begins after the constructor completes. 'a' does not automatically refer to the new object because the type of the object it refers to does not match the type of the object now created there. So, what's the theory that allows us to use 'a' here as if it referred to the new object?</div>
<div><br></div><div>To be clear, I think we obviously have to permit both of these examples.</div><div><br></div><div>The only additional complication with Richard's example is that yet another object comes into existence. I don't see that as fundamentally changing anything.</div>
<span><font color="#888888"><div><br></div><div>John.</div></font></span></div></div></blockquote></div><br></div></div>