[PATCH] D57023: [MsgPack] New MsgPackDocument class

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 26 20:14:18 PDT 2020


scott.linder added inline comments.


================
Comment at: llvm/trunk/include/llvm/BinaryFormat/MsgPackDocument.h:123-132
+  /// Get an ArrayDocNode for an array node. If Convert, convert the node to an
+  /// array node if necessary.
+  ArrayDocNode &getArray(bool Convert = false) {
+    if (getKind() != Type::Array) {
+      assert(Convert);
+      convertToArray();
+    }
----------------
dblaikie wrote:
> scott.linder wrote:
> > dblaikie wrote:
> > > Excuse the delayed review - found this while investigating other issues. I believe this code is invalid & is an aliasing violation (taking a pointer to ArrayDocNode (similarly with MapDocNode) when it's not pointing to an object of that type).
> > > 
> > > Can you please fix this code to not do that?
> > Is the issue that in the implementations:
> > 
> > ```
> > void DocNode::convertToArray() { *this = getDocument()->getArrayNode(); }
> > ```
> > 
> > and
> > 
> > ```
> >   ArrayDocNode getArrayNode() {
> >     auto N = DocNode(&KindAndDocs[size_t(Type::Array)]);
> >     Arrays.push_back(std::unique_ptr<DocNode::ArrayTy>(new DocNode::ArrayTy));
> >     N.Array = Arrays.back().get();
> >     return N.getArray();
> >   }
> > ```
> > 
> > As even though the object pointed to by `this` is destructed and a new ArrayDocNode is copy-constructed over it, *that* `ArrayDocNode` is not actually ever constructed. A `DocNode` is just "type punned" into an `ArrayDocNode`.
> > 
> > Would changing the definition to:
> > 
> > ```
> >   ArrayDocNode getArrayNode() {
> >     auto N = ArrayDocNode(&KindAndDocs[size_t(Type::Array)]);
> >     Arrays.push_back(std::unique_ptr<DocNode::ArrayTy>(new DocNode::ArrayTy));
> >     N.Array = Arrays.back().get();
> >     return N;
> >   }
> > ```
> > 
> > And the analogous change for MapDocNode avoid the UB?
> Hey, thanks for chiming in!
> 
> I don't believe the proposed change would address the UB.
> 
> Essentially, it's important for C++ that the compiler can assume that the type of an object doesn't change when a function call is made:
> 
> ```
> some_type var;
> var.func();
> // var is of type 'some_type' here
> ```
> 
> While it is valid (but pretty tricky/best avoided) to have 'func()' destroy 'var', placement-new some other type in 'var's memory, then destroy that, then construct a new 'some_type' where 'var' is - that doesn't seem like it's the use case you're trying to get to here, so I don't believe that small exception helps you here.
> 
> `*this = getArrayNode()` doesn't change the type of `*this`, so it's still an aliasing violation to `reinterpret_cast` `this` to an `ArrayDocNode*`, because it's still not an `ArrayDocNode*` - the assignment in `convertToArray` calls `DocNode's operator=(const DocNode&)` - it's slicing, just taking the DocNode part of the ArrayDocNode and assigning it to `*this`- it doesn't change the dynamic type of `*this`.
> 
> Anymore than:
> 
> ```
> base b;
> derived d;
> b = d;
> ```
> 
> `b` is still of type `base`, not of type `derived`.
Thank you for breaking it down! That makes perfect sense in retrospect.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57023/new/

https://reviews.llvm.org/D57023





More information about the llvm-commits mailing list