[PATCH] D57023: [MsgPack] New MsgPackDocument class

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 26 19:03:35 PDT 2020


dblaikie added inline comments.


================
Comment at: llvm/trunk/include/llvm/BinaryFormat/MsgPackDocument.h:123-132
+  /// Get an ArrayDocNode for an array node. If Convert, convert the node to an
+  /// array node if necessary.
+  ArrayDocNode &getArray(bool Convert = false) {
+    if (getKind() != Type::Array) {
+      assert(Convert);
+      convertToArray();
+    }
----------------
scott.linder wrote:
> dblaikie wrote:
> > Excuse the delayed review - found this while investigating other issues. I believe this code is invalid & is an aliasing violation (taking a pointer to ArrayDocNode (similarly with MapDocNode) when it's not pointing to an object of that type).
> > 
> > Can you please fix this code to not do that?
> Is the issue that in the implementations:
> 
> ```
> void DocNode::convertToArray() { *this = getDocument()->getArrayNode(); }
> ```
> 
> and
> 
> ```
>   ArrayDocNode getArrayNode() {
>     auto N = DocNode(&KindAndDocs[size_t(Type::Array)]);
>     Arrays.push_back(std::unique_ptr<DocNode::ArrayTy>(new DocNode::ArrayTy));
>     N.Array = Arrays.back().get();
>     return N.getArray();
>   }
> ```
> 
> As even though the object pointed to by `this` is destructed and a new ArrayDocNode is copy-constructed over it, *that* `ArrayDocNode` is not actually ever constructed. A `DocNode` is just "type punned" into an `ArrayDocNode`.
> 
> Would changing the definition to:
> 
> ```
>   ArrayDocNode getArrayNode() {
>     auto N = ArrayDocNode(&KindAndDocs[size_t(Type::Array)]);
>     Arrays.push_back(std::unique_ptr<DocNode::ArrayTy>(new DocNode::ArrayTy));
>     N.Array = Arrays.back().get();
>     return N;
>   }
> ```
> 
> And the analogous change for MapDocNode avoid the UB?
Hey, thanks for chiming in!

I don't believe the proposed change would address the UB.

Essentially, it's important for C++ that the compiler can assume that the type of an object doesn't change when a function call is made:

```
some_type var;
var.func();
// var is of type 'some_type' here
```

While it is valid (but pretty tricky/best avoided) to have 'func()' destroy 'var', placement-new some other type in 'var's memory, then destroy that, then construct a new 'some_type' where 'var' is - that doesn't seem like it's the use case you're trying to get to here, so I don't believe that small exception helps you here.

`*this = getArrayNode()` doesn't change the type of `*this`, so it's still an aliasing violation to `reinterpret_cast` `this` to an `ArrayDocNode*`, because it's still not an `ArrayDocNode*` - the assignment in `convertToArray` calls `DocNode's operator=(const DocNode&)` - it's slicing, just taking the DocNode part of the ArrayDocNode and assigning it to `*this`- it doesn't change the dynamic type of `*this`.

Anymore than:

```
base b;
derived d;
b = d;
```

`b` is still of type `base`, not of type `derived`.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57023/new/

https://reviews.llvm.org/D57023





More information about the llvm-commits mailing list