[PATCH] D135750: [clang][Interp] Track initialization state of local variables

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 22 05:36:12 PST 2022


aaron.ballman added inline comments.


================
Comment at: clang/lib/AST/Interp/ByteCodeExprGen.cpp:749-751
+  const Descriptor::MetadataSize MDSize{sizeof(InlineDescriptor)};
+  Descriptor *D =
+      P.createDescriptor(Src, Ty, MDSize, IsConst, Src.is<const Expr *>());
----------------
tbaeder wrote:
> aaron.ballman wrote:
> > tbaeder wrote:
> > > erichkeane wrote:
> > > > aaron.ballman wrote:
> > > > > Double checking that `Src.is<const Expr *>()` is correct here. That's being passed as the `IsTemporary` argument. I presume the logic is that if the decl type is an expression then it has to be a temporary, but that's not entirely true. Consider C where a compound literal is *not* a temporary but actually creates an lvalue: `(int){12}`. Will that be an issue?
> > > > See suggested comment below in InterpBlock.
> > > That interpretation is correct and I don't know what to do about compound literals yet.
> > My intuition is that `IsTemporary` should be set from `Expr::isTemporaryObject()` (or isa `MaterializeTemporaryExpr` maybe).
> > 
> > At the very least, you can add a .c test file with a compound literal in a _Static_assert and a FIXME comment (in the test and here in the code).
> > 
> The `Src.is<>` is not a change introduced by this patch though.
> 
> As for a test, it'll just be rejected because we don't implement visiting `CompoundLiteral`s at all right now.
> The Src.is<> is not a change introduced by this patch though.

Then let's add a FIXME comment to remember to come back to look into this (I'm not convinced it was correct to begin with, but it is kind of orthogonal to this patch too).

> As for a test, it'll just be rejected because we don't implement visiting CompoundLiterals at all right now.

Fair!


================
Comment at: clang/lib/AST/Interp/Descriptor.h:72
+  struct MetadataSize {
+    size_t NumBytes;
+  };
----------------
I think this should use `InterpSize` rather than `size_t` given that it's used to assign into `MDSize`


================
Comment at: clang/lib/AST/Interp/Descriptor.h:141-145
   unsigned getAllocSize() const { return AllocSize; }
   /// returns the size of an element when the structure is viewed as an array.
   unsigned getElemSize()  const { return ElemSize; }
+  /// Returns the size of the metadata.
+  unsigned getMetadataSize() const { return MDSize; }
----------------
There's some interface awkwardness here where these values are of type `InterpSize` but the interface returns `unsigned`


================
Comment at: clang/lib/AST/Interp/Descriptor.h:74
+  };
+  static constexpr MetadataSize NoMetadata = MetadataSize{0};
+
----------------
aaron.ballman wrote:
> tbaeder wrote:
> > tbaeder wrote:
> > > shafik wrote:
> > > > erichkeane wrote:
> > > > > add a line: 
> > > > > `static constexpr MetadataSize InlineDescriptor = MetadataSize{sizeof(InlineDescriptor)};` 
> > > > > and you can use this instead of depending on 'sizeof' all over the place.
> > > > It feels weird to call this `NoMetadata` but we will pass this as an argument to a function with a parameter of `MetaDataSize`. So I am expecting a size but I am getting no meta data instead and it looks like a mistake. 
> > > > 
> > > > Maybe a better name would be `ZeroMetaDataSize`?
> > > I kinda get what your point is, but I don't think this is confusing.
> > I'm a bit concerned about the naming here; `InlineDescriptor` is already a type name, but calling it `MetadataSizeInlineDesc` is pretty long and callers already have to prepend a `Descriptor::` to that. :/
> Rather than using a sentinel value, should we be using `Optional`?
I'm still wondering if `Optional` is cleaner than `NoMetadata`...


================
Comment at: clang/lib/AST/Interp/InterpBlock.h:77-86
+  char *data() {
+    // Data might contain metadata as well.
+    size_t DataOffset = Desc->getMetadataSize();
+    return reinterpret_cast<char *>(reinterpret_cast<char *>(this) +
+                                    sizeof(Block) + DataOffset);
+  }
+
----------------
tbaeder wrote:
> aaron.ballman wrote:
> > tbaeder wrote:
> > > aaron.ballman wrote:
> > > > Should we have `const`-correct overloads to provide read-only access?
> > > IIRC I added that at least once in a different patch but I abandoned that and haven't had a use for a const overload since.
> > My worry is that retrofitting const correctness will eventually become too much of a chore and so we'll be adding yet another const-incorrect interface to the compiler. Seeing as how this is being reworked from scratch, I think we should use the opportunity to get better const-correctness, but maybe that ship has sailed?
> I don't think the ship has sailed at all. Is your argument that the function should be `const` because it can be? It's slightly problematic because they end up returning `const char *` as well. I can add the two overloads without problem, just not sure if they end up being used at all right now.
> I don't think the ship has sailed at all. Is your argument that the function should be const because it can be? It's slightly problematic because they end up returning const char * as well. I can add the two overloads without problem, just not sure if they end up being used at all right now.

My argument is that pairing operations that return a pointer to mutable data with a constant operation that returns a pointer to constant data makes it easier for us to add const-correct code in the future because people are less likely to run into the const correctness "tax" where they have to add overloads/markings virally.


================
Comment at: clang/lib/AST/Interp/Program.cpp:334
         Descriptor *ElemDesc =
-            createDescriptor(D, ElemTy.getTypePtr(), IsConst, IsTemporary);
+            createDescriptor(D, ElemTy.getTypePtr(), Descriptor::NoMetadata,
+                             IsConst, IsTemporary);
----------------
tbaeder wrote:
> aaron.ballman wrote:
> > tbaeder wrote:
> > > aaron.ballman wrote:
> > > > Why does this one get no metadata while the rest of the descriptors created do?
> > > This is the descriptor for the array elements. The metadata passed to this function is only being used for the outermost descriptor, i.e. we use it a few lines below for the actual array but we don't use it for the array elements.
> > Don't we need the metadata to track whether the array elements have been initialized? e.g., it's not the array object that gets initialized, it's the elements that do, and we'd want to catch things like:
> > ```
> > consteval int func() {
> >   int a[10];
> >   return a[2];
> > }
> > ```
> That's what the `InitMap` is for: it's just a bitmap storing one bit per element of a primitive array for its initialization state.
Ahhh okay, thank you for the explanation!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135750/new/

https://reviews.llvm.org/D135750



More information about the cfe-commits mailing list