[PATCH] D75726: [ConstExprPreter] Updated constant interpreter documentation

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Apr 7 15:49:44 PDT 2020


rsmith added inline comments.


================
Comment at: clang/docs/ConstantInterpreter.rst:118
 Memory management in the interpreter relies on 3 data structures: ``Block``
-object which store the data and associated inline metadata, ``Pointer`` objects
-which refer to or into blocks, and ``Descriptor`` structures which describe
-blocks and subobjects nested inside blocks.
+object which store the data and associated inline metadata, ``Pointer`` 
+objects which refer to or into blocks, and ``Descriptor`` structures which 
----------------
object -> objects


================
Comment at: clang/docs/ConstantInterpreter.rst:167
 
-Descriptor are generated at bytecode compilation time and contain information required to determine if a particular memory access is allowed in constexpr. Even though there is a single descriptor object, it encodes information for several kinds of objects:
+Descriptor are generated at bytecode compilation time and contain information 
+required to determine if a particular memory access is allowed in constexpr. 
----------------
Descriptor -> Descriptors


================
Comment at: clang/docs/ConstantInterpreter.rst:169-170
+required to determine if a particular memory access is allowed in constexpr. 
+Even though there is a single descriptor object, it encodes information for 
+several kinds of objects:
 
----------------
This sounds like this is talking about the memory layout of descriptors, but I think the text below is actually talking about the memory layout of blocks?


================
Comment at: clang/docs/ConstantInterpreter.rst:174
 
   A block containing a primitive reserved storage only for the primitive.
 
----------------
reserved -> reserves

How do you distinguish between initialized and in-lifetime-but-uninitialized primitives? Eg:

```
constexpr int f() {
  int a;
  return a; // error, a is in-lifetime but uninitialized
}
```


================
Comment at: clang/docs/ConstantInterpreter.rst:174
 
   A block containing a primitive reserved storage only for the primitive.
 
----------------
rsmith wrote:
> reserved -> reserves
> 
> How do you distinguish between initialized and in-lifetime-but-uninitialized primitives? Eg:
> 
> ```
> constexpr int f() {
>   int a;
>   return a; // error, a is in-lifetime but uninitialized
> }
> ```
How do you represent the result of casting a pointer to an integer (which we permit only when constant-folding, but nonetheless we do permit)?


================
Comment at: clang/docs/ConstantInterpreter.rst:182
+  initialised, while a value of ``(InitMap*)-1`` indicates that the object was 
+  fully initialised. when all fields are initialised, the map is deallocated 
+  and replaced with that token.
----------------
when -> When

The overwhelmingly common case is the set of initialized elements is [0, 1, ..., K) for some K. Have you considered instead storing this value as a union of an `InitMap*` and an integer, using the bottom bit to indicate which case we're in? (We don't need to allocate the map at all except in weird cases where someone makes a hole in the array through a pseudo-destructor or allocates out-of-order with `construct_at` or similar.)

How do you distinguish between in-lifetime-but-uninitialized elements and out-of-lifetime elements? For example:

```
using T = int;
constexpr int f(bool b) {
  int arr[5];
  arr[3].~T();
  arr[0] = 1; // ok, uninitialized -> initialized
  if (!b)
    arr[3] = 1; // error, arr[3] is not in lifetime
  else
    std::construct_at(arr + 3, 0); // ok, not in lifetime -> in lifetime and initialized
  return arr[3];
}
```

Maybe we should use two bits per primitive in the `InitMap` case and store both "initialized" and "in-lifetime"?


================
Comment at: clang/docs/ConstantInterpreter.rst:194-195
+Records are laid out identically to arrays of composites: each field and base 
+class is preceded by an inline descriptor. The ``InlineDescriptor`` 
+has the following field:
 
----------------
field -> fields

>From the description below, it looks like `sizeof(InlineDescriptor)` is currently 16. That seems unnecessary: We could easily get this down to 8 bytes by bit-packing the offset and the flags. (Restricting ourselves to 2^59 bytes for each record seems unproblematic.) I suspect it might even be OK to pack this into 4 bytes; that'd still allow us to support objects whose representations are up to 128 MiB -- though perhaps that's getting a little too close to territory that programs might actually want to enter.


================
Comment at: clang/docs/ConstantInterpreter.rst:201
+ * **IsInitialized**: flag indicating whether the field or element was 
+ initialized. For non-primitive fields, this is only relevant for base classes.
+ * **IsBase**: flag indicating whether the record is a base class. In that case, 
----------------
It's not completely clear to me what this would mean for a base class. Is the idea that this tracks whether base classes are in their period of construction / destruction, so that you can determine the dynamic type of the object?


================
Comment at: clang/docs/ConstantInterpreter.rst:204
+ the offset can be used to identify the derived class.
  * **IsActive**: indicates if the field is the active field of a union.
  * **IsMutable**: indicates if the field is marked as mutable.
----------------
I assume this is actually more generally tracking whether the subobject is within its lifetime?


================
Comment at: clang/docs/ConstantInterpreter.rst:223-224
+ * **TargetPointer**: represents a target address derived from a base address 
+ through pointer arithmetic, such as ``((int *)0x100)[20]``. Null pointers are 
+ target pointers with a zero offset.
+ * **TypeInfoPointer**: tracks information for the opaque type returned by 
----------------
This seems problematic -- we need to distinguish between pointers with value zero and null pointers to support targets where they are different. Perhaps we could instead represent null pointers as a new tag in the tagged union, and reserve a zero `TargetPointer` for only those cases where a pointer is zero but not null? (See `APValue::isNullPointer()` and `LValue::IsNullPtr` for how we track that in the old constant evaluator.)


================
Comment at: clang/docs/ConstantInterpreter.rst:231-232
+
+Besides the previously mentioned union, a number of other pointers have 
+their own type:
+
----------------
I would say "pointers and pointer-like types", since at least member pointers are not actually a kind of pointer.


================
Comment at: clang/docs/ConstantInterpreter.rst:328
+banned in constexpr, some expressions on target offsets must be folded, 
+replicating the behavious of the ``offsetof`` builtin. Target pointers 
+are characterised by 3 offsets: a field offset, an array offset and a 
----------------
behavious -> behaviour


================
Comment at: clang/docs/ConstantInterpreter.rst:329-334
+are characterised by 3 offsets: a field offset, an array offset and a 
+base offset, along with a descriptor specifying the type the pointer is 
+supposed to refer to. Array indexing ajusts the array offset, while the 
+field offset is adjusted when a pointer to a member is created. Casting 
+an integer to a pointer sets the value of the base offset. As a special 
+case, null pointers are target pointers with all offets set to 0.
----------------
Why do we need all three of these? If this is replacing the `APValue` forms with a null base and only an offset, a single value seems sufficient.


================
Comment at: clang/docs/ConstantInterpreter.rst:331
+base offset, along with a descriptor specifying the type the pointer is 
+supposed to refer to. Array indexing ajusts the array offset, while the 
+field offset is adjusted when a pointer to a member is created. Casting 
----------------
ajusts -> adjusts


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75726/new/

https://reviews.llvm.org/D75726





More information about the cfe-commits mailing list