[PATCH] D75726: [ConstExprPreter] Updated constant interpreter documentation
Richard Smith - zygoloid via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Apr 7 15:49:44 PDT 2020
rsmith added inline comments.
================
Comment at: clang/docs/ConstantInterpreter.rst:118
Memory management in the interpreter relies on 3 data structures: ``Block``
-object which store the data and associated inline metadata, ``Pointer`` objects
-which refer to or into blocks, and ``Descriptor`` structures which describe
-blocks and subobjects nested inside blocks.
+object which store the data and associated inline metadata, ``Pointer``
+objects which refer to or into blocks, and ``Descriptor`` structures which
----------------
object -> objects
================
Comment at: clang/docs/ConstantInterpreter.rst:167
-Descriptor are generated at bytecode compilation time and contain information required to determine if a particular memory access is allowed in constexpr. Even though there is a single descriptor object, it encodes information for several kinds of objects:
+Descriptor are generated at bytecode compilation time and contain information
+required to determine if a particular memory access is allowed in constexpr.
----------------
Descriptor -> Descriptors
================
Comment at: clang/docs/ConstantInterpreter.rst:169-170
+required to determine if a particular memory access is allowed in constexpr.
+Even though there is a single descriptor object, it encodes information for
+several kinds of objects:
----------------
This sounds like this is talking about the memory layout of descriptors, but I think the text below is actually talking about the memory layout of blocks?
================
Comment at: clang/docs/ConstantInterpreter.rst:174
A block containing a primitive reserved storage only for the primitive.
----------------
reserved -> reserves
How do you distinguish between initialized and in-lifetime-but-uninitialized primitives? Eg:
```
constexpr int f() {
int a;
return a; // error, a is in-lifetime but uninitialized
}
```
================
Comment at: clang/docs/ConstantInterpreter.rst:174
A block containing a primitive reserved storage only for the primitive.
----------------
rsmith wrote:
> reserved -> reserves
>
> How do you distinguish between initialized and in-lifetime-but-uninitialized primitives? Eg:
>
> ```
> constexpr int f() {
> int a;
> return a; // error, a is in-lifetime but uninitialized
> }
> ```
How do you represent the result of casting a pointer to an integer (which we permit only when constant-folding, but nonetheless we do permit)?
================
Comment at: clang/docs/ConstantInterpreter.rst:182
+ initialised, while a value of ``(InitMap*)-1`` indicates that the object was
+ fully initialised. when all fields are initialised, the map is deallocated
+ and replaced with that token.
----------------
when -> When
The overwhelmingly common case is the set of initialized elements is [0, 1, ..., K) for some K. Have you considered instead storing this value as a union of an `InitMap*` and an integer, using the bottom bit to indicate which case we're in? (We don't need to allocate the map at all except in weird cases where someone makes a hole in the array through a pseudo-destructor or allocates out-of-order with `construct_at` or similar.)
How do you distinguish between in-lifetime-but-uninitialized elements and out-of-lifetime elements? For example:
```
using T = int;
constexpr int f(bool b) {
int arr[5];
arr[3].~T();
arr[0] = 1; // ok, uninitialized -> initialized
if (!b)
arr[3] = 1; // error, arr[3] is not in lifetime
else
std::construct_at(arr + 3, 0); // ok, not in lifetime -> in lifetime and initialized
return arr[3];
}
```
Maybe we should use two bits per primitive in the `InitMap` case and store both "initialized" and "in-lifetime"?
================
Comment at: clang/docs/ConstantInterpreter.rst:194-195
+Records are laid out identically to arrays of composites: each field and base
+class is preceded by an inline descriptor. The ``InlineDescriptor``
+has the following field:
----------------
field -> fields
>From the description below, it looks like `sizeof(InlineDescriptor)` is currently 16. That seems unnecessary: We could easily get this down to 8 bytes by bit-packing the offset and the flags. (Restricting ourselves to 2^59 bytes for each record seems unproblematic.) I suspect it might even be OK to pack this into 4 bytes; that'd still allow us to support objects whose representations are up to 128 MiB -- though perhaps that's getting a little too close to territory that programs might actually want to enter.
================
Comment at: clang/docs/ConstantInterpreter.rst:201
+ * **IsInitialized**: flag indicating whether the field or element was
+ initialized. For non-primitive fields, this is only relevant for base classes.
+ * **IsBase**: flag indicating whether the record is a base class. In that case,
----------------
It's not completely clear to me what this would mean for a base class. Is the idea that this tracks whether base classes are in their period of construction / destruction, so that you can determine the dynamic type of the object?
================
Comment at: clang/docs/ConstantInterpreter.rst:204
+ the offset can be used to identify the derived class.
* **IsActive**: indicates if the field is the active field of a union.
* **IsMutable**: indicates if the field is marked as mutable.
----------------
I assume this is actually more generally tracking whether the subobject is within its lifetime?
================
Comment at: clang/docs/ConstantInterpreter.rst:223-224
+ * **TargetPointer**: represents a target address derived from a base address
+ through pointer arithmetic, such as ``((int *)0x100)[20]``. Null pointers are
+ target pointers with a zero offset.
+ * **TypeInfoPointer**: tracks information for the opaque type returned by
----------------
This seems problematic -- we need to distinguish between pointers with value zero and null pointers to support targets where they are different. Perhaps we could instead represent null pointers as a new tag in the tagged union, and reserve a zero `TargetPointer` for only those cases where a pointer is zero but not null? (See `APValue::isNullPointer()` and `LValue::IsNullPtr` for how we track that in the old constant evaluator.)
================
Comment at: clang/docs/ConstantInterpreter.rst:231-232
+
+Besides the previously mentioned union, a number of other pointers have
+their own type:
+
----------------
I would say "pointers and pointer-like types", since at least member pointers are not actually a kind of pointer.
================
Comment at: clang/docs/ConstantInterpreter.rst:328
+banned in constexpr, some expressions on target offsets must be folded,
+replicating the behavious of the ``offsetof`` builtin. Target pointers
+are characterised by 3 offsets: a field offset, an array offset and a
----------------
behavious -> behaviour
================
Comment at: clang/docs/ConstantInterpreter.rst:329-334
+are characterised by 3 offsets: a field offset, an array offset and a
+base offset, along with a descriptor specifying the type the pointer is
+supposed to refer to. Array indexing ajusts the array offset, while the
+field offset is adjusted when a pointer to a member is created. Casting
+an integer to a pointer sets the value of the base offset. As a special
+case, null pointers are target pointers with all offets set to 0.
----------------
Why do we need all three of these? If this is replacing the `APValue` forms with a null base and only an offset, a single value seems sufficient.
================
Comment at: clang/docs/ConstantInterpreter.rst:331
+base offset, along with a descriptor specifying the type the pointer is
+supposed to refer to. Array indexing ajusts the array offset, while the
+field offset is adjusted when a pointer to a member is created. Casting
----------------
ajusts -> adjusts
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D75726/new/
https://reviews.llvm.org/D75726
More information about the cfe-commits
mailing list