[PATCH] D64146: [Clang Interpreter] Initial patch for the constexpr interpreter

Jessica Clarke via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 26 19:34:52 PST 2021


jrtc27 added a comment.

In D64146#2591830 <https://reviews.llvm.org/D64146#2591830>, @jrtc27 wrote:

> In D64146#2591732 <https://reviews.llvm.org/D64146#2591732>, @jrtc27 wrote:
>
>> In D64146#2567710 <https://reviews.llvm.org/D64146#2567710>, @nand wrote:
>>
>>> CodePtr points into the bytecode emitted by the byte code compiler. In some instances, pointers to auxiliary data structures are embedded into the byte code, such as functions or AST nodes which contain information relevant to the execution of the instruction.
>>>
>>> Would it help if instead of encoding pointers, the byte code encoded some integers mapped to the original objects?
>>
>> I've read through the code and have slightly more understanding now. It seems there are several options:
>>
>> 1. Keep the pointers somewhere on the side and put an integer in the byte code, like you suggest
>>
>> 2. Pad values in the byte code to their natural alignment in general (and ensure the underlying std::vector<char> gets its storage allocated at an aligned boundary / use a different container), though this can get a little weird as the amount of padding between consecutive arguments varies depending on where you are (unless you force realignment to the max alignment at the start of a new opcode)
>>
>> 3. Make the byte code an array of uintptr_t instead of packing it as is done currently, with care needed on ILP32. That can either just use uint64_t instead, and we declare CHERI unsupported for 32-bit architectures (which is unlikely to be a problem, as you probably want a 64-bit virtual address space if you are doubling the pointer size, with 64-bit CHERI capabilities on 32-bit VA systems being only for embedded use), or you can split 64-bit integers into two 32-bit integers and treat them as two arguments
>>
>> 1 works but feels ugly. 2 or 3 would be my preference, and mirror how "normal" interpreters work, though those might split the code and data so they can keep the opcodes as, say, 32-bit integers, but the stack full of native word/pointer slots; my inclination is that 3 is the best option as it looks like the simplest. How do you feel about each of those? Is memory overhead from not packing values a concern?
>
> Hm, though I see the "store an ID" pattern is common for dynamic things and this should be quite rare, so maybe that is indeed the right approach, mirroring something like getOrCreateGlobal?

https://reviews.llvm.org/D97606 implements this.
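
For readers skimming the thread, here is a minimal sketch of the "store an ID" idea from option 1, where the byte code holds a 32-bit index and the pointers live in a side table. Every name in it (Node, Program, getOrCreateNativePointer, getNativePointer, emitU8, emitU32, readU32) is hypothetical and chosen for illustration; it is not taken from D97606 or from the interpreter sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an AST node or function referenced by an opcode.
struct Node {
  int Value;
};

// Hypothetical program: packed byte code plus a side table of native pointers.
class Program {
public:
  // Return a small integer ID for Ptr, reusing the slot if it was seen before.
  uint32_t getOrCreateNativePointer(const void *Ptr) {
    for (uint32_t I = 0; I < Pointers.size(); ++I)
      if (Pointers[I] == Ptr)
        return I;
    Pointers.push_back(Ptr);
    return static_cast<uint32_t>(Pointers.size() - 1);
  }

  // Map an ID read from the byte code back to the original pointer.
  const void *getNativePointer(uint32_t ID) const { return Pointers[ID]; }

  // Append a one-byte value (e.g. an opcode) to the packed byte code.
  void emitU8(uint8_t V) { Code.push_back(static_cast<char>(V)); }

  // Append a 32-bit integer bytewise; no pointer is ever stored in Code.
  void emitU32(uint32_t V) {
    const char *Bytes = reinterpret_cast<const char *>(&V);
    Code.insert(Code.end(), Bytes, Bytes + sizeof(V));
  }

  // Read a 32-bit integer from a possibly unaligned offset via memcpy.
  uint32_t readU32(std::size_t &Offset) const {
    uint32_t V;
    std::memcpy(&V, Code.data() + Offset, sizeof(V));
    Offset += sizeof(V);
    return V;
  }

private:
  std::vector<char> Code;              // packed, unaligned byte code
  std::vector<const void *> Pointers;  // naturally aligned pointer storage
};
```

The point of the split is that the pointers themselves only ever sit in a container that aligns them naturally, while the packed stream carries plain integers that can be copied bytewise from any offset. The linear lookup is only for brevity; a map keyed on the pointer would be the obvious replacement if the table grew large.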


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64146/new/

https://reviews.llvm.org/D64146


