[PATCH] D141215: [clang-repl] Introduce Value to capture expression results

Vassil Vassilev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Apr 15 23:58:05 PDT 2023


v.g.vassilev added a comment.

I added some of background and historical notes. I hope that helps.



================
Comment at: clang/include/clang/Interpreter/Interpreter.h:59
+
+  Value LastValue;
 
----------------
aaron.ballman wrote:
> I think I'm surprised to see this as a data member of `Interpreter` but mostly because my brain keeps thinking this interpreter is to interpret whole programs, so the idea of a "last value" is a bit odd from that context. The name is probably fine as-is, but I figured it may be worth mentioning just the same.
One way to think of this is every `repl` has some way to access the last result it created when executed its last chunk of input. In python that's the `_` value. In the debugger it is the `%N` where N is some number. Here this variable would enable implementing a similar to these in future.


================
Comment at: clang/include/clang/Interpreter/Value.h:98
+  QualType getType() const;
+  Interpreter &getInterpreter() const;
+
----------------
junaire wrote:
> sgraenitz wrote:
> > Can we make this private and try not to introduce more dependencies on the interpreter instance?
> > 
> > The inherent problem is that we will have to wire these through Orc RPC in the future once we want to support out-of-process execution.
> Ughh, I don't understand can you elaborate a bit? We can't make these accessors private since they are meaningful API?  it will be used intensively in `ValuePrinter.cpp` (the second patch), and the end users may want to use them as well.
I think what @sgraenitz means here is that the `Value` class combines two things. Essential information about what happened in the JIT and its higher-level type representation by the language it came from.

In theory, and I don't know how, we could split this somehow into two. The first part would become a pure JIT thing which only cares about putting bits in and out of the JIT and the second part would be the one that attach the exact meaning of what these bits are in the high-level language.

This would allow us to run C++ on small devices which cannot fit the entire clang-repl/jit/llvm infrastructure. That would happen by creating an out-of-process execution which then communicates with the host with the `Value` class. That would be a critical component to have, however, I'd prefer to land this and then do some research how to make this happen. This patch is critical for several projects already and I don't think we can wait too much. The lack of this patch makes clang-repl non-functional.

I believe @sgraenitz described the out-of-process execution in his talk here: https://compiler-research.org/meetings/#caas_10Mar2022 @sgraenitz please feel free to correct me if my interpretation of your words incorrect.




================
Comment at: clang/include/clang/Interpreter/Value.h:46
+
+#define REPL_BUILTIN_TYPES                                                     \
+  X(bool, Bool)                                                                \
----------------
aaron.ballman wrote:
> Is this expected to be a complete list of builtin types? e.g., should this have `char8_t` and `void` and `wchar_t`, etc? Should this be including `clang/include/clang/AST/BuiltinTypes.def` instead of manually maintaining the list?
This is used for various things including storing the bits into a big-endian agnostic way. For `void` we have a special case in the Value and we cannot define a union of a `void` type. We can include the others that you suggest. All the relevant ones are described in `clang::BuiltinType::getName`.

We cannot use `BuiltinTypes.def` because we want to have a mapping between the type as written in the language (eg, `bool`, `unsigned`, etc) and its underlying type name. That mapping is not available in `BuiltinTypes.def`. Ideally we should extend `BuiltinTypes.def` somehow but I'd prefer outside of this patch. One of the challenges is that some of the types depend on the language options (eg. `_Bool` vs `bool`) and I am not sure this can be resolved by tablegen.

On a broader perspective, the `Value` class is responsible for two things: (a) get a value from the interpreter to compiled code (see test case); (b) set a value from compiled code to the interpreter. The second case is not yet covered (I can open soon another patch). The major use-case is calling at runtime functions and taking input parameters from compiled code.

FWIW, we should probably move all of these entities in a separate namespace. I'd suggest `caas` (compiler-as-a-service) and possibly rename the `Value` to `InterpreterValue` since `Value` is very generic and there are already a couple of classes with that name in llvm and clang. 


================
Comment at: clang/include/clang/Interpreter/Value.h:83
+  Value() = default;
+  Value(void /*Interpreter*/ *In, void /*QualType*/ *Ty);
+  Value(const Value &RHS);
----------------
junaire wrote:
> aaron.ballman wrote:
> > Why do these take `void *` instead of the expected type?
> Yeah for the first parameter, we could just use `Interpreter*` but the second one is an opaque type so I think we should keep it?
See my previous comments on performance. We cannot include anything bulky in the header file.


================
Comment at: clang/include/clang/Interpreter/Value.h:121
+    static T cast(const Value &V) {
+      if (V.isPointerOrObjectType())
+        return (T)(uintptr_t)V.getAs<void *>();
----------------
aaron.ballman wrote:
> Do we have to worry about member function pointers where a single pointer value may not be sufficient?
I am not sure if I understand your comment correctly. We only store objects and not function/member pointers (perhaps in some pathological way one could do that but I'd say it is not supported). Here we want to know if we stored it as a non-builtin type we need to make the pointer-like casts. My take is that for now we shouldn't worry about member function pointers.


================
Comment at: clang/include/clang/Interpreter/Value.h:143
+  /// Values referencing an object are treated as pointers to the object.
+  template <typename T> T castAs() const { return CastFwd<T>::cast(*this); }
+
----------------
junaire wrote:
> aaron.ballman wrote:
> > This doesn't match the usual pattern used in Clang where the `get` variant returns `nullptr` and the `cast` variant asserts if the cast would be invalid. Should we go with a similar approach?
> These APIs are adopted in Cling and Cling will remove the old implementation and use the upstream version of the value printing feature at some point in the future. So @v.g.vassilev hope we can keep these APIs close to Cling's variants as much as we can. I don't have a strong opinion here, what's your take here? @v.g.vassilev 
I probably see the confusion. The `getAs<T>` operation is only relevant for pointer types. For example we stored object of type `UserClass` as a pointer. For instance, I don't see how `getAs<float>` can return a `nullptr`. We can probably inline the implementation of `getAs` for non-pointer types into `castAs` and leave `getAs` unimplemented relying only on the explicit specialization bellow for pointers.
 
The `castAs` operation means take a valid bit representation and transforms it into the requested one. I don't think it can or should be ever invalid. Here we are reshuffling bits.

 These changes are very subtle and if we decide to make changes we should open a pull request against the ROOT project to make sure we are not breaking some important case. The reference implementation is here: https://github.com/root-project/root/blob/master/interpreter/cling/include/cling/Interpreter/Value.h


================
Comment at: clang/include/clang/Interpreter/Value.h:160-162
+  // Interpreter, QualType are stored as void* to reduce dependencies.
+  void *Interp = nullptr;
+  void *OpaqueType = nullptr;
----------------
aaron.ballman wrote:
> Why don't forward declares suffice if we're storing the information by pointer?
This is a performance-critical class. We literally measure the instruction count for it. We practically cannot include anything in this header file because the class needs to included as part of the interpreter runtime. For example:

```
#include <clang/Interpreter/Value.h>
Value ResultV;
gInterpreter->evaluate("float i = 12.3; i++", &V);
printf("Value is %d\n", ResultV.castAs<int>());
```

This is how you can do things in Cling. This not yet there but that's our next step.

For performance reasons we have the `ValueKind` optimization which allows us to perform most of the operations we need very fast. There are some operations such as printing the concrete type which need the actual `QualType` and so on but they are outside of the performance critical paths and it is okay to resort back to the real types providing the level of accuracy we need.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141215/new/

https://reviews.llvm.org/D141215



More information about the cfe-commits mailing list