[PATCH] D141215: [clang-repl] Introduce Value to capture expression results

Stefan Gränitz via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon May 8 01:57:23 PDT 2023


sgraenitz added a comment.

In D141215#4272538 <https://reviews.llvm.org/D141215#4272538>, @lhames wrote:

> using regular global variable instances to manage the storage on the executor side, an extended `MemoryAccess` interface to read/write the value from the REPL side when needed (e.g. for printing), and emitting glue functions to pass the variable's value in to callers.

Agree, that's probably the better solution. Just for the record: Values couldn't be temporaries and access must be synchronized with execution to avoid races. I guess both is easily acceptable.



================
Comment at: clang/include/clang/Interpreter/Value.h:160-162
+  // Interpreter, QualType are stored as void* to reduce dependencies.
+  void *Interp = nullptr;
+  void *OpaqueType = nullptr;
----------------
v.g.vassilev wrote:
> aaron.ballman wrote:
> > junaire wrote:
> > > aaron.ballman wrote:
> > > > v.g.vassilev wrote:
> > > > > aaron.ballman wrote:
> > > > > > v.g.vassilev wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > Why don't forward declares suffice if we're storing the information by pointer?
> > > > > > > This is a performance-critical class. We literally measure the instruction count for it. We practically cannot include anything in this header file because the class needs to included as part of the interpreter runtime. For example:
> > > > > > > 
> > > > > > > ```
> > > > > > > #include <clang/Interpreter/Value.h>
> > > > > > > Value ResultV;
> > > > > > > gInterpreter->evaluate("float i = 12.3; i++", &V);
> > > > > > > printf("Value is %d\n", ResultV.castAs<int>());
> > > > > > > ```
> > > > > > > 
> > > > > > > This is how you can do things in Cling. This not yet there but that's our next step.
> > > > > > > 
> > > > > > > For performance reasons we have the `ValueKind` optimization which allows us to perform most of the operations we need very fast. There are some operations such as printing the concrete type which need the actual `QualType` and so on but they are outside of the performance critical paths and it is okay to resort back to the real types providing the level of accuracy we need.
> > > > > > > 
> > > > > > That sounds like it's going to lead to maintenance problems in the long term, right? I can't think of another header file we say "don't touch this because it may impact runtime performance", and so I can easily imagine someone breaking your expectation that this file can't include anything else.
> > > > > > 
> > > > > > Is there a long-term plan for addressing this?
> > > > > We have a few components like the Lexer that are extremely prone to performance regressions.
> > > > > 
> > > > > In terms for a longer-term plan in addressing this there are some steps could help IMO. First, this component is relatively standalone and very few changes will be required over time, for these I am hoping to be listed as a reviewer. Second, we can add a comment in the include area, making a note that including anything here will degrade the performance of almost all interpreted code. Third, we will find out about this in our downstream use-cases as the things get significantly slower.
> > > > > 
> > > > > Neither is silver bullet but that's probably the best we could do at that time. Btw, we might be able to add a test that's part of LLVM's performance analysis infrastructure.
> > > > > Neither is silver bullet but that's probably the best we could do at that time. Btw, we might be able to add a test that's part of LLVM's performance analysis infrastructure.
> > > > 
> > > > Yeah, we should probably consider doing that. But to make sure I understand the performance concerns... when we change functionality in the lexer, we (potentially) slow down the lexing phase of compilation. That's straightforward and unsurprising. But in this case, it sounds like the act of including another header file in this header file will cause a runtime performance concern, even if no other changes are made. If I'm correct, I can't think of anything else in the compiler that works like that.
> > > I believe what @v.g.vassilev means is that the repl itself might include `Value.h` as a part of *runtime*, so if the header is heavy, you can notice the repl is slowed down. (That's obvious) So keep in mind we're breaking the boundary between the compiled code and interpreted code (It's kinda confusing) here it actually impacts interpreted code.
> > > I believe what @v.g.vassilev means is that the repl itself might include Value.h as a part of *runtime*, so if the header is heavy, you can notice the repl is slowed down. (That's obvious) So keep in mind we're breaking the boundary between the compiled code and interpreted code (It's kinda confusing) here it actually impacts interpreted code.
> > 
> > I'm not certain that's a reasonable design choice to make. Or, stated somewhat differently, I'm really uncomfortable with having header files we can't maintain because changing them impacts runtime performance in surprising ways. That's not a sustainable design even if we think this header will be reasonably stable. We need *some* amount of abstraction here so that we can have a clean design for the REPL interpreter without NFC code changes impacting performance. Otherwise, people will be discouraged from adding comments to this file (those take time to lex, after all), or using long identifiers (longer identifiers take longer to lex than shorter ones), or including what is used instead of using `void *` (as being discussed here), and so on.
> > 
> > This is quite probably something you've already thought about plenty, but... could we add an abstraction layer so that the interpreter side of things has a "low-token-count" interface that dispatches through to the actual implementation?
> > > I believe what @v.g.vassilev means is that the repl itself might include Value.h as a part of *runtime*, so if the header is heavy, you can notice the repl is slowed down. (That's obvious) So keep in mind we're breaking the boundary between the compiled code and interpreted code (It's kinda confusing) here it actually impacts interpreted code.
> > 
> > I'm not certain that's a reasonable design choice to make. Or, stated somewhat differently, I'm really uncomfortable with having header files we can't maintain because changing them impacts runtime performance in surprising ways. That's not a sustainable design even if we think this header will be reasonably stable. We need *some* amount of abstraction here so that we can have a clean design for the REPL interpreter without NFC code changes impacting performance. Otherwise, people will be discouraged from adding comments to this file (those take time to lex, after all), or using long identifiers (longer identifiers take longer to lex than shorter ones), or including what is used instead of using `void *` (as being discussed here), and so on.
> 
> All valid points. I guess we have seen some changes related to compilation speed in the past in the STLExtras.h (iirc, although I cannot find the right commit). We did particular changes to the header file to reduce the compilation time of some large TU builds. I'd think that's more like the case of `stddef.h` and similar headers in the resource directory. The more we add the worse becomes the compiler startup time.
> 
> > 
> > This is quite probably something you've already thought about plenty, but... could we add an abstraction layer so that the interpreter side of things has a "low-token-count" interface that dispatches through to the actual implementation?
> 
> Yes, I have a plan that's quite ambitious (and a draft RFC): basically the idea is any `#include` to become a no-op for the compiler unless something is actually needed.
> 
> I understand your concern here but I don't really know how to address it in this particular patch.
> 
> 
>> [...] include Value.h as a part of *runtime*, so if the header is heavy, you can notice the repl is slowed down. (That's obvious) So keep in mind we're breaking the boundary between the compiled code and interpreted code (It's kinda confusing) here it actually impacts interpreted code.
> I'm really uncomfortable with having header files we can't maintain because changing them impacts runtime performance in surprising ways. [...] could we add an abstraction layer so that the interpreter side of things has a "low-token-count" interface that dispatches through to the actual implementation?

Agree. What about splitting this up in 3 parts?
(1) Private API: interface as consumed by the libclangInterpreter
(2) Public API: interface as consumed by tools, other LLVM projects and out-of-tree
(3) Client API: the actual runtime-only parts with low-token-count etc.

(1) and (2) would be here in the source-tree. (3) is a resource file similar to intrinsics headers, I believe they have very similar requirements on performance, maintenance and versioning.


================
Comment at: clang/lib/Interpreter/Interpreter.cpp:534
+    }
+      llvm_unreachable("Unknown runtime interface kind");
+    }
----------------
junaire wrote:
> sgraenitz wrote:
> > Is this a leftover from an old `default` label?
> there's no default label since we handle all the cases here. But indeed it looks a bit ugly and I removed it.
Perfect


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141215/new/

https://reviews.llvm.org/D141215



More information about the cfe-commits mailing list