[Lldb-commits] [lldb] [lldb][docs] Resurrect the information on adding a new language (PR #109427)

Mon Sep 23 04:34:04 PDT 2024

================
@@ -0,0 +1,94 @@
+# Adding Programming Language Support
+
+LLDB has been architected to make it straightforward to add support for a
+programming language. Only a small enum in core LLDB needs to be modified to
+make LLDB aware of a new programming language. Everything else can be supplied
+in derived classes that need not even be present in the core LLDB repository.
+This makes it convenient for developers adding language support in downstream
+repositories since it practically eliminates the potential for merge conflicts.
+
+The basic steps are:
+* Add the language to the `LanguageType` enum.
+* Add a `TypeSystem` for the language.
+* Add expression evaluation support.
+
+Additionally, you may want to create a `Language` and `LanguageRuntime` plugin
+for your language, which enables support for advanced features like dynamic
+typing and data formatting.
+
+## Add the Language to the LanguageType enum
+
+The `LanguageType` enum
+(see [lldb-enumerations.h](https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/lldb-enumerations.h))
+contains a list of every language known to LLDB. It is the only part of your
+language support that must live in core LLDB, and therefore the only part that
+needs to merge cleanly with upstream if you develop in a separate branch. When
+adding support for a language previously unknown to LLDB, start by adding an
+enumeration entry to `LanguageType`.
+
+## Add a TypeSystem for the Language
+
+Both [Module](https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/Core/Module.h)
+and [Target](https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/Target/Target.h)
+support the retrieval of a `TypeSystem` instance via `GetTypeSystemForLanguage()`.
+For `Module`, this method is directly on the `Module` instance. For `Target`,
+this is retrieved indirectly via the `TypeSystemMap` for the `Target` instance.
+
+The `TypeSystem` instance returned by the `Target` is expected to be capable of
+evaluating expressions, while the `TypeSystem` instance returned by the `Module`
+is not. If you want to support expression evaluation for your language, you could
+consider one of the following approaches:
+* Implement a single `TypeSystem` class that supports evaluation when given an
+  optional `Target`, implementing all the expression evaluation methods on the
+  `TypeSystem`.
+* Create multiple `TypeSystem` classes, one for evaluation and one for static
+  `Module` usage.
+
+For clang and Swift, the latter approach was chosen, primarily to make it
+clearer that evaluation with the static `Module`-returned `TypeSystem` instances
+makes no sense, and to have them error out on those calls. Either approach is
+fine.
+
+## Add Expression Evaluation Support
+
+Expression evaluation support is enabled by implementing the relevant methods on
+a `TypeSystem`-derived class. Search for `Expression` in the
+[TypeSystem header](https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/Symbol/TypeSystem.h)
+to find the methods to implement.
+
+## Type Completion
+
+There are three levels of type completion, each requiring more type information:
----------------
Michael137 wrote:

Since you're just reviving the documentation, happy to keep it as is and clean it up later.

I'm not sure there's a ton of actionable info in this section for someone trying to add new language support. `Pointer size`/`Layout info`/`Full type info` don't correspond to specific constructs that one could grep for (AFAIK). I assume this is talking about `ResolveState::Forward`/`ResolveState::Layout`/`ResolveState::Full`? And the recommendation here is to create `Forward` `CompilerType`s where possible? (Technically `Layout` resolve state will pull in the full type definition. The only distinction between fully resolving vs. layout resolving a type is that it won't try to fully complete a pointee type, but I guess that's kind of an implementation detail not worth pointing out?).
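
To make the distinction concrete, here is a rough toy model of the three resolve states as described above. Everything in it is a hypothetical stand-in (`ToyType`, `ResolveTo`, `definition_loaded`), not LLDB's actual classes or API; only the state names are borrowed from the comment. It assumes `Forward` is cheap, `Layout` pulls in the type's own definition but leaves a pointee at `Forward`, and `Full` also completes the pointee.

```cpp
#include <string>

// Hypothetical stand-in that borrows the state names from the comment above.
// This is NOT LLDB's implementation, just a model of the described behaviour.
enum class ResolveState : unsigned char {
  Unresolved = 0,
  Forward = 1,
  Layout = 2,
  Full = 3
};

struct ToyType {
  std::string name;
  ToyType *pointee = nullptr;      // set when this type is a pointer
  ResolveState state = ResolveState::Unresolved;
  bool definition_loaded = false;  // stands in for the expensive completion step

  // Advance to at least `wanted`; a type never moves back to a cheaper state.
  void ResolveTo(ResolveState wanted) {
    if (wanted <= state)
      return;
    if (wanted >= ResolveState::Layout) {
      // Both Layout and Full need this type's own definition.
      definition_loaded = true;
      // Assumed difference: only Full insists on completing the pointee.
      if (pointee)
        pointee->ResolveTo(wanted == ResolveState::Full
                               ? ResolveState::Full
                               : ResolveState::Forward);
    }
    state = wanted;
  }
};
```

In this model, resolving a `Node *` to `Layout` leaves `Node` as a forward declaration, and only a `Full` request loads its definition, which is why preferring `Forward` types where possible keeps things cheap.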

Also, would it make sense to move it after `Creating Types` (or make it a subsection)? Because we just start talking about "types" without having mentioned them before.

Side-note, I have been documenting a lot of the type-completion/expression evaluation infrastructure internally, and intend to upstream it at some point. So this would be a natural place to put it.

https://github.com/llvm/llvm-project/pull/109427