[PATCH] D140745: WIP: generate Config {Fragment structure, json schema, docs, YAML parser} from schema spec
Sam McCall via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Jan 2 05:26:39 PST 2023
sammccall created this revision.
Herald added subscribers: kadircet, arphaman, mgrang.
Herald added a project: All.
sammccall updated this revision to Diff 485562.
sammccall added a comment.
sammccall updated this revision to Diff 485564.
sammccall updated this revision to Diff 485568.
sammccall added reviewers: kadircet, nridge.
sammccall published this revision for review.
Herald added a project: clang-tools-extra.
Herald added a subscriber: cfe-commits.
Restructure schema so $ref can use JSON pointers, fixing compat with yaml-language-server.
Add some VSCode extensions supported by yaml-language-server.
sammccall added a comment.
fix external: none
sammccall added a comment.
fix schema version
sammccall added a comment.
This is definitely not ready to ship, but I'd like some high-level feedback before adding tests/polish.
Most important thing to look at is `schema.yaml` which would be the source of truth, followed by maybe `Generate.cpp` and the new structure of `ConfigYAML` (but no need to go through the code details at this point).
Particular questions I'd like opinions on:
worthwhile overall?
-------------------
This isn't trivial: does it solve a big enough problem to be worth the complexity? (Benefit would be improved maintenance of existing config duplication, and being able to accept JSON-schema which IMO we shouldn't without automation).
better off parsing schema.json rather than defining a new format?
-----------------------------------------------------------------
I don't think so (json-schema is too powerful a language, and it's IMO pretty verbose and painful to edit).
But the bias towards creating new things needs to be double-checked!
does the YAML structure seem OK?
--------------------------------
I'm reasonably happy with the compromises I landed on e.g.
- distinction between lowercase for meta-properties and UpperCase for schema children
- ways to handle special casing around `If.UnknownKey` and `Index.External = None`
but maybe they only make sense in my own head
build-time generation vs checking in generated files
----------------------------------------------------
Generating at cmake-time is the obvious choice for *.inc, but the consumers of schema.json and doc.md aren't part of our build.
1. we could generate *.inc at build time, and say that it's up to some other repo(s) to run the llvm build system and produce+publish schema.json and doc.md
2. we could generate *.inc at build time, and check in schema.json and doc.md (need to update by rerunning the tool, we could have a test)
3. we could generate all files by hand and check them in
My feeling is that 1 is too burdensome to get these extra files.
2 is a bit more complicated than 3 (two ways to do things, and building the tool in host config adds some cmake complexity).
And I'm not sure it has any advantages: regenerating all the files isn't more work than regenerating half.
So I lean towards 3 but I don't have much confidence in what will be least annoying overall.
meta-schema.json
----------------
Once I got the tooling set up, having a schema+diagnostics for `schema.yaml` is pretty nice.
It provides a place to document the structure of the file, and gives a crisp answer to "is this change broken" (albeit one we won't actually have a test for).
OTOH it's not strictly necessary, and yet more stuff in the tree.
Enums in ConfigFragment.h
-------------------------
Previously these were strings in `ConfigFragment.h`, and the enums were defined in `Config.h`, which I think is where they belong in terms of dep structure.
However once `schema.yaml` knows about enums, it seems natural to generate the enum definitions and the code to parse them.
So options all have downsides:
1. put them in `ConfigFragment.h` and parse them in `ConfigYAML`, with the other generated code. Now `Config` needs to depend on `ConfigFragment`
2. generate them into `Config.h` and generate parsing code in `ConfigCompile` as today. This means we now need to inject generated code into 4 places instead of 2.
3. don't generate the enums, define & parse them by hand as today (and need to keep that code in sync)
I went with 1 in the code, but I really don't know what's best here.
Dir structure
-------------
With so many extra files, I created `config/` and I think moving existing `Config*` sources to `config/*` seems like a clear improvement, but want to get an ack on this first as renaming is annoying.
This is hacked up: the generated files should not be checked in!
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D140745
Files:
clang-tools-extra/clangd/CMakeLists.txt
clang-tools-extra/clangd/Config.h
clang-tools-extra/clangd/ConfigCompile.cpp
clang-tools-extra/clangd/ConfigFragment.h
clang-tools-extra/clangd/ConfigYAML.cpp
clang-tools-extra/clangd/config/CMakeLists.txt
clang-tools-extra/clangd/config/Fragment.inc
clang-tools-extra/clangd/config/Generate.cpp
clang-tools-extra/clangd/config/YAML.inc
clang-tools-extra/clangd/config/doc.md
clang-tools-extra/clangd/config/meta-schema.json
clang-tools-extra/clangd/config/schema.json
clang-tools-extra/clangd/config/schema.yaml
clang-tools-extra/clangd/unittests/ConfigCompileTests.cpp
clang-tools-extra/clangd/unittests/ConfigYAMLTests.cpp
clang-tools-extra/clangd/unittests/IncludeCleanerTests.cpp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D140745.485568.patch
Type: text/x-patch
Size: 115870 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20230102/36f82bbb/attachment-0001.bin>
More information about the cfe-commits
mailing list