[PATCH] D82612: [clangd] Config: compile Fragment -> CompiledFragment -> Config

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jun 30 23:57:37 PDT 2020


hokein accepted this revision.
hokein added a comment.
This revision is now accepted and ready to land.

Thanks, looks good.



================
Comment at: clang-tools-extra/clangd/ConfigCompile.cpp:65
+    if (F.HasUnrecognizedCondition)
+      Conditions.push_back([&](const Params &) { return false; });
+
----------------
sammccall wrote:
> hokein wrote:
> > sammccall wrote:
> > > hokein wrote:
> > > > I think if this case happened, we might just bail out directly -- adding the regex conditions seems unnecessary, we never execute them on Line 127 (but we might still need computing the regex for diagnostics).
> > > > 
> > > > and I'm not sure the motivation of using `vector` for `Conditions` and `Apply`, it looks like Apply is just 1 element (if we do the bailout for conditions, `Conditions` will be at most 1 element).
> > > > 
> > > > I feel like using a single unique_function for `Conditions` and `Apply` might be a better fit with `Fragment`. There is a comment in `ConfigFragment.h`:
> > > > 
> > > > >  Each separate condition must match (combined with AND).
> > > > >  When one condition has multiple values, any may match (combined with OR).
> > > >  
> > > > so my reading is that:
> > > > 
> > > > - `Fragment` and `CompiledFragment` is a 1:1 mapping
> > > > - A `Fragment::Condition` represents a single condition, so the AND match applies for different `Fragment`s, which is in ConfigProvider?
> > > > - A single condition can have multiple values (`PathMatch`), we use OR match
> > > > 
> > > > Is that correct?
> > > > I think if this case happened, we might just bail out directly -- adding the regex conditions seems unnecessary, we never execute them on Line 127 (but we might still need computing the regex for diagnostics).
> > > >
> > > > and I'm not sure the motivation of using vector for Conditions and Apply, it looks like Apply is just 1 element (if we do the bailout for conditions, Conditions will be at most 1 element).
> > > 
> > > So the explanation for both of these things is that this file is going to grow a lot, and in particular these sections handling PathMatch etc are going to occur again and again, so we need a scalable pattern. I've deliberately limited the scope of the first patchset to only support one condition, one setting to apply etc, but you have to imagine there are likely to be 5 conditions and maybe 25 settings.
> > > 
> > > So with that in mind:
> > >  - yes, actually saving the conditions isn't necessary, but we do need to evaluate them for side-effects (diagnostics) and it's simpler not to keep track of them. Optimizing performance of broken configs is not a big win :-)
> > >  - it's marginally simpler to have only one Apply function, but much simpler if separate parts of compile() can push settings independently. Same goes for condition in fact...
> > > 
> > > > Fragment and CompiledFragment is a 1:1 mapping
> > > 
> > > Yes. I think this is somewhere where the naming is helpful at least :-)
> > > 
> > > > A Fragment::Condition represents a single condition, so the AND match applies for different Fragments, which is in ConfigProvider?
> > > 
> > > The AND match applies to the multiple Conditions in a single CompiledFragment. Each fragment only considers its own conditions. 
> > > So given:
> > > ```
> > > If:
> > >   Platform: win
> > >   PathMatch: [foo1/.*, foo2/.*]
> > > CompileFlags:
> > >   Add: -foo
> > > ---
> > > CompileFlags:
> > >   Add: -bar
> > > ```
> > > 
> > > We end up with... one Provider, suppling two CompiledFragments (each compiled from a separate Fragment).
> > > 
> > > The first CompiledFragment has two Conditions, conceptually:
> > >  - `Params.Platform == "win"`
> > >  - `Regex("foo1/.*").matches(Params.File) || Regex("foo2/.*").matches(Params.File)`
> > > 
> > > The second CompiledFragment has no Conditions. Each fragment has one Apply function.
> > > 
> > > The CompiledFragments are applied independently. They each consider their own Conditions, ANDing them together.
> > > 
> > > So CompiledFragment #1 checks both conditions. If any fails, it bails out. Otherwise it runs its Apply function, adding `-foo`.
> > > 
> > > Then CompiledFragment #2 runs. It has no conditions, so never bails out. It adds `-bar`.
> > oh, thanks for the explanation, this helps me a lot to understand the code.
> > 
> > so the member `Condition` in `Fragment` will be enhanced to something like `std::vector<XXX> Conditions` in the future? If so, can we defer that for `CompiledFragment` as well? 
> > 
> > 
> > Since `Fragment` and `CompiledFragment` is 1:1 mapping, I think it is important to keep their implementation aligned, it'd help a lot for readers to understand the code. The condition logic (AND, OR) is subtle and complicated,  I was reading this part of code back and forth, still inferred it in an incorrect way.
> > so the member Condition in Fragment will be enhanced to something like std::vector<XXX> Conditions in the future?
> 
> Ah, not quite...
> Fragment is supposed to closely follow the YAML format, which is:
> `If: { PathMatch: foo, Platform: bar }`
> not
> `If: [ {PathMatch: foo}, {Platform: bar} ]`
> That is, the partitioning of parts of the `If` block into `Conditions` is implicit rather than explicit. (It's basically partitioning on the map keys, but maybe there'll be exceptions).
> 
> The subtle difference between Fragment::Condition and CompiledFragmentImpl::Condition is confusing though. I think renaming the former to Fragment::If is much clearer (not sure why I didn't name it that in the first place).
> 
> > Since Fragment and CompiledFragment is 1:1 mapping, I think it is important to keep their implementation aligned
> 
> I think the fragment and the fragmentcompiler are pretty well aligned, but I don't think duplicating the fragment structure in the final CompiledFragmentImpl is a good idea.
> 
> The reason is captured pretty well in the essay "Parse, don't validate". Through compilation here we want to examine the config and establish that we can apply this configuration efficiently, and without any more errors. The type-driven way to do that is to emit a function that applies the configuration and whose signature doesn't allow for errors.
> 
> https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
oh, I think I understand how the design is driven like this now. that makes sense.

I think it would be helpful to add some documentation (and give a concrete config example, the example you gave is good enough) for CompiledFragmentImpl::Conditions.



================
Comment at: clang-tools-extra/clangd/ConfigCompile.cpp:46
+      if (!C(P)) {
+        dlog("config::Fragment {0}: condition not met", this);
+        return false;
----------------
sammccall wrote:
> hokein wrote:
> > maybe `Impl::applying config ...` to allow better debug logging?
> > 
> > this reminds us: since the config can easily become a monster, think about multiple config files from different sources (user config file, project config file, LSP, flags) are applied together, we might need a tracing mechanism (at least some useful logs) to help us diagnose a final complicated config. no need to do anything in this patch, just something we should keep in mind.
> I've changed these from config::Fragment to "Config fragment" to be less misleading. I'm not sure sprinkling in internal class names helps much vs having them be clear sentences, I always end up searching for the log string to see exactly where it comes from anyway.
> 
> This is dlog, it would be nice to make it vlog but I think it's just way too spammy, we'll get several of these for every file that's opened and background-indexed...
> 
> > we might need a tracing mechanism (at least some useful logs) 
> 
> My secret plan is to have a clangd flag like `clangd --check <filename>` that will *not* run the language server, but instead dump a bunch of information about how clangd would process that file (stringing together CDB, config, query-driver, diagnostics...). This would miss config-over-LSP but in every other respect seems much simpler than trying to find ways to extract this stuff live and get it in front of the user somehow.
that sounds good.


================
Comment at: clang-tools-extra/clangd/ConfigFragment.h:100
   /// Each separate condition must match (combined with AND).
   /// When one condition has multiple values, any may match (combined with OR).
   ///
----------------
I think this comment is important, but it is a bit vague (not quite friendly to readers without much config context), we might want to refine it -- would be nice to clearly state what's a condition, e.g. "PathMatch" is a condition, "Platform" is a condition. 


maybe separate the rename to a separate patch, but up to you.


================
Comment at: clang-tools-extra/clangd/ConfigFragment.h:108
+  struct IfBlock {
     /// The file being processed must fully match a regular expression.
     std::vector<Located<std::string>> PathMatch;
----------------
nit: maybe explicitly spell this is an `OR` match.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82612/new/

https://reviews.llvm.org/D82612





More information about the cfe-commits mailing list