[PATCH] D82386: [clangd] Config: Fragments and parsing from YAML

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jun 25 09:07:38 PDT 2020


kadircet marked 5 inline comments as done.
kadircet added a comment.

LGTM with couple of nits.

Regarding clang-format, I was mostly confused by `template <typename T> class Located {` being on a single line, but apparently that's the style :D



================
Comment at: clang-tools-extra/clangd/ConfigFragment.h:61
+  /// BufferName is used for the SourceMgr and diagnostics.
+  static std::vector<Fragment> parseYAML(llvm::StringRef YAML,
+                                         llvm::StringRef BufferName,
----------------
sammccall wrote:
> kadircet wrote:
> > what about `fromYAML` and returning `vector<const Fragment>` ?
> > what about fromYAML
> 
> I can live with it if you like, but I prefer parse:
>  - Fragment::fromYAML reads funny as it returns multiple fragments in general
>  - parse describes the action more clearly and so hints toward thinking about error handling etc
> 
> > returning vector<const Fragment> ?
> why? the resulting vector is slower (has to copy instead of move on reallocation).
> It prevents us from moving from the fragment when we compile it (which we do in some places)
> I can live with it if you like, but I prefer parse:

I was mainly unhappy about it because `parse` doesn't reflect the construction bit of the process, but it is clear from the signature and you are right about returning multiple fragments contradicting with the idea of `from`. I am more inclined towards `from` but I am fine with `parse` too if it feels more natural to you.

> why? the resulting vector is slower (has to copy instead of move on reallocation).

It was to signal the fact that these correspond to some `immutable` input received, but I giving up on move semantics sounds like a nasty consequence. So let's keep it that way (unless you want to hide the members and provide only const accessors, which is some plumbing for a simple struct like this and might hinder readability)


================
Comment at: clang-tools-extra/clangd/ConfigFragment.h:73
+    /// Only valid if SourceManager is set.
+    llvm::SMLoc Location;
+  };
----------------
sammccall wrote:
> kadircet wrote:
> > what is the use of this ?
> There may be multiple fragments in a file.
> 
> When tracking how fragments are used in the logs, there should be enough information to identify the particular fragment, without having to guess.
>  - during parsing, we can log in some source-specific way
>  - in the Fragment form, we can log this location to identify the fragment
>  - during compilation, we log this location and the pointer address of the new CompiledFragment
>  - subsequently we can just log the CompiledFragment address
right, I was thinking that we already have location information available for the start of the fragment during parsing and wasn't sure how useful it would be in later on stages as we know the exact location of the relevant entries (keys, scalars etc.) for logging.

I wasn't against the idea of storing it, just wanted to learn if it has any significant importance for the upcoming stages. thanks for the info!


================
Comment at: clang-tools-extra/clangd/ConfigFragment.h:75
+  };
+  SourceInfo Source;
+
----------------
sammccall wrote:
> kadircet wrote:
> > why not make this const ? i don't think it makes sense to modify these after creation.
> const members are a pain, as they prevent moving the object (and vectors of them are doing copies, etc). For that reason I generally don't find "I don't semantically need to modify this" a good enough reason to mark a member const.
> 
> Adding a constructor muddles the idea of a dumb struct and presents an inconsistent API - the other fields *also* don't really change during the lifetime (parseYAML() is basically a big constructor)
this was in parallel with the idea of returning `vector<const Fragment>`. so agreed.


================
Comment at: clang-tools-extra/clangd/ConfigFragment.h:77
+
+  struct ConditionFragment {
+    std::vector<Located<std::string>> PathMatch;
----------------
sammccall wrote:
> kadircet wrote:
> > comments?
> Whoops, done. There's an argument to be made that we shouldn't maintain docs in two places, and the user docs should be canonical.
> Something to think about later, maybe we can generate them.
> 
> I'm always unsure whether to put the doc for "blocks" like this on the struct or the field...
I suppose it makes sense to put the part starting with `  /// Conditions based on a file's path use the following form:` above the field rather than the struct, especially if we are planning to generate docs.

also:
s/ConfigFragment/Fragment/


================
Comment at: clang-tools-extra/clangd/ConfigFragment.h:81
+    /// The condition will evaluate to false.
+    bool UnrecognizedCondition;
+  };
----------------
sammccall wrote:
> kadircet wrote:
> > `HasUnrecognizedCondition` ?
> > 
> > also default init to `true` maybe?
> Renamed. I guess you meant `false` here? (an empty condition should not have this set)
I was actually suggesting `true` to make sure this is explicitly set by the parser in case of success and not forgotten when something slips away. but not that important.


================
Comment at: clang-tools-extra/clangd/ConfigFragment.h:87
+    std::vector<Located<std::string>> Add;
+  } CompileFlags;
+};
----------------
sammccall wrote:
> kadircet wrote:
> > maybe make this an llvm:Optional too. Even though emptiness of `Add` would be OK in this case, for non-container "blocks" we might need to introduce an optional, and homogeneity would be nice when it happens.
> I'd rather push this the other way: the presence of an empty block like this shouldn't have any semantic effect - I think we can be consistent about this and it helps the schema feel regular/extensible/predictable.
> 
> Condition was the only tricky case, but I don't think we need it, so I've removed the Optional on the condition instead. it's an AND, so an empty set of conditions evaluates to "unconditionally true" anyway.
sure SG, I was rather afraid of introducing an `int` option one day, then we might need to signal it's existence with an `Optional` but one might also argue that we would likely have a "default" value for such options, hence instead of marking it as `None` we could just make use of the default in absence of that option.


================
Comment at: clang-tools-extra/clangd/ConfigYAML.cpp:136
+    else if (auto *BS = llvm::dyn_cast<BlockScalarNode>(&N))
+      return Located<std::string>(S->getValue(Buf).str(), N.getSourceRange());
+    warning(Desc + " should be scalar", N);
----------------
sammccall wrote:
> kadircet wrote:
> > s/S->getValue(Buf)/BS->getValue()/  (or change `auto *BS` to `auto *S`)
> > 
> > and a test
> Nice catch. This would be a good clang-tidy check I think... variable initialized in an if and then used in an else.
yup sounds good, also i would change the `else if` to an if, as we are returning anyways (which might be another tidy check)


================
Comment at: clang-tools-extra/clangd/ConfigYAML.cpp:150
+      for (auto &Child : *S) {
+        if (auto Value = scalarValue(Child, "List item"))
+          Result.push_back(std::move(*Value));
----------------
sammccall wrote:
> kadircet wrote:
> > shouldn't we error/warn out if not ?
> we do: scalarValue warns if it returns None and is documented to do so.
right but this function still returns `Result` and not `None`


================
Comment at: clang-tools-extra/clangd/unittests/ConfigYAMLTests.cpp:58
+      Out.Pos.character = D.getColumnNo(); // Zero-based - bug in SourceMgr?
+      if (!D.getRanges().empty()) {
+        const auto &R = D.getRanges().front();
----------------
sammccall wrote:
> kadircet wrote:
> > is there an explanation to what these ranges are ?
> > 
> > All I could find is somewhere in the implementation making sure these are relative to the `D.getLineNo` and nothing else :/
> These are ranges to highlight (e.g. in console output). SourceMgr/SMDiag is modeled after clang's SourceManager/Diagnostic, where this stuff is a bit better documented.
right, sorry i forgot to mention in the first comment. I was rather afraid of Range being relative to start of line, but being contained in another line. (due to the usage on start.character and end.character down below)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82386/new/

https://reviews.llvm.org/D82386





More information about the cfe-commits mailing list