[PATCH] D41946: [clangd] Add support for different file URI schemas.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 16 07:27:04 PST 2018


sammccall added a comment.

Lots of quibbles here, some things we can certainly defer. Happy to chat offline if you're not sure...



================
Comment at: clangd/URI.cpp:17
+
+std::string FileURI::toString() const { return Schema + "://" + Path; }
+
----------------
This isn't correct, we need to escape some characters, turn \ to /, etc.

It's not quite obvious what to do here in general, because URI escaping rules are defined in terms of the full generic URI syntax.

I'd suggest:
1) instead of just splitting scheme/path, split scheme/authority/path
   if it starts with //, authority is the part up to the next /. Otherwise empty.
Splitting authority/scheme/path allows "file" schema to just care about the path part, not the // prefix, without mandating that all schemas have authorities.
2) when deserializing authority/scheme, decode all percent-encoding. when serializing, percent-encode characters that are neither reserved nor unreserved.
(The reserved characters basically have no business in filenames, apart from / which you don't want escaped. This allows custom URI schemes to be flexible)


================
Comment at: clangd/URI.cpp:35
+FileSystemSchema::uriFromAbsolutePath(llvm::StringRef AbsolutePath) const {
+  assert(AbsolutePath.startswith("/") && "Expect an absolute path.");
+  return {Schema, AbsolutePath};
----------------
this error needs to be handled, it shouldn't be an assert


================
Comment at: clangd/URI.h:19
+
+/// \brief An URI for a file path in a certain schema. This is invalid if either
+/// schema or path is empty.
----------------
Why allow an invalid state?
I think I'd lean towards more type safety here: having a few factory functions that always produce valid URIs (like fromPath) and some that return Expected<URI> (like parse), but require URI itself to be valid.

This would probably mean making it a class and having schema/path be readonly...


================
Comment at: clangd/URI.h:19
+
+/// \brief An URI for a file path in a certain schema. This is invalid if either
+/// schema or path is empty.
----------------
sammccall wrote:
> Why allow an invalid state?
> I think I'd lean towards more type safety here: having a few factory functions that always produce valid URIs (like fromPath) and some that return Expected<URI> (like parse), but require URI itself to be valid.
> 
> This would probably mean making it a class and having schema/path be readonly...
This doesn't really describe what it's for. Consider:

   /// A URI describes the location of a source file.
   /// In the simplest case, this is a file:// URI that directly encodes the absolute path to a file.
   /// More abstract cases are possible: a shared index service might expose repo:// URIs that
   /// are relative to the source control root.


================
Comment at: clangd/URI.h:21-22
+/// schema or path is empty.
+struct FileURI {
+  std::string Schema;
+  // This is interpreted according to the schema.
----------------
nit: scheme, not schema (they're synonyms, but URIs always talk about scheme)


================
Comment at: clangd/URI.h:24
+  // This is interpreted according to the schema.
+  std::string Path;
+
----------------
I think calling this "path" is confusing, because there's ambiguity between the URI path component, and the file path that you're ultimately trying to transform it into.

Consider "body", without changing semantics.


================
Comment at: clangd/URI.h:31
+
+  bool IsValid() const { return !Schema.empty() && !Path.empty(); }
+
----------------
(if you keep this: isValid(), or just valid(), or operator bool())


================
Comment at: clangd/URI.h:38
+  /// this returns an empty URI with empty schema and path.
+  static FileURI fromURI(llvm::StringRef Uri);
+
----------------
fromURI is confusing, because the input is a String and the output is a URI.
parse()?


================
Comment at: clangd/URI.h:45
+
+/// \brief This manages file paths in different schemas. Different
+/// codebases/projects can have different file schemas, and clangd interprets a
----------------
Hmm, it manages file paths in exactly one schema :-)
"manages" is also very vague. What about

/// \brief URIScheme is an extension point for teaching clangd to recognize a custom URI scheme.
// A scheme's job is to translate between URIs and file paths.


================
Comment at: clangd/URI.h:53
+///
+/// By default, a "file" schema is supported where URI paths are always absolute
+/// in the file system.
----------------
nit: move this to the registry?


================
Comment at: clangd/URI.h:64
+  ///
+  /// Returns "" if \p Uri is not in the schema.
+  virtual std::string getAbsolutePath(llvm::StringRef CurrentFile,
----------------
Uri not in schema: why would the framework even call getAbsolutePath in that case?
Easier just to make that inexpressible by just having the framework pass the path/body as a stringref, I think.

We should have a way to handle errors though, e.g. file:// URIs that have relative paths. Return Expected rather than use the empty string for signaling?


================
Comment at: clangd/URI.h:65
+  /// Returns "" if \p Uri is not in the schema.
+  virtual std::string getAbsolutePath(llvm::StringRef CurrentFile,
+                                      const FileURI &Uri) const = 0;
----------------
make currentfile the last param and optional? Uri is the primary param


================
Comment at: clangd/URI.h:69
+  /// \brief Creates an URI for the file \p AbsolutePath in this schema.
+  virtual FileURI uriFromAbsolutePath(llvm::StringRef AbsolutePath) const = 0;
+};
----------------
again:
 - return std::string to avoid the possibility of returning a FileURI with the wrong scheme
 - maybe provide a way to signal errors. In this case, optional<string> might be right, as we don't know in advance which scheme to use so failures are expected.


================
Comment at: clangd/URI.h:74
+/// are absolute (with leading '/').
+class FileSystemSchema : public URISchema {
+public:
----------------
this should be defined in the cpp file


================
Comment at: clangd/URI.h:89
+///
+/// Returns "" if there is no matching schema.
+std::string resolve(llvm::StringRef CurrentFile, const FileURI &Uri);
----------------
There are other possible failure cases.
Please propagate errors here rather than magic empty strings.


================
Comment at: clangd/URI.h:90
+/// Returns "" if there is no matching schema.
+std::string resolve(llvm::StringRef CurrentFile, const FileURI &Uri);
+
----------------
this is part of the "user" API, not the extension API, move up above?
in fact, should this maybe be a method on URI?


================
Comment at: clangd/URI.h:90
+/// Returns "" if there is no matching schema.
+std::string resolve(llvm::StringRef CurrentFile, const FileURI &Uri);
+
----------------
sammccall wrote:
> this is part of the "user" API, not the extension API, move up above?
> in fact, should this maybe be a method on URI?
I think you want a public function to turn an absolute path into a file:// URI.
When communicating with the editor, we don't want to send custom URI schemes.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D41946





More information about the cfe-commits mailing list