[PATCH] D46035: [clangd] Fix unicode handling, using UTF-16 where LSP requires it.

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Apr 25 02:31:32 PDT 2018


hokein accepted this revision.
hokein added a comment.
This revision is now accepted and ready to land.

Cool, the code looks good to me (just a few nits), thanks for the descriptive comments!

> This seems likely to cause problems with editors that have the same bug, and
>  treat the protocol as if columns are UTF-8 bytes. But we can find and fix those.

VSCode is fine I think, but we need to fix our internal ycm vim integration.



================
Comment at: clangd/SourceCode.cpp:25
+// Returns true if CB returned true, false if we hit the end of string.
+template <typename Callback>
+bool iterateCodepoints(StringRef U8, const Callback &CB) {
----------------
Can we make it `static`?

The callback type is function<int, int>, the reason why using template here is mainly to save some keystroke? 


================
Comment at: clangd/SourceCode.cpp:53
+// to UTF-8, and returns the length in bytes.
+static size_t measureUTF16(StringRef U8, int Units, bool &Valid) {
+  size_t Result = 0;
----------------
nit: consider naming the parameter `U16Units`?


================
Comment at: clangd/SourceCode.cpp:72
+  size_t Count = 0;
+  iterateCodepoints(U8, [&](int U8Len, int U16Len) {
+    Count += U16Len;
----------------
Maybe add an `assume` to ensure `iterateCodepoints` always return false (reach the end of the U8)?


================
Comment at: clangd/SourceCode.cpp:137
+    P.character =
+        utf16Len(Code.substr(LocInfo.second - ColumnInBytes, ColumnInBytes));
+  }
----------------
nit: it took me a while to understand what the sub-expression `Code.substr(LocInfo.second - ColumnInBytes, ColumnInBytes)` does, maybe abstract it out with a descriptive name? Also it is not straight-forward to understand what `LocInfo.second` is without navigating to `getDecomposedSpellingLoc`.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46035





More information about the cfe-commits mailing list