[clangd-dev] DraftStore::updateDraft compares lsp's rangeLength in UTF16 code units against the computed rangeLength in bytes

Daan De Meyer via clangd-dev clangd-dev at lists.llvm.org
Mon Oct 22 10:27:39 PDT 2018


After implementing option 2, I've changed my mind and I think we should
remove the check.

Right now, DraftStore::UpdateDraft takes the start position and end
position of the LSP message (specified in UTF-16 code units) and calculates
the length of the range in bytes. What we'd be doing with option 2 is
converting that length back to UTF-16 code units and checking whether it's
equal to the rangeLength of the LSP message. This is nothing more than a
really complicated check to see whether 'LSP.end - LSP.start ==
LSP.rangeLength'. The check could just as easily done without the
conversion to bytes, but then we're really only checking whether the client
calculated the rangeLength correctly. Since the rangeLength isn't used by
the rest of clangd I propose to just ignore it. rangeLength seems to be
redundant since we just calculate it ourselves. I also found an issue about
this in the LSP repository:
https://github.com/Microsoft/language-server-protocol/issues/9.

Regards,

Daan

On Mon, 22 Oct 2018 at 13:48, Sam McCall <sammccall at google.com> wrote:

> On Mon, Oct 22, 2018 at 1:32 PM Daan De Meyer <daan.j.demeyer at gmail.com>
> wrote:
>
>>
>>> My only concern is that we'll break clients that make the equal and
>>> opposite mistake (sending byte lengths instead of UTF-16 lengths).
>>> Maybe we should also downgrade this error to a log message, since we can
>>> presumably trust the length from the string. What do you think?
>>>
>>
>> I think the check is useful just to verify if the client and server are
>> on the same page so I'm partial to option 2 as well. Downgrading to a log
>> message avoids outright breaking existing clients so that seems like the
>> way to go. As soon as Github is back up I'll make an issue asking to
>> clarify the unit of rangeLength in the lsp repository as well.
>>
> Thanks!
>
> Assuming we go with option 2 (including downgrading to an error message),
>> I have two more questions:
>>
>> 1. Which header should I put the utf16Len function in (from
>> SourceCode.cpp)? I tried searching for an existing header with UTF
>> functions but I didn't immediately find one.
>>
> I think SourceCode.h is a reasonable fit within clangd. (There are some
> UTF libraries in llvm/support, but it's not clear this belongs there and
> it'd be a bunch of work).
>
> Consider renaming it something like `lspLength()` or so and keeping the
> bits about UTF-16 in the function comment.
> That's kind of a low-level detail.
>
>
>> 2. How do I go about downgrading to a log message? Do I just write to
>> errs()?
>>
> We use the log() functions in clangd/Logger.h. (Abstracts the destination,
> handles locking, does formatting).
>
> This one is scary+rare enough it should probably be an error log.
> The functions use formatv syntax, so something like:
>   elog("bad TextDocumentContentChangeEvent: rangeLength={0} but computed
> length {1} for {2}",
>        *change.rangeLength, Len, change.text);
>
> Cheers, Sam
>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/clangd-dev/attachments/20181022/da648d1f/attachment.html>


More information about the clangd-dev mailing list