[PATCH] D93531: [clangd] Reuse buffer for JSONTransport::readRawMessage

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Dec 18 14:30:38 PST 2020


sammccall added a comment.

In D93531#2463126 <https://reviews.llvm.org/D93531#2463126>, @njames93 wrote:

> In D93531#2463052 <https://reviews.llvm.org/D93531#2463052>, @sammccall wrote:
>
>> This adds a bit of complexity, making the code here a fair amount harder to follow and to verify for correctness.
>>
>> - Do we have evidence that these allocations are causing a problem (e.g., do we observe a significant decrease in RSS after the patch)? Naively I would expect these allocations to be basically unimportant compared to those of the JSON objects themselves. (And I don't particularly expect either of them to be significant - the comment on the other review was really just "0 probably isn't the right arg to `malloc_trim` if there's any allocation going on".)
>
> It's not causing problems per se, but given that the incoming JSON messages can contain a whole file plus things like escape chars, it's wise to allocate a buffer that will grow to the largest JSON it receives but never shrink.

Well, it's also wise to measure before optimizing. The large requests (ones containing a whole file) typically cause us to parse a C++ TU. After we read this JSON message, we will immediately (roughly sketched in code after this list):

- allocate a std::string to hold the decoded file content payload as part of the json::Value
- copy that string into the DraftManager
- allocate a MemoryBuffer and copy the file into it, for parsing
- read all of the thousands of transitively included headers into newly-allocated MemoryBuffers
- run the clang parser and then the clangd indexer
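
Roughly (an illustrative sketch only, not the actual clangd code - the function name and the elided accessor details are made up):

```cpp
// Illustrative sketch: what happens to the file content after
// readRawMessage() hands us the raw JSON string.
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/JSON.h"
#include "llvm/Support/MemoryBuffer.h"
#include <memory>
#include <string>

void sketchDidOpen(const std::string &RawJSON) {
  // 1) Parsing allocates a whole json::Value tree; the decoded file text
  //    lands in a fresh std::string inside it (one full copy of the file).
  llvm::Expected<llvm::json::Value> Msg = llvm::json::parse(RawJSON);
  if (!Msg) {
    llvm::consumeError(Msg.takeError()); // error handling elided
    return;
  }

  // Pretend we pulled the "text" field out of the parsed message; the exact
  // accessor chain is elided here.
  llvm::StringRef FileText = "";

  // 2) The draft store keeps its own copy of the file.
  std::string Draft = FileText.str();

  // 3) Building the compile input copies it again into a MemoryBuffer, and
  //    then clang reads thousands of headers into more MemoryBuffers,
  //    parses, and clangd indexes the result.
  std::unique_ptr<llvm::MemoryBuffer> Buf =
      llvm::MemoryBuffer::getMemBufferCopy(Draft, "main.cpp");
  (void)Buf;
}
```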

I think the most likely case is that this change has no observable impact on performance or resource usage. If that's true, I don't want to land it - it makes the code harder to reason about for no detectable benefit.

It's definitely *plausible* there's something surprising going on with allocation patterns such that this makes a difference; the impact of `malloc_trim` shows that clangd's overall allocation behaviour can have unexpected effects. But right now I don't see any reason to think that these allocations matter.
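
(For reference, the point on the other review was only about the pad argument - something like this glibc-only call, with a made-up headroom value, not exactly what clangd does:)

```cpp
#include <cstddef>
#include <malloc.h> // glibc-specific

void trimHeap() {
  // Return free heap pages to the OS, but keep some headroom (the value here
  // is just a guess) so the next burst of allocations doesn't immediately
  // fault the pages back in. Passing 0 releases as much as possible, which is
  // likely the wrong trade-off if more allocation is about to happen.
  const std::size_t KeepBytes = 20 * 1024 * 1024;
  malloc_trim(KeepBytes);
}
```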

>> - There seem to be simpler ways to structure this while still avoiding allocations. JSONLineBuffer is effectively statically bounded, and can be `SmallString<32>` or so. The content buffer could simply be passed in, if I'm reading right: `bool readRawMessage(std::string&)`? OutputBuffer probably does need to be a member variable, though.
>
> It's not statically bounded, unfortunately - the length is however long a line in the JSON is, which can be arbitrarily long. The read-line function uses a buffer size of 1024 as an upper bound.

JSONLineBuffer is not used for reading the JSON itself, but rather the HTTP-style headers. The only headers supported are `Content-Length` and `Content-Type`, and I don't believe clients send `Content-Type` in practice (there's only one supported value). In any case, you'll never see a line beyond 256 chars from a well-behaved client. ("Statically bounded" is too strong, but from a performance-tuning perspective it's close enough.)
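
Roughly the shape I have in mind (a sketch only - `readHeaderLine` and `In` are made-up names, and the real transport differs in detail):

```cpp
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringRef.h"
#include <cstdio>
#include <string>

class JSONTransport {
public:
  // The caller owns (and can reuse) the content buffer; header lines go into
  // a small inline buffer, since well-behaved clients only send short
  // Content-Length / Content-Type lines.
  bool readRawMessage(std::string &JSON) {
    llvm::SmallString<128> Line;
    size_t ContentLength = 0;
    while (readHeaderLine(Line) && !Line.empty()) {
      llvm::StringRef L = Line.str();
      // Only Content-Length matters; Content-Type and anything else is ignored.
      if (L.consume_front("Content-Length: "))
        (void)L.getAsInteger(10, ContentLength);
    }
    if (ContentLength == 0)
      return false;
    JSON.resize(ContentLength); // capacity is reused across calls
    return std::fread(&JSON[0], 1, ContentLength, In) == ContentLength;
  }

private:
  // Hypothetical helper: read one header line (up to '\n'), dropping "\r\n".
  bool readHeaderLine(llvm::SmallVectorImpl<char> &Buf) {
    Buf.clear();
    int C;
    while ((C = std::getc(In)) != EOF && C != '\n')
      if (C != '\r')
        Buf.push_back(static_cast<char>(C));
    return !std::ferror(In);
  }

  std::FILE *In = stdin;
};
```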


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93531/new/

https://reviews.llvm.org/D93531


