[PATCH] D47063: [clangd] Keep only a limited number of idle ASTs in memory

Wed May 23 07:38:10 PDT 2018

sammccall added a comment.

(Haven't reviewed the code yet, just design stuff)
I'm tempted to push back on the idea that we need to store any - "if it's fast enough for code complete, it's fast enough for GTD". But that's probably oversimplifying - we don't force reuse of a stable preamble for GTD I think. So we probably do want some cache.

In https://reviews.llvm.org/D47063#1104495, @malaperle wrote:

> In https://reviews.llvm.org/D47063#1104417, @ilya-biryukov wrote:
>
> > Maybe we can even store enough information to not need the AST for navigation at all and build it only for features like refactorings.
> >  @sammccall, let me know what are your thoughts on all of this.
>
>
> That's what I was thinking. I think I also had similar discussion with @arphaman. I think storing a limited number of ASTs is a good interim solution.

Agree - it's a good idea but we shouldn't block on it. A general-purpose xrefs index may solve this problem (and others) but requires a bunch of design. A narrower structure risks building a bunch of complexity for one feature we can't reuse for the next.

> Another alternative that I've considered was evicting the ASTs from memory after a certain period of time, e.g. after 30 seconds of inactivity for some file.

We discussed this a bit more offline. A time-based limit reduces idle RAM usage, and a weight limit (or just count) reduces peak RAM.
Relatively speaking, idle seems to be more important to our hosted/multiplexed use cases, and peak is more important when running on a developer machine.
Weight is probably easier to tune. Time is easier to implement as the TUs are independent.

So this gives us some options (assuming we want some caching, but not unlimited):

- Evict if old - this helps hosted a lot, and is simple to implement.
- Evict if full (this patch) - this helps standalone, and is more complex.
- Evict if full AND old - this can be tuned nicely for hosted and standalone. Most complex, but only a little more than the previous option.
- Evict if full OR old - this puts a hard limit on resource usage and controls idle, but doesn't seem useful for hosted as it can't drive idle to zero.

I think the main design decision is whether the cache logic requires TUs to interact (vs simple time eviction). If we pay that complexity cost we get a lot of flexibility to tweak the cache. It's the hosted stuff that's driving this (for Google) right now, but maybe that's just because we're not really measuring impact on workstations yet :)
So I think I like this policy as a starting point, but we should plan to bolt on time-based-expiry in the near future.

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D47063