[PATCH] D115243: [clangd] Extend SymbolOrigin, stop serializing it

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Dec 10 06:44:12 PST 2021


kadircet accepted this revision.
kadircet added a comment.
This revision is now accepted and ready to land.

Thanks, LGTM!

---

> The problem is these methods yield SymbolSlabs, and the symbols within them are frozen/const. There's no provision to "just tweak some bitfield" - we'd have to copy the whole slab.

Yes and no. As mentioned SymbolSlabs are frozen once created and I totally agree with the reasons there. But AFAICT in none of the indexes (apart from monolithic FileIndex) we ever have a single SymbolSlab.
We always have multiple of those and periodically build a single slab out of them, by copying/merging each symbol so that we can build a SymbolIndex on top of the final merged SymbolSlab. I was suggesting to perform ultimate Origin tweaking there (we can even respect the Origins of each individual slabs too).

Couple reasons:

- Definitely less plumbing, we just let FileSymbol now what it should be called during construction. Background/RemoteIndex already has this knowledge hard coded.
- In theory those indexes that operate on top of a multiple SymbolSlabs should be inserting their own "origin" into the final merged slab as it contains some extra logic, just like MergeIndex.

Downside is the possible discrepancy between the invocation of SymbolCollector and the serving of the SymbolSlab acquired from it.
The discrepancy is only possible when we serialize/deserialize the slab between collection and serving, I'd say in those cases the symbol origin should just be Static (and the server should add any extra semantics).

No matter where we do the "overriding" this discrepancy will become an issue today anyways, the consumer of the serialized slabs needs to overwrite them.
I just feel like keeping that semantic to `FileSymbol`/`Background/RemoteIndex` isolates the logic better.

---

I don't feel too strongly about it though, feel free to roll with this version if you don't think that isolation is worth the changes.



================
Comment at: clang-tools-extra/clangd/index/SymbolOrigin.h:26
+  Static = 1 << 2,     // From a static, externally-built index.
   Merge = 1 << 3,      // A non-trivial index merge was performed.
   Identifier = 1 << 4, // Raw identifiers in file.
----------------
sammccall wrote:
> kadircet wrote:
> > do we think this still provides a meaningful signal ?
> > It only provides value only when multiple indices of same type was involved in providing the result (and only that single type of indices). I suppose after this patch it can only happen for `SM` (as there are certain remote-index implementations that mark themselves as static) or `RM` (well this is not possible today, but some day we might have a stack of remote-indices).
> > As soon as there's a different type of index is involved `M` no longer has any meanings, as it's clear that there was some sort of merge happening (or am I missing something obvious?)
> > 
> > ---
> > 
> > Before this patch situation is a little bit different since `FileIndex` just said `D` for both main file and preamble, but that's changing.
> Hmm, we can also get M when merging happens at indexing time due to multiple visits of the same symbol (`DuplicateHandling::Merge`, which is used by background indexing).
> 
> I don't have a strong opinion on its usefulness though. IIRC we added it (and origins) when trying to understand the paths various symbols took through indexes. As a debugging tool some redundancy maybe has some sanity-check value, but we could also drop it.
> 
> I don't think it has so much to do with this patch (unless you think we should reuse the bit instead of expanding to 16)...
> 
> > Before this patch situation is a little bit different since FileIndex just said D for both main file and preamble
> 
> ...it makes sense that this could have been a reason for the flag, but I don't think it actually was :-)
> I don't think it has so much to do with this patch (unless you think we should reuse the bit instead of expanding to 16)...

Nope it isn't directly related to this patch (also, we soon plan to introduce the stdlib kind here, so doesn't really let us get back to 8).
Mostly checking if we can get rid of it and some code governing it.

> Hmm, we can also get M when merging happens at indexing time due to multiple visits of the same symbol (DuplicateHandling::Merge, which is used by background indexing).
> I don't have a strong opinion on its usefulness though. IIRC we added it (and origins) when trying to understand the paths various symbols took through indexes. As a debugging tool some redundancy maybe has some sanity-check value, but we could also drop it.

I see, yeah I am not sure about its usefulness either, sounds like there might still be rare cases this provides value (like debugging some behaviour in background index).
Thanks for the thoughts, definitely no action needed here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115243/new/

https://reviews.llvm.org/D115243



More information about the cfe-commits mailing list