[PATCH] D94952: [clangd] Take into account what is in the index (symbols, references, etc.) at indexes merge
Aleksandr Platonov via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Feb 4 02:59:13 PST 2021
ArcsinX added inline comments.
================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:433
PreambleSymbols.update(
- Uri, std::make_unique<SymbolSlab>(std::move(*IF->Symbols)),
+ FilePath ? *FilePath : (consumeError(FilePath.takeError()), Uri),
+ std::make_unique<SymbolSlab>(std::move(*IF->Symbols)),
----------------
ArcsinX wrote:
> sammccall wrote:
> > ArcsinX wrote:
> > > ArcsinX wrote:
> > > > sammccall wrote:
> > > > > ArcsinX wrote:
> > > > > > sammccall wrote:
> > > > > > > Is this change related? It changes the key scheme for the preamble index from URIs to paths, but I'm not sure why.
> > > > > > >
> > > > > > > Do we have multiple URIs pointing to the same path? What are they concretely?
> > > > > > This is the main thing in this patch. I will try to explain.
> > > > > > We use these keys to create the file list, which is used by `indexedFiles()`.
> > > > > > Currently, the preamble index contains URI's instead of paths (as a file list), that leads to the function returned by `PreambleIndex::indexedFiles()` always return `false` (because we pass to this function paths, not URI's). So, we always take data from the preamble index (but maybe we should not in some cases).
> > > > > >
> > > > > Oooh... I'm not sure how I misunderstood the original so much :-( And I missed it in this patch description as well, apologies.
> > > > >
> > > > > My impression was that the file list was derived from the index data, rather than from the keys, which were always intended to be opaque/arbitrary.
> > > > > (At various times, these have been filenames, URIs, and other things IIRC. And until relatively recently, the preamble index keys were the file the preamble was built from, not the file containing the symbol!)
> > > > >
> > > > > It feels like using URIs extracted from symbols might not be *completely* robust. Because having CanonicalDeclaration etc set to a file might not line up exactly with the idea that we indexed the file. But we do use this partitioning for FileShardedIndex, so it has to work well enough.
> > > > >
> > > > > The advantage of using the extracted URIs would be: also works for non-file-sharded indexes like --index-file, avoid a bunch of conversion between URI and path, and we get to keep the simpler/flexible design for FileSymbols where the key is opaque.
> > > > >
> > > > > Does this seem feasible to you?
> > > > > that leads to the function returned by `PreambleIndex::indexedFiles()` always return `false` (because we pass to this function paths, not URI's)
> > > >
> > > > This is a bit incorrect.
> > > > We pass to this function URI, but this URI is converted to path. i.e. `MemIndex::indexedFiles()`, `Dex::indexedFiles()` expect that `Files` are paths, but they are URI's for the preamble index. That's why `PreambleIndex::indexedFiles()` always return `false`.
> > > I also do not like these path <=> URI conversions. But what about empty files?, e.g.:
> > > - open a file
> > > - remove everything from this file
> > > - the dynamic index has no symbols with definition/declaration from this file, so we do not have this file in the dynamic index file list.
> > > - the static index has symbols with definition/declaration from this file, so we have this file in the static index file list.
> > > - we will show stale results from the static index for this file.
> > >
> > >
> > > Unsure, maybe it's ok to ignore the problem with empty files, seems this is the only case when the file was indexed, but we have no symbols located there.
> > >
> > > Overall, I like the idea to use URI's instead of paths. I think we could implement it first as a separate patch and after that return to this one.
> > >
> > > What do you think?
> > Right, completely empty files are going to be mishandled., in the same way that before your changes we mishandled files that had no results from the query.
> >
> > I do still think this is the way to go though, because:
> > - completely empty files are much rarer
> > - again this is a peer to an FileShardedIndex issue where empty files get no shards. Therefore *within* the dynamic index, we'll never clear out the symbols for a file while the file is emptied
> > - having SymbolCollector explicitly the files indexed is the principled solution to both problems, and that seems feasible for us to do at some point.
> >
> > > Overall, I like the idea to use URI's instead of paths. I think we could implement it first as a separate patch and after that return to this one.
> >
> > That sounds great if you don't mind!
> > (Is there a reason we can't land the rest of this patch as-is already?)
> > (Is there a reason we can't land the rest of this patch as-is already?)
> Seems we have no reason =)
> Overall, I like the idea to use URI's instead of paths. I think we could implement it first as a separate patch and after that return to this one.
Seems we need to land this patch first. Without `index contents` it's impossible to use URI's everywhere in the index file list (because the preamble index does not contain references, but without this patch we do not have this information at indexes merge).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94952/new/
https://reviews.llvm.org/D94952
More information about the cfe-commits
mailing list