[PATCH] D39050: Add index-while-building support to Clang

Marc-Andre Laperle via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 16 11:49:39 PDT 2018

malaperle added a comment.

In https://reviews.llvm.org/D39050#1037796, @akyrtzi wrote:

> Hey Marc,
> > The fact that both clang and clangd have to agree on the format so that index-while-building can be used seems to make it inherently not possible to extend
> I don't think "not possible to extend" is quite correct, we can make it so that the format allows optional data to be recorded.

That would be good. How would one go about asking Clang to generate this extra information? Would a Clang Plugin be suitable for this? I don't know much about those but perhaps that could be one way to extent the basic behavior of "-index_store_path" in this way?

>>   I changed my prototype so that the end-loc is not stored in the index but rather computed "on the fly" using SourceManager and Lexer only.
> I assume you used SingleFileParseMode+SkipFunctionBodies for this, right ?

No, sorry the end-locs I meant there is for occurrences. Only the lexer was needed to get the end of the token. So for "MyClass o1, o2;" o1 and o2 get highlighted as references to the MyClass constructor.

>> For my little benchmark, I used the LLVM/Clang/Clangd code base which I queried for all references of "std" (the namespace) which is around 46K references in the index.
> This is an interesting use case, and I can say we have some experience because Xcode has this functionality without requiring the end-loc for every reference.
>  So what it does is that it has a 'name' to look for (say 'foo' for  the variable foo) and if it finds the name in the location then it highlights, otherwise if it doesn't find it (e.g. because it is an implicit reference) then it points to the location but doesn't highlight something.

I think it's useful to highlight something even when the name is not there. For example in "MyClass o1, o2;" it feels natural that o1 and o2 would get highlighted.

> The same thing happens for operator overloads (the operators get highlighted at the reference location).

It does? I can only seem to do a textual search. For example, if I look at "FileId::operator<", if I right-click in the middle of "operator<" and do "Find selected symbol in workspace", it seems to start a text based search because there are many results that are semantically unrelated.

> For implicit references it's most likely there's nothing to highlight so the end-loc will most likely be empty anyway (or same as start-loc ?) to indicate an empty range.

I think for those cases the end of the token is probably suitable. Can you give examples which implicit references you have in mind? Maybe another one (other than the constructor mentioned above) could be a function call like "passMeAStdString(MyStringRef)", here the "operator std::string" would be called and MyStringRef could be highlighted, I think it would make sense to the user that is gets called by passing this parameter by seeing the highlight.

> Going back to the topic of what use cases end-loc covers, note that it still seems inadequate for peek-definition functionality. You can't set it to body-end loc (otherwise occurrences highlighting would highlight the whole body which I think is undesirable) and you still need to include doc-comments if they exist.

I think maybe I wasn't clear, I was thinking about two end-locs: end-locs of occurrences and end-locs of bodies. The end-loc of occurrences would be used for highlight when searching for all occurrences and the end-loc for bodies would be used for the peek definition. I think we can disregard end-locs of bodies for now.


More information about the cfe-commits mailing list