[PATCH] D78740: [clangd] Handle PresumedLocations in IncludeCollector

Fri May 1 17:17:24 PDT 2020

kadircet added inline comments.

================
Comment at: clang-tools-extra/clangd/Headers.cpp:40
+      auto PreLoc = SM.getPresumedLoc(HashLoc);
+      if (auto FE = SM.getFileManager().getFile(PreLoc.getFilename())) {
+        if (SM.getFileEntryForID(SM.getMainFileID()) == *FE) {
----------------
sammccall wrote:
> wouldn't be SM.getFileEntryForID(PreLoc.FID) be more efficient?
well, that would've been even easier to just check for `PreLoc.FID == SM.getMainFileID()`, but unfortunately `PresumedLoc.FID` is invalid in presence of line directives :/

but we can do something like storing the MainFilePath and then just compare the PreLoc.getFileName with it.

================
Comment at: clang-tools-extra/clangd/Headers.cpp:44
+                                        PreLoc.getColumn());
+          PreLoc = SM.getPresumedLoc(FilenameRange.getBegin());
+          auto FileNameBegin = SM.translateLineCol(
----------------
sammccall wrote:
> This part looks a little iffy to me, with all the coordinate transforms.
> 
> If we're synthesizing the include, chars don't have to match 1:1 right?
> e.g. if the original code was `#   include     /* foo */ "bar.h" // baz` and we synthesize `#include "bar.h"`, how is this going to get the coordinates of "bar.h" right?
> 
> This seems awkward to resolve. `R` isn't actually used much though, go-to-definition looks at its line number only, and DocumentLink uses it (but it seems OK to just to do approximate re-lexing there). Maybe we can just drop it?
> 
> ---
> (Original comment disregarding above problem)
> 
> Isn't it the case that the filename expansion location has to be in the same file as the hashloc?
> So can we do something like:
> 
> ```
> FilenameRange = SM.getExpansionRange(FilenameRange);
> if (SM.getFileID(FilenameRange.start()) == SM.getFileID(FilenameRange.end()) == SM.getFileID(OrigHashLoc)) {
>   // all in the same file
>   // compute NewStart = OrigStart - OrigHashLoc + NewHashLoc, etc
> } else {
>   FilenameRange = CharSourceRange();
> }
> ```
> This part looks a little iffy to me, with all the coordinate transforms.
> 
> If we're synthesizing the include, chars don't have to match 1:1 right?
> e.g. if the original code was #   include     /* foo */ "bar.h" // baz and we synthesize #include "bar.h", how is this going to get the coordinates of "bar.h" right?

well, the patching actually ensures both `#` and `"filename"` starts at the correct offset, by padding them with whitespaces ignoring any comments and such.

> 
> This seems awkward to resolve. R isn't actually used much though, go-to-definition looks at its line number only, and DocumentLink uses it (but it seems OK to just to do approximate re-lexing there). Maybe we can just drop it?

I am fine with dropping it too, the padding looks really ugly in the patching code :D.

Regarding go-to-def, I suppose we can keep storing the include line, since we calculate it anyway while getting the presumed location for HashLoc.

For DocumentLink, I suppose we can either lex while handling the request or store those separately in parsedast. I would go with the former.

WDYT?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78740/new/

https://reviews.llvm.org/D78740