[PATCH] D52076: [CodeComplete] Add completions for filenames in #include directives.

Ilya Biryukov via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Sep 17 03:40:52 PDT 2018


ilya-biryukov added a comment.

Sorry for the late response.

Please see the comments about adding closing quotes and slashes. It does not seem to work in the clients that auto-add closing quotes or when completing inside existing includes.



================
Comment at: lib/Lex/Lexer.cpp:1931
+          StringRef PartialPath(AfterQuote, CurPtr - 1 - AfterQuote);
+          auto Slash = PartialPath.rfind('/');
+          StringRef Dir =
----------------
sammccall wrote:
> ilya-biryukov wrote:
> > It's also valid to use `\` for path separators on Windows, maybe also handle those?
> > Clang also supports this when running e.g. `clang-cl /c foo.cpp`.
> Ah yuck, that's true (for some value of "valid"). Pre-C++11 it was explicitly UB...
> 
> So if a codebase consistently uses backslashes, we're going to break that style at least sometimes: we can't know whether to complete `#include <foo` as `<foobar/` or `<foobar\`.
> 
> Any objection to just normalizing slashes for the whole include as part of completion?
> e.g. `#include <foo\ba` would complete as `#include <foo/bar/` and `#include <foo/baz.h>`?
> 
> Anything else is going to be fiddly, probably only help a few users a little bit, and leave edge cases.
If we normalize, we probably want a config option for Windows users to be friendly to them.
No good layer to add the option, though (except a compiler flag, but that looks weird), so I don't think that's directly addressable. 

Also, do we want to actually change slashes **before** the path component we are completing in the same include directive? 
1. If we do, I suspect we might have problems with the existing language clients. We ran into problems when trying to add `.` to `->` fix-its.
2. If we don't, the completion label would look differently from what's written. That's probably not a big deal.


================
Comment at: lib/Lex/Lexer.cpp:2065
+    if (C == '\n' || C == '\r' ||                // Newline.
+        (C == 0 && (CurPtr - 1 == BufferEnd))) { // End of file.
       // If the filename is unterminated, then it must just be a lone <
----------------
sammccall wrote:
> ilya-biryukov wrote:
> > Does that mean we're losing completion at the end of file?
> > Not sure if it's a big deal. The only common pattern I can come up with is starting with an empty file and writing:
> > ```
> > #include "^
> > ``` 
> > 
> > Should we handle that? WDYT?
> Ah no, it doesn't, though I thought so at first.
> 
> First: the address of `C` is `CurPtr-1` (getAndAdvanceChar is in spirit `return *CurPtr++`).
> 
> Second: the buffer is null-terminated one-past-the-end (`*BufferEnd == 0`) and **also** has an embedded null at the completion point. So when completing at EOF it looks like:
> ```
> # i n c l u d e < a \0 \0
>                     ^C ^CurPtr
> ^BufferStart    ^BufPtr^BufferEnd
> ```
> and we hit the case below. We only hit this case if we never saw a completion point.
Ah, I see. Thanks for explanation. LG


================
Comment at: lib/Sema/SemaCodeComplete.cpp:8021
+    // Directory completion is up to the slash, e.g. <sys/
+    TypedChunk.push_back(IsDirectory ? '/' : Angled ? '>' : '"');
+    auto R = SeenResults.insert(TypedChunk);
----------------
sammccall wrote:
> ilya-biryukov wrote:
> > Could we avoid adding `/`, `>` and `"` if it's currently missing?
> > Otherwise it'll be annoying for editors that auto-add closing quotes and completions when editing existing includes, i.e.
> > ```
> > // It's cool if we complete bar.h" (with closing quote) here:
> > #include "foo/^
> > // It's annoying if we complete an extra closing quote here:
> > #include "foo/^"
> > ```
> Don't editors that add quotes/parens typically avoid duplicating them when they're typed/completed?
> 
> This isn't actually easy to do, because after completing `file.h` you want the cursor after the closing quote, and after completing `dir/` you want the cursor inside the closing quote. I don't see a way to do this with CodeCompletionString other than generating the quote in the former case but not the latter. (At least not one that doesn't break something else...)
> Don't editors that add quotes/parens typically avoid duplicating them when they're typed/completed?
They do that on typing, but not on completions, which are treated literally, e.g. they'll an extra quote is actually added.
I've checked in VSCode and it produces double quotes in those cases. Not sure about the other editors that have this.
That looks like a corner case my intuition is that other editors won't do what we want either.


================
Comment at: lib/Sema/SemaCodeComplete.cpp:8028
+                                    CodeCompleter->getCodeCompletionTUInfo());
+      Builder.AddTextChunk(InternedPrefix);
+      Builder.AddTypedTextChunk(InternedTyped);
----------------
sammccall wrote:
> ilya-biryukov wrote:
> > Maybe leave this out?
> > When completing `std::^` the completion is `vector`, not `std::vector`.
> > In the same spirit, for includes I would expect completions for `#include <foo/^>` to be `bar.h`, not `<foo/bar.h>`.
> I tried a few iterations here and this was the only one that didn't seem totally odd. Basically it's about the fact that these are quoted strings.
> 
> Ergonomically, having the closing quote completed after a filename *feels* right - you can hit enter afterwards and go to the next line.
> If the closing quote is inserted, it has to be displayed (can't be avoided, also would be confusing).
> 
> So if you make the completion range just the filename, you end up with a completion list like:
> ```
> #include <foo/ba^`
>               bar.h>
>               baz/
>               backwards.h>  
> ```
> vs the current
> ```
> #include <foo/ba^`
>          <foo/bar.h>
>          <foo/baz/
>          <foo/backwards.h>  
> ```
> 
> So I see your point here but I'm not sure the other behavior is actually better.
> Ergonomically, having the closing quote completed after a filename *feels* right - you can hit enter afterwards and go to the next line.
It definitely gives the best experience in Vim (which does not add closing quotes automatically) when writing new code. 
The results are confusing in editors that insert quotes, e.g.
```
#include "foo/bar/^" --> #include "foo/bar/baz.h""
```
And when completing in existing code even in Vim:
```
#include "foo/bar^/baz.h" --> #include "foo/baraba//baz.h"
#include "foo/bar/baz.h^" --> #include "foo/bar/baz.h""
```

The added ergonomics is definitely nice in some cases, but the added confusion in other (not the most frequent, but still quite common) cases doesn't look nice.
WDYT?


================
Comment at: lib/Sema/SemaCodeComplete.cpp:8076
+      // header maps are not (currently) enumerable.
+      break;
+    case DirectoryLookup::LT_NormalDir:
----------------
sammccall wrote:
> ilya-biryukov wrote:
> > NIT: Maybe return from each branch and put llvm_unreachable at the end of the function?
> > To get an error in case the new enum value is added in addition to the compiler warning.
> > 
> > Feel free to ignore if you like the current version more, the compiler warning is not likely to go unnoticed anyway.
> That won't give an error, that will give undefined behavior! (If a new value is added and the warning is missed)
> The current code will just skip over the directory in that case, which I think is fine. (We have -Werror buildbots)
> 
> (Unless you mean return a dummy value, which is a little to unusual for my taste)
Agree, dummy values are definitely worse.
I thought it deterministically crashes when compiled without optimizations, so you get a runtime error in addition to the compilation warning.

But yeah, -Werror buildbots will quickly catch it anyway, so it's not a big deal.



Repository:
  rC Clang

https://reviews.llvm.org/D52076





More information about the cfe-commits mailing list