[PATCH] D112695: [clangd] IncludeCleaner: Skip non self-contained headers

Thu Oct 28 11:23:01 PDT 2021

sammccall added inline comments.

================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:247
   for (const Inclusion &MFI : Structure.MainFileIncludes) {
     // FIXME: Skip includes that are not self-contained.
     if (!MFI.HeaderID) {
----------------
kbobyrev wrote:
> sammccall wrote:
> > this is fixed now (though I think it would be better addressed ~here)
> That's what I had initially (you can see the previous version of the patch, after which I sent "oh, wrong place" and changed it). I think my idea was that this looks confusing from the API standpoint. Choosing to filter out non-guarded headers looks like a diagnostics decision and it looked awkward to me to do that here: what `getUnused` does is simply merge two data structures. I think it's somewhat similar to the patch where you mentioned "parse, don't validate" and it is a great argument: the `<built-in>` and `<scratch-space>` are indeed not "headers" in a sensible meaning of the term but non self-contained ones are.
> 
> I don't have a very strong opinion on this, so I put the filtering back here but I'm curious if you agree with my thoughs.
Sorry, I was unclear: I meant we should put the call to `mayConsiderUnused` here, not that we should split it up and put half of it here. So the "written with <" check would also be at this point.

> Choosing to filter out non-guarded headers looks like a diagnostics decision
I don't think so: we **don't know** that a header is unused if we suspect it of being an umbrella header or used for side effects. The reason we're choosing not to diagnose it doesn't have anything to do with **diagnostics** per se.
You could say "unused" has some more narrow definition, but that doesn't seem very useful to me.

> what `getUnused` does is simply merge two data structures
Sure, because the nontrivial parts haven't been implemented yet :-) But its role is to make filtering decisions of Inclusions from the AST to support a high-level concept of "unused", so I think the filtering fits well here.
If we support a non-strict policy, I'd expect getUnused to implement that (maybe with helpers) and that's going to involve some more analysis too.

On the other hand, I think the role of the loop in issueUnusedIncludesDiagnostics *is* simply converting: between an unused include list to diagnostics.

> I think it's somewhat similar to the patch where you mentioned "parse, don't validate" and it is a great argument: the <built-in> and <scratch-space> are indeed not "headers" in a sensible meaning of the term but non self-contained ones are.
I'm not sure the analogy holds: my argument was to do the filtering at the place where you're doing the conversion/parsing that might fail. But here we could perfectly well convert the unused FileIDs to unused Header IDs to unused Inclusions to diagnostics, it would just be logically wrong to do so. We have free choice of where to put that logic, and diagnostic conversion seems like the wrong place to me.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112695/new/

https://reviews.llvm.org/D112695