[PATCH] D135859: [Includecleaner] Introduce RefType to ast walking

Fri Oct 28 05:51:10 PDT 2022

kadircet marked 6 inline comments as done.
kadircet added inline comments.

================
Comment at: clang-tools-extra/include-cleaner/lib/AnalysisInternal.h:33
+/// Indicates the relation between the reference and the target.
+enum class RefType {
+  /// Target is explicit from the reference, e.g. function call.
----------------
sammccall wrote:
> kadircet wrote:
> > hokein wrote:
> > > I'm wondering what's our plan of supporting policy (different coding-style may have different decisions about which includes are used)?
> > > 
> > > IIUC, the RefType is part of the picture, providing fine-grained information about each reference, and the caller can make their decisions based on it?
> > > 
> > > Thinking about the `Implicit` type, I think cases like non-spelled constructor call, implicit conversion calls (maybe more?) fall into this type, if we support `Policy.Constructor`, and `Policy.Conversion`, how do we distinguish with these cases? We can probably do some ad-hoc checks on the `TargetDecl`, but I'm not sure that the tuple `<SourceLocation, TargetDecl, Ref>` will provide enough information to implement different policy .
> > > IIUC, the RefType is part of the picture, providing fine-grained information about each reference, and the caller can make their decisions based on it?
> > 
> > Exactly that's the idea.
> > 
> > > if we support Policy.Constructor, and Policy.Conversion, how do we distinguish with these cases? We can probably do some ad-hoc checks on the TargetDecl
> > 
> > In our previous discussions we've also come to the conclusion that TargetDecl will have enough information for the caller to infer such information later on. We've decided to not make it part of the public api (at least initially) because it's unclear how widely useful those signals will be. But if it turns out to be needed by a variety of applications the design is extensible to either provide a:
> > - inferDetails(Symbol) helper, which would derive a signal structure that extracts most of those signals needed, or
> > - update Callback signature to also carry that information derived from the Symbol
> > 
> > Does that make sense?
> > In our previous discussions we've also come to the conclusion that TargetDecl will have enough information for the caller to infer such information later on
> 
> It was a while ago and my memory is bad, but this isn't my recollection (maybe you're talking about different discussions). I think it was rather "this is the bare minimum we need to get going, and we want to punt on real policy support for now".
> 
> At least "TargetDecl will have enough information" seems implausible on its face, e.g. how could you implement "types of variables are used, but types of expressions are not used" based on inspecting TargetDecl?
> It was a while ago and my memory is bad, but this isn't my recollection (maybe you're talking about different discussions). I think it was rather "this is the bare minimum we need to get going, and we want to punt on real policy support for now".
> At least "TargetDecl will have enough information" seems implausible on its face, e.g. how could you implement "types of variables are used, but types of expressions are not used" based on inspecting TargetDecl?

Right. I wasn't talking about inferring signals about how a decl was found (e.g. type of an expression, as you've also mentioned), but rather talking about information about the decl itself (is this a member, constructor etc.).

================
Comment at: clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp:107
 TEST(WalkAST, DeclRef) {
-  testWalk("int ^x;", "int y = ^x;");
-  testWalk("int ^foo();", "int y = ^foo();");
-  testWalk("namespace ns { int ^x; }", "int y = ns::^x;");
-  testWalk("struct S { static int ^x; };", "int y = S::^x;");
+  testWalk("int $explicit^x;", "int y = ^x;");
+  testWalk("int $explicit^foo();", "int y = ^foo();");
----------------
sammccall wrote:
> kadircet wrote:
> > sammccall wrote:
> > > This is very noisy. please find a way to avoid having to mark all test cases as `$explicit`, e.g. by splitting implicit cases into different tests
> > what about treating non-named points as explicit? that way we would only annotate non-explicit references (which are rare).
> Not sure implicit references are actually rare, more like our examples are trivial for now :-)
> 
> But I might be wrong, up to you.
> 
> (I would at least use $A^x instead of $ambiguous^x)
i don't like the idea of splitting tests because i feel like that'll lead to duplicating lots of test cases when we have a mixture of references happening (e.g. implicit type conversion and an explicitly spelled type from same reference).

so another option i can think of here is returning annotations for points from testWalk and building expectations on top of that, e.g:
```
EXPECT_REFERENCES("void ^foo();", "void bar() { ^foo(); }", {"explicit"}); // checks that points in target have respective reference types
EXPECT_EXPLICIT_REFERENCES("void ^foo();", "void bar() { ^foo(); }"); // shorthand representation for making sure all the references are with a single annotation.
```

WDYT?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135859/new/

https://reviews.llvm.org/D135859