[PATCH] D56444: [AST] RecursiveASTVisitor visits lambda classes when implicit visitation is on.

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jan 9 06:56:15 PST 2019


aaron.ballman added a comment.

In D56444#1351067 <https://reviews.llvm.org/D56444#1351067>, @klimek wrote:

> In D56444#1351056 <https://reviews.llvm.org/D56444#1351056>, @aaron.ballman wrote:
>
> > In D56444#1350849 <https://reviews.llvm.org/D56444#1350849>, @JonasToth wrote:
> >
> > > In D56444#1350746 <https://reviews.llvm.org/D56444#1350746>, @sammccall wrote:
> > >
> > > > @klimek: would it be better to preserve the odd behavior of the `functionDecl()` matcher, and add a new `functionOrLambdaDecl()`? It seems too surprising to me.
> > >
> > >
> > > We can change the clang-tidy check as well. I think lambdas should be considered functionDecls, shouldnt they? WDYT @aaron.ballman ?
> >
> >
> > I think it depends on whether we want the AST matchers to match what the user wrote (syntax) or how the program behaves (semantics). If the user writes a lambda, they did not syntactically write a function declaration and so it makes sense for `functionDecl()` to not match. However, the semantics of a lambda are that they act as a functor (in effect) which includes a class declaration, a function declaration for the function call operator, conversion operators, etc and so it does make sense for users to want to match those semantic components. Given that, I kind of think we should have functionDecl() match only functions, and give users some other way to match the semantic declarations in a consistent manner. Alternatively, we could decide semantics are what we want to match (because it's what the AST encodes) and instead we give users a way to request to only match syntax.
>
>
> The problem is that while I agree it would be nice if we matched either exactly the semantic or syntactic form of the code, the reality is that we match whatever the AST represents when we visit all nodes. Instead o f promising folks something that we can't hold (because nobody has time to fully check which matchers will match which form), I think telling people that we match the AST is the more honest and less surprising result in the end.


I both agree and disagree. I think being able to point users to the concrete thing we're matching against is really useful (and honest). I also think surprising is subjective -- the current behavior with lambdas is baffling to me because it's an inconsistent mixture of syntax and semantics that we match against and that can be hard to remember when writing matchers. Also, it's hard to say we're matching the exact AST for lambdas.

  int main() {
    auto l = [](){ return 12; };
    int (*fp)() = l;
  }

gives us this AST

  TranslationUnitDecl 0x2c3521ed618 <<invalid sloc>> <invalid sloc>
  |-TypedefDecl 0x2c3521edef0 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
  | `-BuiltinType 0x2c3521edbb0 '__int128'
  |-TypedefDecl 0x2c3521edf58 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
  | `-BuiltinType 0x2c3521edbd0 'unsigned __int128'
  |-TypedefDecl 0x2c3521ee2a8 <<invalid sloc>> <invalid sloc> implicit __NSConstantString '__NSConstantString_tag'
  | `-RecordType 0x2c3521ee030 '__NSConstantString_tag'
  |   `-CXXRecord 0x2c3521edfa8 '__NSConstantString_tag'
  |-TypedefDecl 0x2c3521ee340 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list 'char *'
  | `-PointerType 0x2c3521ee300 'char *'
  |   `-BuiltinType 0x2c3521ed6b0 'char'
  |-TypedefDecl 0x2c3521ee3a8 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list 'char *'
  | `-PointerType 0x2c3521ee300 'char *'
  |   `-BuiltinType 0x2c3521ed6b0 'char'
  `-FunctionDecl 0x2c3521ee450 <C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:1:1, line:4:1> line:1:5 main 'int ()'
    `-CompoundStmt 0x2c3524c1d00 <col:12, line:4:1>
      |-DeclStmt 0x2c3524c1a40 <line:2:3, col:30>
      | `-VarDecl 0x2c3521ee590 <col:3, col:29> col:8 used l '(lambda at C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:2:12)':'(lambda at C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:2:12)' cinit
      |   `-LambdaExpr 0x2c3524c1830 <col:12, col:29> '(lambda at C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:2:12)'
      |     |-CXXRecordDecl 0x2c3524c1228 <col:12> col:12 implicit class definition
      |     | |-DefinitionData lambda pass_in_registers empty standard_layout trivially_copyable literal can_const_default_init
      |     | | |-DefaultConstructor defaulted_is_constexpr
      |     | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
      |     | | |-MoveConstructor exists simple trivial needs_implicit
      |     | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
      |     | | |-MoveAssignment
      |     | | `-Destructor simple irrelevant trivial
      |     | |-CXXMethodDecl 0x2c3524c1360 <col:15, col:29> col:12 used constexpr operator() 'int () const' inline
      |     | | `-CompoundStmt 0x2c3524c1568 <col:16, col:29>
      |     | |   `-ReturnStmt 0x2c3524c1558 <col:18, col:25>
      |     | |     `-IntegerLiteral 0x2c3524c1408 <col:25> 'int' 12
      |     | |-CXXConversionDecl 0x2c3524c16e0 <col:12> col:12 implicit used constexpr operator int (*)() 'int (*() const)()' inline
      |     | | `-CompoundStmt 0x2c3524c1c98 <col:12>
      |     | |   `-ReturnStmt 0x2c3524c1c88 <col:12>
      |     | |     `-ImplicitCastExpr 0x2c3524c1c70 <col:12> 'int (*)()' <FunctionToPointerDecay>
      |     | |       `-DeclRefExpr 0x2c3524c1c50 <col:12> 'int ()' lvalue CXXMethod 0x2c3524c1780 '__invoke' 'int ()'
      |     | |-CXXMethodDecl 0x2c3524c1780 <col:12> col:12 implicit used __invoke 'int ()' static inline
      |     | | `-CompoundStmt 0x2c3524c1c40 <col:12>
      |     | `-CXXDestructorDecl 0x2c3524c1860 <col:12> col:12 implicit referenced ~ 'void () noexcept' inline default trivial
      |     `-CompoundStmt 0x2c3524c1568 <col:16, col:29>
      |       `-ReturnStmt 0x2c3524c1558 <col:18, col:25>
      |         `-IntegerLiteral 0x2c3524c1408 <col:25> 'int' 12
      `-DeclStmt 0x2c3524c1ce8 <line:3:3, col:18>
        `-VarDecl 0x2c3524c1af0 <col:3, col:17> col:9 fp 'int (*)()' cinit
          `-ImplicitCastExpr 0x2c3524c1cd0 <col:17> 'int (*)()' <UserDefinedConversion>
            `-CXXMemberCallExpr 0x2c3524c1cb0 <col:17> 'int (*)()'
              `-MemberExpr 0x2c3524c1c10 <col:17> '<bound member function type>' .operator int (*)() 0x2c3524c16e0
                `-ImplicitCastExpr 0x2c3524c1bf8 <col:17> 'const (lambda at C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:2:12)' lvalue <NoOp>
                  `-DeclRefExpr 0x2c3524c1b50 <col:17> '(lambda at C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:2:12)':'(lambda at C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:2:12)' lvalue Var 0x2c3521ee590 'l' '(lambda at C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:2:12)':'(lambda at C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:2:12)'

but the matcher behavior is

  clang-query> m functionDecl()
  
  Match #1:
  
  C:\Users\aballman.GRAMMATECH\Desktop\test.cpp:1:1: note: "root" binds here
  int main() {
  ^~~~~~~~~~~~
  1 match.
  clang-query> m cxxRecordDecl()
  
  Match #1:
  
  1 match.

So we match the `CXXRecordDecl` for the lambda but none of the functions declared within it.

However, as you say, we may not be able to support syntactic vs semantic traversals. I was envisioning something along the lines of the syntactic traversal looking at the source location of the AST node when deciding whether to match -- if the location isn't somewhere in user code, then don't consider the node or its children for traversal. However, that may be insufficient and equally as mysterious behavior.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56444/new/

https://reviews.llvm.org/D56444





More information about the cfe-commits mailing list