[cfe-dev] libclang: Resolving dependent names

Jusufadis Bakamovic via cfe-dev cfe-dev at lists.llvm.org
Fri Jan 20 04:49:48 PST 2017


Hi,

I am using the libclang API to implement semantic syntax highlighting. However,
there are a few rough edges that need polishing in order to get
satisfying results. Resolving dependent names (AST nodes whose type is
'CXType_Dependent') is one of those I've come across. In this case libclang
does not provide the same depth of information about the node(s) as it does
for non-dependent-name node(s).

In particular, the following example shows what I am trying to explain
(please mind the comments in the code):

// demo.cpp
#include <vector>

template <typename T>
bool foo() {
    std::vector<T> vec;
    vec._M_impl;        // 'vec' is identified as part of 'CXXDependentScopeMemberExpr' but access to member field '_M_impl' is not and it's completely missing from the AST
    return vec.empty(); // 'vec' is identified as part of 'CXXDependentScopeMemberExpr' but call expression 'empty' is not and it's completely missing from the AST
}

bool bar() {
    std::vector<int> vec;
    vec._M_impl;        // both 'vec' & access to member field '_M_impl' are identified as expected
    return vec.empty(); // both 'vec' & 'empty' are identified as expected
}

// output from clang -Xclang -ast-dump demo.cpp
|-FunctionTemplateDecl 0x564cf4e63c48 <demo.cpp:3:1, line:8:1> line:4:6 foo
| |-TemplateTypeParmDecl 0x564cf4e63af8 <line:3:11, col:20> col:20 referenced typename T
| `-FunctionDecl 0x564cf4e63ba0 <line:4:1, line:8:1> line:4:6 foo '_Bool (void)'
|   `-CompoundStmt 0x564cf4e64190 <col:12, line:8:1>
|     |-DeclStmt 0x564cf4e64038 <line:5:2, col:20>
|     | `-VarDecl 0x564cf4e63fd8 <col:2, col:17> col:17 referenced vec 'std::vector<T>':'vector<T>'
|     |-CXXDependentScopeMemberExpr 0x564cf4e64078 <line:6:2, col:6> '<dependent type>' lvalue
|     | `-DeclRefExpr 0x564cf4e64050 <col:2> 'std::vector<T>':'vector<T>' lvalue Var 0x564cf4e63fd8 'vec' 'std::vector<T>':'vector<T>'
|     `-ReturnStmt 0x564cf4e64178 <line:7:2, col:19>
|       `-CallExpr 0x564cf4e64150 <col:9, col:19> '<dependent type>'
|         `-CXXDependentScopeMemberExpr 0x564cf4e640f8 <col:9, col:13> '<dependent type>' lvalue
|           `-DeclRefExpr 0x564cf4e640d0 <col:9> 'std::vector<T>':'vector<T>' lvalue Var 0x564cf4e63fd8 'vec' 'std::vector<T>':'vector<T>'
`-FunctionDecl 0x564cf4e641e0 <line:10:1, line:14:1> line:10:6 bar '_Bool (void)'
  `-CompoundStmt 0x564cf4e76dd8 <col:12, line:14:1>
    |-DeclStmt 0x564cf4e76898 <line:11:2, col:22>
    | `-VarDecl 0x564cf4e646c8 <col:2, col:19> col:19 used vec 'std::vector<int>':'class std::vector<int, class std::allocator<int> >' callinit
    |   `-CXXConstructExpr 0x564cf4e76868 <col:19> 'std::vector<int>':'class std::vector<int, class std::allocator<int> >' 'void (void)'
    |-MemberExpr 0x564cf4e768f8 <line:12:2, col:6> 'struct std::_Vector_base<int, class std::allocator<int> >::_Vector_impl' lvalue ._M_impl 0x564cf4e6bb78
    | `-ImplicitCastExpr 0x564cf4e768d8 <col:2> 'struct std::_Vector_base<int, class std::allocator<int> >' lvalue <UncheckedDerivedToBase (_Vector_base)>
    |   `-DeclRefExpr 0x564cf4e768b0 <col:2> 'std::vector<int>':'class std::vector<int, class std::allocator<int> >' lvalue Var 0x564cf4e646c8 'vec' 'std::vector<int>':'class std::vector<int, class std::allocator<int> >'
    `-ReturnStmt 0x564cf4e76dc0 <line:13:2, col:19>
      `-CXXMemberCallExpr 0x564cf4e76d50 <col:9, col:19> '_Bool'
        `-MemberExpr 0x564cf4e76d18 <col:9, col:13> '<bound member function type>' .empty 0x564cf4e6fc20
          `-ImplicitCastExpr 0x564cf4e76da8 <col:9> 'const class std::vector<int, class std::allocator<int> >' lvalue <NoOp>
            `-DeclRefExpr 0x564cf4e76cf0 <col:9> 'std::vector<int>':'class std::vector<int, class std::allocator<int> >' lvalue Var 0x564cf4e646c8 'vec' 'std::vector<int>':'class std::vector<int, class std::allocator<int> >'

Is this really a technical limitation from the language point of view, or an
implementation detail that is still missing? 14.6.2 [temp.dep] defines
dependent names as constructs whose semantics may differ from one
instantiation to another. However, I am not quite sure I understand this
correctly, because whether something is a data member or a member function
cannot really change across different instantiations, can it?

I managed to work around this issue by tokenizing such (dependent)
nodes and then trying to deduce their kinds by inspecting their parent
in the AST. I.e. 'MEMBER_REF_EXPR' dependent-name nodes
('CXType_Dependent') which have 'CALL_EXPR' as their direct AST parent are
resolved as 'CXCursor_CXXMethod'. Otherwise, they are resolved as
'CXCursor_FieldDecl'. This satisfies the use case which I am currently
trying to cover, and it seems to work, but I think it would be better to
have this functionality provided by the library, ideally as a more generic
solution which would fit other use cases as well (such as
non-'MEMBER_REF_EXPR' nodes). If this is possible, I would be happy to
contribute.
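
To make the parent-based heuristic concrete, here is a minimal sketch of the
decision rule only. The enums 'NodeKind' and 'Resolved' and the function
'resolveDependentMember' are hypothetical stand-ins made up for illustration;
they are not libclang API. With libclang, the inputs would presumably come
from clang_getCursorKind() on the cursor and on the parent passed to the
clang_visitChildren() visitor, and from comparing
clang_getCursorType(cursor).kind against CXType_Dependent.

```cpp
// Hypothetical stand-ins for the libclang cursor kinds discussed above.
// These enums are NOT part of libclang; they only model the decision rule.
enum class NodeKind { MemberRefExpr, CallExpr, Other };
enum class Resolved { Method, Field, Unknown };

// Guess what a dependent member reference names by looking at its direct
// AST parent: a CallExpr parent suggests a method, anything else a field.
// Nodes that are not dependent member references are left as Unknown.
constexpr Resolved resolveDependentMember(NodeKind kind, bool isDependent,
                                          NodeKind parentKind) {
    return (kind != NodeKind::MemberRefExpr || !isDependent)
               ? Resolved::Unknown
               : (parentKind == NodeKind::CallExpr ? Resolved::Method
                                                   : Resolved::Field);
}

// 'vec.empty()' in foo(): dependent member reference directly under a call.
static_assert(resolveDependentMember(NodeKind::MemberRefExpr, true,
                                     NodeKind::CallExpr) == Resolved::Method,
              "call parent should resolve to a method");

// 'vec._M_impl' in foo(): dependent member reference with a non-call parent.
static_assert(resolveDependentMember(NodeKind::MemberRefExpr, true,
                                     NodeKind::Other) == Resolved::Field,
              "non-call parent should resolve to a field");
```

Note that this is only a guess based on syntax: a call through a data member
of callable type (e.g. a function object) would also be reported as a method.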

Cheers,
Adi