[cfe-dev] walking macros with clang-c

Daniel Herring dherring at tentpost.com
Sun May 22 20:47:16 PDT 2011


On Wed, 11 May 2011, Douglas Gregor wrote:
> On May 11, 2011, at 6:23 PM, Daniel Herring wrote:
>> Is there a clang-c equivalent of the MacroInfo class?
>> http://clang.llvm.org/doxygen/classclang_1_1MacroInfo.html
>>
>> In particular, I want to know isFunctionLike/isObjectLike and to access
>> the argument and token lists.  Variadic and builtin information would be
>> nice but not necessary at the moment.
...
> libclang doesn't actually expose that information, but it would be really cool if it did.
...
> Here's how I'd tackle it: add a function that takes a macro-definition cursor, which would go into tools/libclang/CIndex.cpp. This function would re-preprocess that line of source code to create a MacroInfo object and return information about it (e.g., the tokens can be CXTokens, the macro arguments can be an array of strings, etc.).

Thanks for the reply.  I finally made some progress on this, and my 
initial experiment is giving "interesting" results.  The following 
snippet is C code, but it is compiled as C++ because I was experimenting 
with MacroInfo in the same file.  I haven't gotten far enough to know 
whether the MacroInfo API will help.

   if(kind==CXCursor_MacroDefinition)
     {
       CXSourceLocation sl=clang_getCursorLocation(cursor);
       /* Source range that libclang reports for the macro definition. */
       CXSourceRange sr=clang_getCursorExtent(cursor);

       CXToken *tokens;
       unsigned num;
       clang_tokenize(tu, sr, &tokens, &num);
       /* num is unsigned, so the index is unsigned as well. */
       for(unsigned i=0; i<num; i++)
         {
           /* TokenSpellings maps a CXTokenKind to a printable name. */
           CXTokenKind tk=clang_getTokenKind(tokens[i]);
           CXString s=clang_getTokenSpelling(tu, tokens[i]);
           printf("\ttoken, %s: %s\n", TokenSpellings[tk], clang_getCString(s));
           clang_disposeString(s);
         }
       clang_disposeTokens(tu, tokens, num);
     }
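
The snippet uses a TokenSpellings table that is not shown.  A minimal 
sketch of what such a table might look like follows.  It assumes the 
entries are in the same order as the CXTokenKind enum in clang-c/Index.h 
(Punctuation, Keyword, Identifier, Literal, Comment).  The 
token_kind_name helper is hypothetical and only adds a bounds check; it 
is not part of libclang.

```c
#include <stddef.h>

/* Hypothetical stand-in for the TokenSpellings table used in the snippet.
   Assumed order mirrors the CXTokenKind enum in clang-c/Index.h:
   Punctuation, Keyword, Identifier, Literal, Comment. */
static const char *const TokenSpellings[] = {
    "Punctuation", "Keyword", "Identifier", "Literal", "Comment"
};

/* Bounds-checked lookup; returns "?" for an unknown kind value. */
static const char *token_kind_name(unsigned kind)
{
    size_t count = sizeof TokenSpellings / sizeof TokenSpellings[0];
    return kind < count ? TokenSpellings[kind] : "?";
}
```

With a table like this, TokenSpellings[tk] in the loop above yields 
names such as "Identifier" and "Punctuation".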


Here's a sample header file snippet to be walked.

#define H5T_UNIX_D32LE          (H5OPEN H5T_UNIX_D32LE_g)
#define H5T_UNIX_D64BE          (H5OPEN H5T_UNIX_D64BE_g)
#define H5T_UNIX_D64LE          (H5OPEN H5T_UNIX_D64LE_g)
H5_DLLVAR hid_t H5T_UNIX_D32BE_g;
H5_DLLVAR hid_t H5T_UNIX_D32LE_g;


and here's some sample output.

visited H5T_UNIX_D32LE (macro definition)
         token, Identifier: H5T_UNIX_D32LE
         token, Punctuation: (
         token, Identifier: H5OPEN
         token, Identifier: H5T_UNIX_D32LE_g
         token, Punctuation: )
         token, Punctuation: #
visited H5T_UNIX_D64BE (macro definition)
         token, Identifier: H5T_UNIX_D64BE
         token, Punctuation: (
         token, Identifier: H5OPEN
         token, Identifier: H5T_UNIX_D64BE_g
         token, Punctuation: )
         token, Punctuation: #
visited H5T_UNIX_D64LE (macro definition)
         token, Identifier: H5T_UNIX_D64LE
         token, Punctuation: (
         token, Identifier: H5OPEN
         token, Identifier: H5T_UNIX_D64LE_g
         token, Punctuation: )
         token, Identifier: H5_DLLVAR
visited H5T_C_S1 (macro definition)
         token, Identifier: H5T_C_S1
         token, Punctuation: (
         token, Identifier: H5OPEN
         token, Identifier: H5T_C_S1_g
         token, Punctuation: )
         token, Identifier: H5_DLLVAR


Notice how each set of macro tokens includes the first token from the 
next line?  Either there's a bug in clang_getCursorExtent, or I 
misunderstand its documentation.  I can always skip the last token if 
that's the right thing to do.
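
A slightly more defensive alternative to always dropping the last token 
is to keep only the tokens that start before the end of the cursor 
extent.  The sketch below shows just that filtering step on plain file 
offsets, so it does not depend on libclang.  It assumes the stray token 
starts at or after the extent's end offset, which matches the output 
above but is not confirmed.  count_tokens_in_extent is a hypothetical 
helper; in real code the offsets would presumably come from 
clang_getTokenLocation and clang_getRangeEnd, passed through 
clang_getSpellingLocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the leading tokens whose start offset lies
   strictly before the end offset of the cursor extent.  Tokens are
   assumed to be listed in source order. */
static unsigned count_tokens_in_extent(const unsigned *starts,
                                       unsigned n,
                                       unsigned extent_end)
{
    unsigned kept = 0;
    assert(starts != NULL || n == 0);
    /* Stop at the first token that begins at or past the extent end. */
    while (kept < n && starts[kept] < extent_end)
        ++kept;
    return kept;
}
```

For the first macro above, with illustrative start offsets 8, 32, 33, 
40, 56 for the macro's own tokens, 58 for the trailing '#', and an 
extent ending at offset 57, the helper keeps 5 tokens and drops the '#'.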

Clarification from somebody with experience would be appreciated.  I'd 
like to understand this before converting the CXSourceRange back into C++.

- Daniel


