[cfe-dev] libclang returning extra tokens in declarations?

Jorge Fierro jorge at jorgefierro.com
Thu Dec 18 13:12:43 PST 2014


Hi all,

I'm just starting to learn the libclang API and I noticed something
about clang_tokenize() that I thought could be a bug. I wanted to ask
whether anyone is aware of it: tokenizing seems like an important
feature, which made me question whether this is expected behavior or not.

The problem itself is that libclang seems to return extra tokens for
declarations. Take the following snippet:

int main(void)
{
   int result = 0;
   int a = 1;

   return 0;
}

The declaration statement(?) corresponding to the 'result' variable
would be tokenized as the following sequence of tokens (as reported by
clang_getTokenSpelling()): 'int', 'result', '=', '0', ';' and 'int'.
(Here, I did not expect to receive the 'int' that's part of the
following line.) Likewise, the declaration statement corresponding to
the 'a' variable would be tokenized as 'int', 'a', '=', '1', ';' and
'return'. Moreover, the source range does not actually include the
extra token; it is limited to the line on which the declaration is
made.

I'm using release 3.5 on Linux.


