[cfe-dev] Relexing more than one tokens

Tue Jul 7 22:19:21 PDT 2009

Chris Lattner wrote:
> 
> On Jul 6, 2009, at 4:59 AM, Abramo Bagnara wrote:
> 
>> Once we have a SourceLocation (or a SourceRange), we'd like to relex the
>> preprocessed stream to check for the presence of some tokens.
>>
>> An example of use would be to check if the int type in an AST
>> declaration was written with "signed" or not.
>>
>> We are able to relex a single token from a given SourceLocation, but we
>> haven't found a way to use a SourceRange to relex all the tokens
>> included in the range.
>>
>> Is there a way?
>>
>> Do you have any hints for an alternative way to accomplish the same aim?
> 
> We don't have a great way to do this right now.  The basic problem is
> that you could have something like this:
> 
>   foo bar baz
> 
> In order to lex from "foo" to "baz", you need to know (e.g.) if bar is a
> macro that expands to zero (or many) tokens.  From an arbitrary point in
> an ASTConsumer, you don't have this information, because the macros
> could be undef'd etc.

Given that I can currently obtain the related token from any
SourceLocation, one possibility would be to store in class SLocEntry a
reference to the next token's SourceLocation in the preprocessed stream.
It should not be too hard to implement, but it means adding 32 bits to
each SLocEntry and keeping all of the translation unit's source
locations in memory.
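
To make the next-token link concrete, here is a rough sketch of the idea.
It does not use Clang's real SLocEntry: LocEntry, kNoNext and
spellingsInRange are hypothetical names invented for illustration, and
the "location" is simply an index into a vector. Each entry carries one
extra 32-bit field naming the entry of the token that follows it in the
preprocessed stream, so a range can be walked by following the links.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical entry, NOT Clang's SLocEntry: one record per token location.
struct LocEntry {
    std::string spelling;  // spelling of the token found at this location
    std::uint32_t next;    // the extra 32 bits: index of the next token's entry
};

// Sentinel meaning "no following token" (end of the preprocessed stream).
constexpr std::uint32_t kNoNext = std::numeric_limits<std::uint32_t>::max();

// Follows the next-token links from `first` up to and including `last`,
// collecting the spelling of every token met on the way.
std::vector<std::string> spellingsInRange(const std::vector<LocEntry>& entries,
                                          std::uint32_t first,
                                          std::uint32_t last) {
    assert(first < entries.size() && "range must start at a known entry");
    std::vector<std::string> out;
    for (std::uint32_t i = first; i != kNoNext && i < entries.size();
         i = entries[i].next) {
        out.push_back(entries[i].spelling);
        if (i == last) break;  // the range end is inclusive
    }
    return out;
}
```

Note that the entries need not be stored in stream order; only the links
define the order, which is why every entry has to stay in memory.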

> However, not all hope is lost.  It is very reasonable for an ASTConsumer
> to construct ASTs for a translation unit *AND* then preprocess the whole
> file again to get the tokens in a big vector.  Given that, you could map
> from the AST node to an index in the vector, then scan around in the
> vector of tokens looking for what you want.

How would you map the AST node to the vector of relexed tokens?
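
One conceivable mapping, purely as a sketch: if every relexed token is
recorded together with a key that increases in stream order, the begin
and end of a node's range can be found in the vector by binary search,
and the tokens between them scanned. Nothing below is Clang API.
RelexedToken, tokenIndexAt and rangeContains are hypothetical names, and
the assumption that a monotonically increasing key is available for each
token is exactly the part that would have to be checked against the real
SourceManager.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a relexed token; not Clang's Token class.
struct RelexedToken {
    std::uint32_t key;     // assumed to increase monotonically in stream order
    std::string spelling;  // text of the token
};

// Returns the index of the token whose key equals `key`,
// or tokens.size() if there is no such token.
std::size_t tokenIndexAt(const std::vector<RelexedToken>& tokens,
                         std::uint32_t key) {
    auto byKey = [](const RelexedToken& a, const RelexedToken& b) {
        return a.key < b.key;
    };
    assert(std::is_sorted(tokens.begin(), tokens.end(), byKey));
    auto it = std::lower_bound(
        tokens.begin(), tokens.end(), key,
        [](const RelexedToken& t, std::uint32_t k) { return t.key < k; });
    if (it == tokens.end() || it->key != key) return tokens.size();
    return static_cast<std::size_t>(it - tokens.begin());
}

// Reports whether a token spelled `spelling` occurs between the tokens at
// `beginKey` and `endKey`, both ends included (the range names its first
// and last token, as a SourceRange does).
bool rangeContains(const std::vector<RelexedToken>& tokens,
                   std::uint32_t beginKey, std::uint32_t endKey,
                   const std::string& spelling) {
    const std::size_t first = tokenIndexAt(tokens, beginKey);
    const std::size_t last = tokenIndexAt(tokens, endKey);
    if (first == tokens.size() || last == tokens.size() || first > last)
        return false;  // one end is unknown, or the range is inverted
    for (std::size_t i = first; i <= last; ++i)
        if (tokens[i].spelling == spelling) return true;
    return false;
}
```

With the declaration's range mapped this way, checking whether "signed"
was written amounts to a call such as
rangeContains(tokens, beginKey, endKey, "signed").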