[cfe-dev] Relexing more than one tokens

Tue Jul 7 23:07:12 PDT 2009

Chris Lattner wrote:
> 
> On Jul 7, 2009, at 10:19 PM, Abramo Bagnara wrote:
> 
>>>
>>> In order to lex from "foo" to "baz", you need to know (e.g.) if bar is a
>>> macro that expands to zero (or many) tokens.  From an arbitrary point in
>>> an ASTConsumer, you don't have this information, because the macros
>>> could be undef'd etc.
>>
>> Given that I can currently obtain the related token from any
>> SourceLocation, one possibility would be to have class SLocEntry hold a
>> reference to the SourceLocation of the next token in the preprocessed
>> stream. It should not be too hard to implement, but it means adding 32
>> bits to each SLocEntry and keeping all of the translation unit's source
>> locations in memory.
> 
> We can't do that: SourceLocation has to stay 32 bits, because it is very
> pervasive.

The size of SourceLocation would not change: the link to the next
SourceLocation would be added to SLocEntry (similarly to IncludeLoc and
SpellingLoc).
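
To make the proposal concrete, here is a rough sketch of what I have in
mind. The types Loc and Entry and the function walk are hypothetical
stand-ins, not the real Clang classes; the only point is that the 32-bit
handle is untouched and the extra link lives in the per-entry record. I
also assume that an ID of 0 means "no next token".

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SourceLocation: a single 32-bit opaque ID.
struct Loc {
    std::uint32_t id;
};

// Hypothetical stand-in for SLocEntry: the per-location record.
struct Entry {
    Loc includeLoc;    // existing-style link
    Loc spellingLoc;   // existing-style link
    Loc nextTokenLoc;  // proposed extra link: 32 more bits per entry
};

// The handle itself does not grow; only the table entries do.
static_assert(sizeof(Loc) == 4, "location handle stays 32 bits");

// Follow the next-token links starting at table[start] and collect the
// visited indices. Assumes ID 0 means "no next token" and that the links
// contain no cycles.
std::vector<std::uint32_t> walk(const std::vector<Entry>& table,
                                std::uint32_t start) {
    std::vector<std::uint32_t> visited;
    for (std::uint32_t i = start; i != 0 && i < table.size();
         i = table[i].nextTokenLoc.id) {
        visited.push_back(i);
    }
    return visited;
}
```

The cost is the one already mentioned: one more 32-bit field per entry,
and every entry has to stay in memory for the links to remain usable.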

>>> However, not all hope is lost.  It is very reasonable for an ASTConsumer
>>> to construct ASTs for a translation unit *AND* then preprocess the whole
>>> file again to get the tokens in a big vector.  Given that, you could map
>>> from the AST node to an index in the vector, then scan around in the
>>> vector of tokens looking for what you want.
>>
>> How would you map the AST node to the relexed token vector?
> 
> Just compare the source locations.

Do you mean that if I preprocess the whole translation unit again, I'd
get the same opaque SourceLocation IDs?
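
If so, I suppose the lookup would be something like the sketch below. Tok
and indexOfLoc are hypothetical names of mine, not Clang API, and the
sketch assumes exactly what I am asking about: that the second
preprocessing pass yields the same raw location IDs as the ones stored in
the AST.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical relexed token: a raw location ID plus its spelling.
struct Tok {
    std::uint32_t loc;
    std::string text;
};

// Return the index of the token whose location equals `loc`, or
// tokens.size() if no token has that location. A linear scan is used so
// that no ordering of the IDs has to be assumed.
std::size_t indexOfLoc(const std::vector<Tok>& tokens, std::uint32_t loc) {
    auto it = std::find_if(tokens.begin(), tokens.end(),
                           [loc](const Tok& t) { return t.loc == loc; });
    return static_cast<std::size_t>(it - tokens.begin());
}
```

Given the index for the location of an AST node, scanning forward or
backward in the vector would then give the neighbouring tokens.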