[cfe-dev] Preprocessed loc/token retrieval dream (almost) come true

Abramo Bagnara abramo.bagnara at gmail.com
Mon Oct 3 11:57:36 PDT 2011

On 03/10/2011 20:25, Argyrios Kyrtzidis wrote:
> Hi Abramo,
> Sorry to disappoint you but I think the dream remains unfulfilled ;-)

You made me sad for a few minutes... but let's try to find a solution: I
think that getting the preprocessed tokens has too many benefits to stop
only a few steps short of accomplishing it.

Let me know if you don't see strong benefits in being able to get the
preprocessed tokens in a range.

First the easy part:

> Apart from that, this is trying to deal with macro expansions; how are
> you handling preprocessor directives? e.g.:
> X
> #if  ...
> Y
> #else
> X
> #endif
> How do you find out what comes after 'X' if you don't preprocess ?

Preprocessor callbacks give us complete information about skipped areas,
so the helper just has to take that into account.

The same is true for file changes:

#include "..."
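
A minimal sketch of how the helper could take skipped areas into account,
assuming the callback reports each skipped region as a pair of file
offsets. The names SkippedRange and SkippedRanges and the offset-based
model are hypothetical and are not part of Clang; in Clang the
notification would come from a PPCallbacks hook (I believe
SourceRangeSkipped is the relevant one) and positions would be
SourceLocations rather than plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a half-open range [begin, end) of file offsets that
// the preprocessor skipped (e.g. the inactive branch of an #if).
struct SkippedRange {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper: stores the ranges reported by a preprocessor
// callback and lets a raw re-lexing pass jump over them.
class SkippedRanges {
public:
    // Assumption: ranges arrive in increasing, non-overlapping order,
    // which is how a single pass over one file would report them.
    void add(std::size_t begin, std::size_t end) {
        assert(begin <= end);
        assert(ranges_.empty() || ranges_.back().end <= begin);
        ranges_.push_back({begin, end});
    }

    // Returns offset unchanged if it lies in active code, otherwise the
    // end of the skipped range that contains it.
    std::size_t nextActiveOffset(std::size_t offset) const {
        // Find the first range whose end lies beyond offset.
        auto it = std::upper_bound(
            ranges_.begin(), ranges_.end(), offset,
            [](std::size_t value, const SkippedRange &r) {
                return value < r.end;
            });
        if (it != ranges_.end() && it->begin <= offset)
            return it->end;
        return offset;
    }

private:
    std::vector<SkippedRange> ranges_;
};
```

With this in place, the helper would call nextActiveOffset() before
lexing the token that follows 'X', so that it resumes after the skipped
branch instead of lexing 'Y'. File changes could be handled in the same
spirit by recording where each #include enters and leaves a file.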

> The code that you posted was a bit hard to follow but correct me if I'm wrong;
> You are recording all macro expansion points and once you hit one, you enter the SLocEntry for the macro expansion and start lexing it, is this correct ?

Yes, and my tests show that it works very well in most cases.

> This may seem to work but it is not reliable. The main issue is that for macro argument expansion we do *not* guarantee that the range of the SLocEntry contains only the tokens that were actually lexed.
> This is because we aggressively "merge" them to reduce the number of needed SLocEntries.
> Here's an example:
> #define M1 1
> #define M2 2
> #define M3 3
> #define MA1(a,b,c) a c
> #define MA2(x) x
> MA2( MA1(M1, M2, M3) )
> The tokens that MA2 ultimately receives are '1' and '3' but if you follow through and lex the SLocEntry that gets created for the macro arg expansion for MA2, you will notice that the length is 5 and it is actually a chunk encompassing "1 2 3".
> So, from this chunk, only '1' and '3' and their respective locations were actually passed to the parser but you don't know that just by looking at the SLocEntry.
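
To make the example above concrete, here is a small sketch that compares
what the nested expansion really produces with what one would get by
naively re-lexing the merged chunk. The macros M1..MA2 are the ones from
the example; the STRINGIFY helpers, the function names, and the
space-splitting stand-in for a lexer are assumptions made only for
illustration. The chunk text "1 2 3" is taken from the description above
and is not read from a real SLocEntry.

```cpp
#include <cstddef>
#include <string>
#include <vector>

#define M1 1
#define M2 2
#define M3 3
#define MA1(a,b,c) a c
#define MA2(x) x

// Two-level stringification, so the argument is fully macro-expanded
// before it is turned into a string literal.
#define STRINGIFY_RAW(...) #__VA_ARGS__
#define STRINGIFY(...) STRINGIFY_RAW(__VA_ARGS__)

// The tokens the parser actually receives from the nested expansion.
inline std::string expandedTokens() {
    return STRINGIFY(MA2( MA1(M1, M2, M3) ));
}

// The merged chunk as described in the example (length 5).
inline std::string mergedChunk() {
    return "1 2 3";
}

// Hypothetical stand-in for re-lexing a buffer: split on single spaces.
inline std::vector<std::string> splitOnSpaces(const std::string &text) {
    std::vector<std::string> tokens;
    std::string current;
    for (char c : text) {
        if (c == ' ') {
            if (!current.empty()) tokens.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

Splitting expandedTokens() yields two tokens, while splitting
mergedChunk() yields three: the extra '2' is exactly the token that was
never passed to the parser.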

How can I disable that "optimization" so that I can measure its real
memory impact on some large, relevant test cases?

Many thanks for your help and your review.

