[cfe-dev] Macro expansion in the Rewriter?
Abramo Bagnara
abramo.bagnara at bugseng.com
Tue Nov 27 04:26:51 PST 2012
On 27/11/2012 10:23, David Chisnall wrote:
> On 27 Nov 2012, at 00:30, Eli Friedman wrote:
>
>> It sounds like useful functionality. We don't store whether an
>> identifier is an expanded macro or what it expanded to in any
>> convenient way, though, so it would be a pain to implement.
>
> I investigated this over the weekend and came to a similar
> conclusion. I have a student currently working on a code
> reformatting tool who wants to be able to see, from libclang, if a
> macro expansion contains open or close braces. I'd assumed that this
> would be something easy to expose, but it seems that we don't
> actually have any way of finding the sequence of tokens generated by
> a macro expansion (this is generated by the preprocessor, but not
> stored anywhere). Even the HTML Rewriter, which (given the output in
> the static analyser) I assumed would already have code for doing it
> contains a half-implemented duplication of the macro expansion
> logic.
>
> If someone's looking for a project, then factoring the macro
> expansion code out so that it could be rerun (the current code is
> destructive) would be very helpful. It would also improve
> diagnostics a lot if you could say exactly what the macro expansion
> was, not just the chain of macros that caused it.
We have investigated this possibility in the past (see
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-October/017638.html),
but we didn't find a suitable solution that avoids the veto on making
the Preprocessor slower in a non-negligible way.
Recently I've thought about a possibility that should have a minimal impact:
- suppose that the last two preprocessed tokens have locations Loc1
and Loc2, respectively
- if Loc1 and Loc2 come from the same FileID (i.e. their spelling
locations are consecutive in the source), nothing happens (statistically
by far the most frequent case); otherwise a callback is invoked
with Loc1 and Loc2 as arguments
- the program using the clang library can implement this callback so as
to store the locs in a jump table (a DenseMap)
When the preprocessed token sequence is needed, ordinary relexing is used,
except that the jump table is followed whenever we reach a location
present in it.
This makes it possible not only to know the exact preprocessed token
stream but also to have every detail about every single token expansion
in the sequence.
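
To make the idea concrete, here is a rough, self-contained sketch of the
scheme in plain C++. It does not use any clang API: Loc, JumpTable,
reportJumps and replay are hypothetical names invented for illustration,
std::unordered_map stands in for DenseMap, and a location is simplified
to a (file ID, token index) pair. It also assumes that every macro
expansion gets its own file ID, so that no location appears twice in the
preprocessed stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a source location: a file ID plus the index
// of the token inside that file (a real location would be an offset).
struct Loc {
    std::uint32_t file;
    std::uint32_t index;
};

inline bool operator==(const Loc &a, const Loc &b) {
    return a.file == b.file && a.index == b.index;
}

struct LocHash {
    std::size_t operator()(const Loc &l) const {
        std::uint64_t key = (std::uint64_t(l.file) << 32) | l.index;
        return std::hash<std::uint64_t>()(key);
    }
};

// Jump table: "after the token at Loc1, the next preprocessed token is
// at Loc2". Only file-ID changes are recorded.
using JumpTable = std::unordered_map<Loc, Loc, LocHash>;

// Preprocessor side (simulated): walk the preprocessed token stream and
// invoke the callback only when two consecutive tokens come from
// different file IDs. The common case costs a single comparison.
void reportJumps(const std::vector<Loc> &stream,
                 const std::function<void(Loc, Loc)> &onJump) {
    for (std::size_t i = 1; i < stream.size(); ++i)
        if (stream[i - 1].file != stream[i].file)
            onJump(stream[i - 1], stream[i]);
}

// Client side: rebuild the preprocessed stream by "relexing" (here just
// advancing to the next token index in the same file) and following the
// jump table whenever the current location is present in it.
// tokensPerFile[f] is the number of raw tokens in file f.
std::vector<Loc> replay(Loc start,
                        const std::vector<std::uint32_t> &tokensPerFile,
                        const JumpTable &jumps) {
    std::vector<Loc> out;
    Loc cur = start;
    for (;;) {
        out.push_back(cur);
        auto it = jumps.find(cur);
        if (it != jumps.end()) {          // recorded file-ID change
            cur = it->second;
            continue;
        }
        if (cur.index + 1 >= tokensPerFile[cur.file])
            break;                        // end of file, no jump recorded
        ++cur.index;                      // ordinary relexing step
    }
    return out;
}
```

For example, with file 0 holding the five raw tokens of "int x = FOO ;"
and file 1 holding the three tokens "1 + 2" produced by expanding FOO,
the preprocessed stream visits (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (0,4).
Only two jumps are recorded, (0,2) -> (1,0) and (1,2) -> (0,4), and
replaying from (0,0) reproduces the same seven locations.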
I hope that this time we can reach a general consensus on adding this
important missing feature.
--
Abramo Bagnara
BUGSENG srl - http://bugseng.com
mailto:abramo.bagnara at bugseng.com