[cfe-dev] Preprocessing and whitespace collapsing

Anders Bakken agbakken at gmail.com
Mon Jan 13 21:46:57 PST 2014


Hi

We're using clang to make an indexer daemon which can be queried from emacs
(or other editors if so desired). In doing this we, for various
reasons, preprocess the file first using internal APIs and then invoke
clang_parseTranslationUnit on the preprocessed content. The code for
this is here:

https://github.com/Andersbakken/rtags/blob/multi-process/src/RTagsClang.cpp

Starting at line 450.

The problem we're facing is that clang seems to collapse things like this:

struct A
{
    void               foo();
};

into this:

struct A
{
    void foo();
};

int main()
{
    A a;
    a.foo();
}

I realize the standard (at least according to GCC) permits this and
that it of course would speed up parsing of the resulting code ever so
slightly. My question is, is it possible to:

A) Turn this behavior off with an option?
or
B) Somehow query which lines had what amount of whitespace removed?

I assume something can be done, since if I let
clang_parseTranslationUnit index it without preprocessing it myself,
everything works as expected.
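
To make option B concrete, here is a minimal sketch of the kind of
query meant there, done on the caller's side rather than through any
clang API. The helper name originalColumn is hypothetical. It assumes
the original and preprocessed lines differ only in the amount of
spaces/tabs (no macro expansion, no inserted whitespace), and returns
-1 when that assumption does not hold.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of clang or libclang.
static bool isBlank(char c) { return c == ' ' || c == '\t'; }

// Map a 0-based column in the whitespace-collapsed line back to the
// 0-based column of the same character in the original line.
// Returns -1 if the two lines differ in anything other than the
// amount of whitespace, or if the column is out of range.
int originalColumn(const std::string &original,
                   const std::string &collapsed,
                   std::size_t collapsedColumn)
{
    if (collapsedColumn >= collapsed.size())
        return -1;

    std::size_t j = 0; // current position in the original line
    for (std::size_t i = 0; i < collapsed.size(); ++i) {
        const char c = collapsed[i];
        if (isBlank(c)) {
            // A blank in the collapsed line must correspond to the
            // start of a blank run in the original line.
            if (j >= original.size() || !isBlank(original[j]))
                return -1;
        } else {
            // Skip whitespace that was dropped by the collapsing.
            while (j < original.size() && isBlank(original[j]))
                ++j;
            if (j >= original.size() || original[j] != c)
                return -1;
        }
        assert(j < original.size());
        if (i == collapsedColumn)
            return static_cast<int>(j);
        ++j;
    }
    return -1;
}
```

With the struct above, the column clang reports for foo in the
preprocessed line would be passed in together with both versions of
the line, and the result would be the column of foo in the file the
editor actually shows.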

From (http://gcc.gnu.org/onlinedocs/gcc-3.3/cpp/Preprocessor-Output.html)

The ISO standard specifies that it is implementation defined whether a
preprocessor preserves whitespace between tokens, or replaces it with
e.g. a single space. In GNU CPP, whitespace between tokens is
collapsed to become a single space, with the exception that the first
token on a non-directive line is preceded with sufficient spaces that
it appears in the same column in the preprocessed output that it
appeared in the original source file. This is so the output is easy to
read. See Differences from previous versions. CPP does not insert any
whitespace where there was none in the original source, except where
necessary to prevent an accidental token paste.
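
As a rough illustration of the rule quoted above, the following sketch
collapses each run of spaces/tabs between tokens to a single space
while leaving the leading indentation alone. The function name is
hypothetical, and this is only a character-level approximation: it is
not token-aware, so it would also collapse whitespace inside string
literals and comments, which a real preprocessor does not do.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper approximating the whitespace rule described in
// the GNU CPP documentation quoted above. Not how GCC or clang
// implement it; purely illustrative.
std::string collapseInterTokenWhitespace(const std::string &line)
{
    std::string out;
    std::size_t i = 0;

    // Copy leading indentation verbatim so the first token keeps the
    // column it had in the original source line.
    while (i < line.size() && (line[i] == ' ' || line[i] == '\t'))
        out += line[i++];

    // Replace every later run of blanks with one space. A trailing
    // run is dropped because no token follows it.
    bool pendingSpace = false;
    for (; i < line.size(); ++i) {
        const char c = line[i];
        if (c == ' ' || c == '\t') {
            pendingSpace = true;
        } else {
            if (pendingSpace)
                out += ' ';
            pendingSpace = false;
            out += c;
        }
    }
    return out;
}
```

Applied to the declaration line from the struct example, this turns
the long gap between void and foo into one space and keeps the four
spaces of indentation, which matches the output described earlier.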

Thanks and kudos for an awesome project.

regards

Anders


