[cfe-dev] Language not recognized: 'c++-module-cpp-output'

Mon Jun 5 10:33:03 PDT 2017

On 4 June 2017 at 02:38, Boris Kolpackov <boris at codesynthesis.com> wrote:

> Richard Smith <richard at metafoo.co.uk> writes:
>
> > I think you misunderstood my question. What difference do you expect the
> > -x c++-cpp-output / -x c++-module-cpp-output flag to make, compared to
> > passing -x c++ / -x c++-module for the same input file? What do you think
> > that flag does?
>
> I believe that -x *cpp-output tells the compiler that what's being
> compiled has previously gone through -E. At least that's GCC's
> semantics.
>
> It is also now clear that this semantics is pretty useless since it
> loses information and that's why we have -frewrite-includes and
> -fdirectives-only.
>

Yes. It's also pretty much pointless since the point of -E is to produce a
source file that is still a valid input in the original language.

> Now if you ask me what I would like -x *cpp-output to mean, it would
> be this: The input can still contain comments and line continuations
> but no macro expansions/conditions or #include directives.
>
> My understanding is that the preprocessor is essentially a tokenizer
> for the compiler frontend. So in this model a compiler could
> substitute a "full preprocessor" with a simpler and maybe faster
> tokenizer.
>

For us at least, this would add complexity (by adding a "no preprocessing"
mode) and likely not actually bring about any performance improvement --
the additional checks for "does this identifier have a defined macro" and
"is this a # at the start of a line" are extremely cheap. Plus, as you
mentioned above, this actually isn't what you want -- for compilers like
Clang (and recent versions of GCC) that take into account the provenance of
tokens (via macro expansion etc) when issuing diagnostics, preprocessing
prior to compilation proper harms the quality of experience of your users.
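
To illustrate why the two checks mentioned above are cheap, here is a
minimal sketch. It is not Clang's actual code: the function names
is_directive_start and names_macro are hypothetical, and a production lexer
would likely track "start of line" and "is a macro" as flags rather than
rescanning or hashing. The point is only that each check is a handful of
operations per token.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical sketch for illustration; not taken from Clang or build2.

// Check 1: is the '#' at src[pos] the first non-whitespace character on
// its line (i.e. could it begin a preprocessor directive)?
bool is_directive_start(const std::string& src, std::size_t pos) {
    if (pos >= src.size() || src[pos] != '#') return false;
    while (pos > 0) {
        char c = src[pos - 1];
        if (c == '\n') return true;               // only blanks before '#'
        if (c != ' ' && c != '\t') return false;  // something else precedes it
        --pos;
    }
    return true;  // reached the start of the buffer
}

// Check 2: does this identifier currently name a defined macro?
// Modeled here as a single hash-set lookup.
bool names_macro(const std::unordered_set<std::string>& macros,
                 const std::string& ident) {
    return macros.count(ident) != 0;
}
```

A "no preprocessing" mode would skip both calls, which saves very little
compared to the rest of the work done per token.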

> For what it's worth, I've implemented such a tokenizer in build2[1]
> and it turned out not too hairy. We use[2] it to extract module
> information from translation units.
>
> Its performance is about the same as Clang's full preprocessor (-E)
> which I think is not bad considering I haven't done any optimization
> work and it uses std::istream to read the data.
>

That's impressive, considering that we have done a lot of tuning on our
preprocessor. Perhaps it's time for us to stare at some profiles again and
see where we're wasting time.

> [1] https://git.build2.org/cgit/build2/tree/build2/cc/lexer.hxx
>     https://git.build2.org/cgit/build2/tree/build2/cc/lexer.cxx
>
> [2] https://git.build2.org/cgit/build2/tree/build2/cc/parser.hxx
>     https://git.build2.org/cgit/build2/tree/build2/cc/parser.cxx
>
> Boris
>