[cfe-dev] How do I try out C++ modules with clang?

Sean Silva chisophugis at gmail.com
Fri Oct 31 16:53:23 PDT 2014


(sorry for the delay, the dev meeting has been keeping me busy from sun-up
to sun-down for the last two days)

On Tue, Oct 28, 2014 at 4:03 PM, Stephen Kelly <steveire at gmail.com> wrote:

> Sean Silva wrote:
>
> >> I did this for Qt5Core and came up with
> >>
> >>  http://www.steveire.com/Qt5Core.modulemap
> >>
> >> What is the next step?
> >>
> >
> > It depends on what your goals are.
>
> My goal is to gain experience with clang modules for C++, and possibly to be
> able to communicate the principles behind the design of them, the gotchas,
> the limitations, the requirements and the benefits. I'm learning what I'd
> need to know to write useful blogs or give a useful talk at a conference or
> meetup.
>
> So, that's why I want to get from 'this file I hacked together does not
> cause explosions' to 'Here are the other things you can put in a modulemap
> file and why you would want to'.
>

Great question. Right now, I can think of two really useful things that
clang's functionality will give you:

1. A possibly very significant build time speedup. You will be getting >90%
of the benefit in this regard once your build "works" with modules and all
the time-consuming headers end up in one of your module maps (I'll explain
this a bit more below; there are many ways for it to "work"). For example,
if 40% of your build time is spent parsing the QtCore headers, say 500
times total across a 700 file build, then you will end up with pretty much
a 40% build time improvement (modulo Amdahl's law: the linker and everything
else that isn't header parsing takes as long as before).
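
To put rough numbers on that, here is a back-of-the-envelope sketch. The
function name and the cost model are my own assumptions for illustration:
it pretends the headers get parsed exactly once after switching to modules
and that importing the built module costs nothing, which is optimistic.

```cpp
#include <cassert>

// Estimated build time with modules, as a fraction of the original build time.
//   parseFraction: share of the original build spent parsing the headers that
//                  end up in module maps (0.40 in the example above).
//   parseCount:    how many times those headers were parsed textually (500).
// Assumption (not measured): with modules the headers are parsed once, and
// importing the resulting AST file is free.
double estimatedRemainingTime(double parseFraction, int parseCount) {
    assert(parseFraction >= 0.0 && parseFraction <= 1.0);
    assert(parseCount >= 1);
    const double unaffected = 1.0 - parseFraction;         // linking, codegen, ...
    const double singleParse = parseFraction / parseCount; // one parse remains
    return unaffected + singleParse;
}
```

With the numbers above, estimatedRemainingTime(0.40, 500) comes out at about
0.6008, i.e. close to the full 40% improvement. The unaffected part is the
Amdahl term: no amount of module caching shrinks it.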

2. A way to check certain aspects of the "modularity" of your headers. If you
want to track down recursive dependencies, then put each header in its own
*top-level* module. If you want to verify that code that imports your
headers is only relying on the things that your header is explicitly meant
to export, then you can replace `export *` with a whitelist of explicit
exports.
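
As a minimal sketch of what that can look like (the module and header names
here are made up, not taken from Qt), each header gets its own top-level
module and `Derived` re-exports only the one module it is meant to expose:

```
module Base {
  header "Base.h"
  export *
}

module Derived {
  header "Derived.h"
  // Explicit whitelist instead of `export *`: importers of Derived
  // see Base, but nothing else that Derived.h happens to pull in.
  export Base
}
```

Code that includes "Derived.h" and silently relies on some other module that
Derived.h imports should then stop compiling under -fmodules, which is the
check you are after.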



>
> > Once I got an initial module map
> > working, the first thing I did was to analyze build time. If you are on
> > Mac, then the attached patch to clang and the attached DTrace script could
> > prove useful.
>
> Cool, thanks. I'll try this at some point too for sure.
>
> > The "toss them all in a single top-level module" approach causes *all* the
> > headers to be included if any of them is included.
>
> Ah! Wow, that's certainly slide-worthy. So
>
>  #include <QtCore/QString>
>
>  int main(int, char **)
>  {
>    QString s;
>    QObject o;
>    QStringListModel m;
>    return 0;
>  }
>
> compiles with -fmodules but not without it.
>

Perfect slide-sized example!


>
> That's weird at least.
>
> > The `export *` just means "export declarations from any modules I depend
> > on", which is what you almost always want at the moment since that matches
> > textual inclusion semantics. For example, someone may be including
> > "QSubclass" and relying on that include to bring in "QBaseclass" (or a
> > system header or whatever).
>
> ... ie
>
>  #include <QWidget>
>
>  int main(int, char **)
>  {
>    QObject o;
>    return 0;
>  }
>
> compiles without -fmodules and fails with it, until I add export *.
>

Another great slide-sized example!


>
> I'll add that to my qmake patch for generating the modules for now. In that
> patch I can be more-exact though about what to export (qmake knows which
> modules are dependencies). Is there any advantage to doing that? Although I
> can think of a case in the QtSql module I'd have to examine for a breakage
> case (QSqlRelationalDelegate includes QtWidget classes, but qmake may not
> consider that a module-dependency).
>
> >> I *could* include qatomic_x86.h to the modulemap, but that header is not
> >> designed for users to include. Users include qatomic.h instead, which
> >> includes the appropriate _foo header.
> >>
> >
> > This will break if multiple headers in the module map end up including
> > qatomic.h (whichever one comes first will include the declarations of
> > qatomic.h). If qatomic.h is only included from a single header in the
> > module map you should be fine though.
>
> It is included by multiple headers. I don't understand the problem you're
> bringing up, though. All TUs are expected to use the same
> qatomic_foo.h.
>
> What is the breakage case you're warning against?
>

The easiest way to explain this is to explain the present implementation.

One thing you should know at the outset is that clang generates one
serialized AST file in the modules cache path for every *top-level* module
and for every "configuration" (think about this as sets of compiler flags
that could cause different behavior; e.g. -fno-rtti).

Building an AST file for a top-level module is done as follows:
1. Create a "unity build" of all the headers mentioned in this top-level
module (including all submodules). Clang literally creates an in-memory
file called `<module-includes>` that just does a #include of every one of
the headers mentioned in the top-level module.
2. Clang parses this header and then generates a serialized AST file.

Now, how does clang distinguish submodules? It basically does it by source
location: each declaration is attributed to the submodule whose `header`
declaration names the file it was parsed from. Every header mentioned in the
module map essentially contributes all of the declarations within it or
included by it. However, everything *not* mentioned in the module map ends
up being textually included by one of the headers in the module map, and
all of its declarations are "owned" by that including header. Due to
include guards, this means there is a "first one wins" situation: during
this "header unity build", whichever mentioned header first includes a
non-mentioned header ends up "owning" all of its declarations.

So basically, suppose you have:

module TopLevel {
  module QFoo {
    header "QFoo"
    export *
  }
  module QBar {
    header "QBar"
    export *
  }
}

Say both the header "QFoo" and the header "QBar" end up including
`qatomic.h`. Then Clang's "unity build" will see:

#include <QFoo>
#include <QBar>

And clang's parsing will go something like this:

enter QFoo
include qatomic.h
exit QFoo
enter QBar
skip including qatomic.h due to header guard.
exit QBar

So now, if someone does
#include <QFoo>
they will see qatomic.h, but if they do
#include <QBar>
they will not see qatomic.h, since nothing from qatomic.h was present
within the #include of QBar during the "header unity build". If qatomic.h
is mentioned in a module map, then clang will make a note to itself when it
sees those includes and so they will be properly exported.
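
If it helps for a slide, the "first one wins" rule can be mimicked with a
small toy model. This is only a sketch of the behavior described above, not
clang's actual implementation; `IncludeGraph`, `Owners`, `enter` and
`assignOwners` are names I invented for the illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model, not clang's data structures: header -> headers it #includes.
using IncludeGraph = std::map<std::string, std::vector<std::string>>;
using Owners = std::map<std::string, std::string>;

// Visit `header` as a preprocessor with include guards would. `owner` is the
// module-map header whose #include we are currently inside.
static void enter(const std::string& header, const std::string& owner,
                  const IncludeGraph& graph,
                  const std::set<std::string>& mentioned,
                  std::set<std::string>& seen, Owners& ownerOf) {
    if (!seen.insert(header).second) return;  // include guard: already parsed
    // A header named in the module map owns itself; anything else is owned
    // by whichever mentioned header reached it first.
    const std::string self = mentioned.count(header) ? header : owner;
    ownerOf[header] = self;
    auto it = graph.find(header);
    if (it == graph.end()) return;  // no further includes in this toy graph
    for (const std::string& inc : it->second)
        enter(inc, self, graph, mentioned, seen, ownerOf);
}

// Simulate the <module-includes> unity build for one top-level module:
// the module-map headers are included one after another, in order.
Owners assignOwners(const std::vector<std::string>& moduleMapHeaders,
                    const IncludeGraph& graph) {
    assert(!moduleMapHeaders.empty());
    const std::set<std::string> mentioned(moduleMapHeaders.begin(),
                                          moduleMapHeaders.end());
    std::set<std::string> seen;
    Owners ownerOf;
    for (const std::string& h : moduleMapHeaders)
        enter(h, h, graph, mentioned, seen, ownerOf);
    return ownerOf;
}
```

Feeding it the QFoo/QBar example (both include qatomic.h) with the headers
in the order QFoo, QBar makes QFoo the owner of qatomic.h; swapping the
order makes QBar the owner; adding qatomic.h to the header list makes it own
itself, so it no longer matters who includes it first.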

Missing files in the module map can also cause some very cryptic errors
like the following:
foo.h:10 error: redeclaration of Foo
foo.h:10 note: previous declaration here
This basically means that a non-mentioned header is being entered through
two different inclusion paths (if it is mentioned in a module map, then it
will only be entered once when building the module). To debug this, there
is a secret clang flag `-fdiagnostics-show-note-include-stack` that will
show you the inclusion path on the note. The inclusion stack that starts
with the `<module-includes>` file is from the module and is as expected;
you will need to place at least one of the headers in the other inclusion
stack into a module map so that it gets turned into a module import. This
may also happen between two modules, i.e. both inclusion stacks start
from `<module-includes>`, but in different top-level modules.

Hope that helps. Keep the questions coming!

-- Sean Silva


>
> Thanks,
>
> Steve.
>